Evaluation of Message Missing Failures in FlexRay-Based Networks with Star Topology
Abstract

This paper evaluates error propagation and its effect on message missing failures in a FlexRay-based network with star topology. The evaluation is based on about 35680 bit-flip fault injections into different parts of the FlexRay communication controller. To do this, a FlexRay communication controller is modeled in Verilog HDL at the behavioral level and is used to set up a FlexRay-based network composed of four nodes. The fault injection results show that about 39% of the faults lead to message missing failures. The clock synchronization process and the controller host interface of the FlexRay controller were the most sensitive to message missing failures. The coding and decoding unit was the least sensitive to this failure.

1. Introduction

Safety in distributed systems such as automotive systems and avionics is of decisive importance because system failures may threaten human life. In a distributed system, each node consists of three parts [1]: 1) an I/O part, 2) a host part, and 3) a communication controller. Among these three parts, the communication controller plays a key role in the operation of the distributed system.

In general, communication activities can be triggered either dynamically, in response to an event (event-triggered), or statically, at predetermined moments in time (time-triggered). Examples of time-triggered protocols are SAFEbus [2], SPIDER [3], and the Time-Triggered Protocol (TTP) [4]. The main drawback of time-triggered protocols is their lack of flexibility [5]. Examples of event-triggered protocols are Byteflight [6], introduced by BMW for automotive applications, CAN [7], LonWorks [8] and Profibus [9]. The main drawback of event-triggered protocols is their lack of predictability. A large consortium of automotive manufacturers and suppliers has proposed a hybrid type of protocol, namely the FlexRay communication protocol [10]. FlexRay allows the bus to be shared among event-triggered and time-triggered messages, thus offering the advantages of both protocols. It is reported that FlexRay will very likely become the de facto standard for in-vehicle communications [5] [11]. FlexRay defines a communication cycle (bus cycle) as the combination of a time-triggered (or static) window, an event-triggered (or dynamic) window, a symbol window and a network idle time (NIT) window. The time-triggered window is similar to TTP and employs a time-division multiple-access (TDMA) mechanism. The event-triggered window is similar to the Byteflight protocol and uses a flexible TDMA (FTDMA) bus access method. The symbol window is a communication period in which a symbol can be transmitted on the network. The NIT window is a communication-free period that marks the end of each communication cycle.

The importance of safety in critical distributed applications calls for specific attention to the reliability of communication protocols. One way to evaluate this reliability is to assess the vulnerability of the protocols by fault injection. Fault injection techniques can be classified into two main categories [12]: 1) hardware-based fault injection [13], and 2) software-based fault injection [14]. The latter can in turn be divided into software-implemented fault injection (SWIFI) and simulation-based fault injection [14]. In simulation-based fault injection, faults are injected into a simulation model of the circuit written in a hardware description language [14] or in another language such as C++ [15].

In [16], simulation-based fault injection was used to assess message missing failures in the CAN protocol. The effects of masquerade failures in the CAN protocol have been investigated using simulation-based fault injection [17]. The TTP/C communication controller was evaluated by heavy-ion fault injection (a hardware-based technique) in [18]. The purpose of the experiments in that paper was to validate the fail-silence property of TTP/C by injecting faults into a single node. The relationship between the number of nodes in a cluster and slightly-off-specification (SOS) failures has been assessed using heavy-ion fault injection [19]. In [20], the TTP/C protocol with bus and star topologies was investigated using SWIFI; that work studied the effects of SOS failures in the bus and star topologies with respect to the start of frame transmission. In [21] [22], a generic tool was developed for monitoring and diagnosis of FlexRay-based as well as CAN-based systems. This tool has been used by the FlexRay consortium to perform extended fault injection for evaluating the FlexRay communication protocol. One important limitation of this tool is that faults cannot be injected inside the different parts of the FlexRay protocol.

This paper evaluates error propagation and message missing failures in the FlexRay protocol with star topology. It evaluates the conditions under which faults in the FlexRay protocol disturb the sending or receiving of messages at a node, so that a message is not sent or not received. In this condition a message missing failure occurs. The evaluation is done by 35680 bit-flip fault injections inside different parts of the FlexRay protocol. To do this, a FlexRay communication controller was modeled in Verilog HDL at the behavioral level. A FlexRay-based network composed of four nodes in a star topology was built using this controller. The evaluation is done in two phases. In the first phase, the percentages of faults resulting in three kinds of errors, namely content errors, syntax errors and boundary violation errors, are characterized, and the most sensitive and the least sensitive parts of the FlexRay protocol are identified. In the second phase, the message missing failures are evaluated by considering the error propagation results. In this phase, the relationship between the error propagation results and the message missing failure results is analyzed. The rates of message missing failures that occur in the time-triggered and event-triggered windows of the FlexRay communication cycle are also assessed, as is the dependency of this failure on the fault location (FlexRay part).

This paper is organized in six sections. Section 2 introduces the FlexRay protocol, and Section 3 presents the message missing failures and the error models found in this protocol. The experimental organization is given in Section 4, and the results are presented in Section 5. The last section concludes the work.

2. FlexRay Protocol Structure

The FlexRay protocol controller consists of six parts: controller host interface (CHI), protocol operation control (POC), coding and decoding (CODEC), media access control (MAC), frame and symbol processing (FSP) and clock synchronization process (CSP).

The CHI manages the data and control flow between the host processor and the FlexRay protocol engine within each node. The CHI contains two major interface blocks: the protocol data interface and the message data interface. The protocol data interface manages all data exchange relevant to the protocol operation, and the message data interface manages all data exchange relevant to the exchange of messages. The protocol data interface manages the protocol configuration data, the protocol control data, and the protocol status data. The message data interface manages the message buffers, the message buffer configuration data, the message buffer control data, and the message buffer status data. In addition, the CHI provides a set of services that define self-contained functionality that is transparent to the operation of the protocol [10].

The operating modes of the core parts of the protocol are controlled by the POC. Proper protocol behavior can only occur if the mode changes of the core parts are properly coordinated and synchronized. The purpose of the POC is to react to host commands and protocol conditions by triggering coherent changes to the core parts in a synchronous manner, and to provide the host with the appropriate status regarding these changes [10].

The CODEC contains two sections: a coding section and a decoding section. The coding section is responsible for encoding the communication elements into a bit stream and for presenting this bit stream to the bus driver of the transmitting node for communication onto the physical media. The decoding section is responsible for receiving communication elements, forming the bit streams and checking their correctness.

The MAC controls access to the bus. In the FlexRay protocol, media access control is based on a recurring communication cycle. Within one communication cycle, FlexRay offers a choice of two media access schemes: a TDMA scheme and an FTDMA scheme. The communication cycle is the fundamental element of the media access scheme within FlexRay. It contains the static segment, the dynamic segment, the symbol window and the NIT [10].

The FSP is the main processing layer between the CODEC and the CHI. This part checks the correct timing of received frames and symbols with respect to the TDMA scheme, applies further syntactical tests to received frames, and checks the semantic correctness of received frames [10].

Finally, the CSP uses a distributed clock synchronization mechanism in which each node

receiving the messages from the communication controller. Meanwhile, the host generates exactly as many messages as the IDs that have been allocated to its controller. This means that the number of generated