478

IEEE/CAA JOURNAL OF AUTOMATICA SINICA, VOL. 8, NO. 2, FEBRUARY 2021 Joint Algorithm of Message Fragmentation and No- Wait Scheduling for Time-Sensitive Networks Xi Jin, Member, IEEE, Changqing Xia, Member, IEEE, Nan Guan, and Peng Zeng

Abstract—Time-sensitive networks (TSNs) support not only end-to-end delay constraints in hard real-time industrial traditional best-effort communications but also deterministic systems. communications, which send each packet at a deterministic time so that the data transmissions of networked control systems can In networked control systems, TSNs have been adopted [1]– be precisely scheduled to guarantee hard real-time constraints. [3]. Deterministic communications contain control commands No-wait scheduling is suitable for such TSNs and generates the and critical sensing data, and the other data adopts best-effort schedules of deterministic communications with the minimal communications [4]. Deterministic communications and best- network resources so that all of the remaining resources can be effort communications have different objectives. The real-time used to improve the throughput of best-effort communications. performance of deterministic communications is the most However, due to inappropriate message fragmentation, the real- time performance of no-wait scheduling algorithms is reduced. important. Before they start to transmit, their deterministic Therefore, in this paper, joint algorithms of message schedules must be generated so that the transmission process fragmentation and no-wait scheduling are proposed. First, a can be controlled to guarantee hard real-time constraints. For specification for the joint problem based on optimization modulo best-effort communications, there is no strict constraint. Their theories is proposed so that off-the-shelf solvers can be used to schedules do not need to be generated in advance, and networks find optimal solutions. Second, to improve the scalability of our algorithm, the worst-case delay of messages is analyzed, and then, just try to transmit them as soon as possible. In a TSN switch, based on the analysis, a heuristic algorithm is proposed to the two kinds of communications share a fixed number of construct low-delay schedules. Finally, we conduct extensive test output queues. Deterministic communications must be assigned cases to evaluate our proposed algorithms. The evaluation results dedicated queues, while best-effort communications require indicate that, compared to existing algorithms, the proposed joint more queues to improve network throughput [5]. Therefore, algorithm improves schedulability by up to 50 . % determining how to assign and utilize the queues are the key to Index Terms—Message fragmentation, networked control system, improving network performance. real-time scheduling, time sensitive network. This paper focuses on store-and-forward switching, because compared to cut-through switching, store-and-forward I. Introduction switching is supported by more off-the-shelf TSN products. IME-SENSITIVE networks (TSNs) are an emerging For example, CISCO IE 4000 [6] and NXP SJA1105 [7] Tindustrial network technology based on networks support only store-and-forward switching, while no off-the- and extend a set of IEEE standards to improve the shelf TSN product supports only cut-through switching. In controllability of industrial networks. Thus, in addition to store-and-forward networks, no-wait scheduling is an best-effort communications supported by Ethernet networks, effective method to make a performance trade-off between TSNs also support deterministic communications that have deterministic communications and best-effort communications been widely considered as an effective solution to guarantee [8]. Under no-wait scheduling, once a receives Manuscript received May 25, 2020; revised June 15, 2020; accepted June a packet, it sends the packet immediately, i.e., only one queue 24, 2020. This work was partially supported by National Key Research and is needed to cache the scheduled packets. Thus, when a no- Development Program of China (2018YFB1700200), National Natural Science Foundation of China (61972389, 61903356, 61803368, U1908212), wait scheduling algorithm is used to generate schedules for Youth Innovation Promotion Association of the Chinese Academy of deterministic communications, except the occupied queue, all Sciences, National Science and Technology Major Project of the other queues can be used by best-effort (2017ZX02101007-004), Liaoning Provincial Natural Science Foundation of China (2020-MS-034, 2019-YQ-09), and China Postdoctoral Science communications. Therefore, no-wait scheduling algorithms Foundation (2019M661156). Recommended by Associate Editor Zidong not only generate schedules for deterministic communications Wang. (Corresponding author: Xi Jin.) on the dedicated queue, but also improve the performance of Citation: X. Jin, C. Q. Xia, N. Guan, and P. Zeng, “Joint algorithm of message fragmentation and no-wait scheduling for time-sensitive networks,” best-effort communications. IEEE/CAA J. Autom. Sinica, vol. 8, no. 2, pp. 478–490, Feb. 2021. However, when no-wait scheduling algorithms are adopted, X. Jin, C. Q. Xia, and P. Zeng are with Key Laboratory of Networked the following issue should be considered. On the IP layer, a Control Systems, Chinese Academy of Sciences, Shenyang 110016, and with Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang large message from the TCP layer must be fragmented into 110016, China, and also with Institutes for Robotics and Intelligent several smaller pieces. Each piece and its added packet header Manufacturing, Chinese Academy of Sciences, Shenyang 110169, China (e- constitute a complete packet. These packets are then mail: [email protected]; [email protected]; [email protected]). transmitted separately and reassembled at destination devices. N. Guan is with Department of Computing, Hong Kong Polytechnic University, Hong Kong, China (e-mail: [email protected]). In this process, fragmentation algorithms will affect the Digital Object Identifier 10.1109/JAS.2021.1003844 transmission delay because smaller packets have higher JIN et al.: JOINT ALGORITHM OF MESSAGE FRAGMENTATION AND NO-WAIT SCHEDULING FOR TIME-SENSITIVE NETWORKS 479 parallelism. For example, in Fig. 1, a message of 1620 bytes two corollaries are proposed on how to construct low-delay must be fragmented into two pieces. In traditional Ethernet transmissions. fragmentation, the length of the first packet is equal to the Third, based on the two corollaries, a joint algorithm is maximum segment size (MSS) plus a packet header, and the proposed that uses packets from large to small to construct remaining part is contained in the second packet. Since no- schedules under hard real-time constraints. Thus, the proposed wait scheduling algorithms do not allow a time interval joint algorithm strikes a balance between saving resources and between two consecutive hops of a packet, the second packet temporality. Extensive test cases were run to evaluate the joint is postponed by the first packet and cannot be transmitted algorithm. The evaluation results demonstrate that, compared earlier. In another fragmentation, to reduce the long delay, the to existing algorithms, the proposed joint algorithm improves message can be fragmented into two pieces of 810 bytes each. schedulability by up to 50% and increases a small number of Although the number of packets is unchanged, the delay is packets. reduced by approximately one-third. Message fragmentation The rest of this paper is organized as follows: Section II exists in TSNs [5]. However, there is currently no work that reviews two categories of related work, including message considers message fragmentation algorithms and no-wait fragmentation and real-time scheduling algorithms. Section III scheduling algorithms together. This has led to some details the system model and problem. Section IV formulates optimizable solution spaces being ignored. the problem as an OMT specification. Section V proposes a heuristic algorithm to improve scalability of our algorithms. 1620 bytes Section VI evaluates the proposed algorithms based on A message extensive test cases. Section VII concludes the paper. The network A B C D II. Related Work

1460 bytes 160 bytes Much research has been proposed to improve the real-time MSS performance of industrial networks, e.g., real-time scheduling Traditional A−B B−C [9], resource allocation [10] and data recovery [11]. In this fragmentation Header C−D paper, we only focus on the real-time scheduling problem of 810 bytes 810 bytes industrial TSNs. The work in [5] formulates the basic A new A−B fragmentation B−C scheduling problem of TSNs as a specification, and then uses C−D Time an off-the-shelf solver to calculate schedules. Then, the work No-wait scheduling Reduced delay in [12] introduces the capacity limitation of TSN switches into the basic problem so that the proposed scheduling algorithms Fig. 1. Effect of message fragmentation on no-wait scheduling. can be adopted in actual systems. Based on the above work, the work in [13] proposes a loose scheduling algorithm to Therefore, in this paper, a joint algorithm is proposed that transmit massive data packets using limited switch resources. fragments deterministic messages into packets of optimized In addition to the classical scheduling problem, some sizes and schedules the packets under hard real-time extended problems have been considered. The work in [14] constraints based on a no-wait strategy to improve the real- proposes an incremental scheduling algorithm to handle time performance of TSNs. When designing the joint dynamic data flows. The work in [15] focuses on the runtime algorithm, the following challenge is present. In the interest of reconfiguration of TSNs and designs a heuristic algorithm to saving resources, large packets should be used to reduce the minimize the impact of reconfiguring switches on existing overhead of fragmenting and reassembling. However, in the data flows. The work in [16] proposes a joint routing and interest of temporality, a message should be fragmented into scheduling algorithm to guarantee the real-time performance small packets, because small packets have higher parallelism of time-triggered communications and reduce the end-to-end and are easy to schedule. Thus, to address the above delays of audio-video- (AVB) communications. The challenge, our proposed algorithms make a trade-off between work in [17] also focuses on the joint problem of routing and these two aspects. The specific contributions are as follows. scheduling and evaluates extensive test cases to explore the First, a specification for the joint problem based on practical limitations of the integer linear programming for optimization modulo theories (OMT) is proposed. Since the TSNs. The work in [18] evaluates asynchronous scheduling joint problem is non-deterministic polynomial-time hard (NP- algorithms in TSNs and demonstrates that asynchronous hard), there is no polynomial time algorithm for finding scheduling algorithms are suitable for sporadic optimal solutions. Thus, we formulate the problem as an OMT communications. The work in [19] analyzes the worst-case specification with real-time constraints to minimize the delays of AVB communications based on network calculus, number of fragmented packets, and then invoke off-the-shelf and then the real-time performance of AVB communications solvers, e.g., Microsoft Z3, to solve it. Our evaluations can be precisely assessed. No-wait scheduling can indicate that, in an acceptable time, the OMT-based method significantly improve system efficiency and has been widely can find optimal solutions for small networks. researched in many industrial systems, e.g., energy-efficient Second, to improve the real-time performance of large production planning systems [20], metal-processing systems networks, we calculate the worst-case delays of messages [21] and flexible manufacturing systems [22]. In [8], no-wait based on the recursive function shown in Theorem 2. Then, scheduling is introduced into TSNs. However, the objective of 480 IEEE/CAA JOURNAL OF AUTOMATICA SINICA, VOL. 8, NO. 2, FEBRUARY 2021 the above work is to minimize the total length of generated example, as shown in Fig. 3, at time 1, the table entry that schedules. In this paper, communications have different real- controls port A: queue 0 is opened, and the other queues are time requirements, and the above work cannot be adopted. closed. The queue with a smaller ID has lower priority. When Message fragmentation is a widely used network more than one gate is opening, the packet buffered in the technology. In Ethernet networks, to reduce the message higher-priority queue can occupy the egress port. As described response time, the work in [23], [24] optimizes and re- in the IEEE 802.1bQu standard, the packet in the higher- configures the maximum transmission unit size. The work in priority queue is allowed to interrupt the process of sending [25] designs a middleware for Ethernet networks to fragment the packet in the lower-priority queue, and the interrupted large objects so that the worst-case delay of object-oriented packet will be re-assembled in the next switch. In no-wait transmissions can be reduced. The work in [26] evaluates four scheduling, deterministic packets are assigned to the highest- fragmentation methods based on Ethernet networks. In priority queue, and the queue is always opened such that the wireless local area networks, the work in [27], [28] designs packets will not be blocked. Although blocking packets may novel fragmentation algorithms to improve the throughput of reduce conflicts and improve schedulability, our work still streaming videos. In the 6LoWPAN protocol, the work in adopts no-wait scheduling for two reasons. First, no-wait [29], [30] demonstrates that packet fragmentation is beneficial scheduling is not limited by the size of the schedule table. In to energy consumption, throughput, and end-to-end delay. In switch chips, the size of the schedule table is limited and delay tolerant networks, the work in [31] fragments large data fixed. Thus, switches support a limited number of gate items to multiple routing paths such that routing performance operations. Since no-wait scheduling algorithms do not can be improved. In community antenna television networks, operate gates, they do not use the schedule table and do not the work in [32] considers fragmentation and scheduling need to consider the table size constraint. Second, under no- together. It first schedules real-time packets and then wait scheduling, there is no schedule dissemination overhead. fragments best-effort packets to make them suitable for the In a traditional network, a centralized scheduler disseminates free slots between successive real-time packets. In time- the generated schedules to the corresponding network devices. triggered Ethernet (TTEthernet) networks, a similar method is However, under no-wait scheduling, switches forward the proposed to fragment event-triggered messages to make them received packets immediately and do not need schedule schedulable in idle time intervals [33]. However, the above information, so there is no schedule dissemination overhead. work cannot be applied to the multi-hop no-wait scheduling We ignore the other details of TSN switches since they have problem that is addressed in this paper, and there is no work to nothing to do with our proposed algorithms. Earlier works consider the effect of message fragmentation on no-wait [13], [34] can help readers understand TSN switches in scheduling. Therefore, in this paper, this effect will be details. analyzed, and joint algorithms will be proposed to improve Time 1, Port A: 1000 0000 network performance. Schedule Time 2, Port A: 0100 0000 table Time 3, Port B: 0011 1111 III. System Model and Problem Statement In this section, we will detail our system model and Queue 0 Gate problem. The symbols used in this paper are summarized in Port A Appendix. (ingress) Queue 1 Gate Port A A TSN is characterized by a two-tuple < N, L >. The node (egress) set N = {n ,n ,...} includes TSN switches with multiple ports Switch 1 2 engine Queue 7 Gate and end systems with only one port. The element li, j in the link set L denotes a cable between nodes n and n . TSN Port B Port B i j (ingress) switches form a mesh topology, and then end systems are (egress) connected to the TSN switches, as shown in Fig. 2. All ports and cables are full-duplex. Fig. 3. Architecture of a TSN switch.

4-port switch Transmitting methods are designed herein only for l1, 4 n1 deterministic flows. Best-effort flows are delivered by the n4 classical best-effort services. In the following, if no specific End system l4,5 l4,6 description is given, flows refer to deterministic flows. A flow

n2 n3 in the flow set F is characterized by a four-tuple n5 l n6 l2, 5 5,6 l3, 6 < pi,di,si,Πi >, which denotes the period, relative deadline, message size, and routing path, respectively. Flow fi generates Fig. 2. A time sensitive network. a message of si bytes periodically with a period pi, and its

relative deadline di is not greater than its period pi. It is In TSN switches, each egress port connects multiple queues, assumed that all flows generate their first messages at time 0. and each queue has a gate that can accurately control the The j-th message of flow fi, called message mi, j, is generated sending of packets. Each TSN switch has a schedule table that at time j × pi, and its absolute deadline is j × pi + di. The time records the precise times to open and close the gates. For interval [ j × pi, j × pi + di) is called its active interval. Only JIN et al.: JOINT ALGORITHM OF MESSAGE FRAGMENTATION AND NO-WAIT SCHEDULING FOR TIME-SENSITIVE NETWORKS 481 determining how to transmit data in a hyperperiod H that is to determine whether a specification is satisfiable or not [39]. defined as the least common multiple of all periods, i.e., An OMT is an extension of a SMT and allows finding an H = LCM{p1, p2,...}, is considered here, after the first optimal objective based on a SMT specification. hyperperiod, the subsequent hyperperiods are periodically The inputs of the problem are network N and flow set F. To repeated. The transmission time of a packet with x bytes is formulate the problem, we assume that a message is / equal to x v, where v denotes the transmission speed. The fragmented into, at most, U packets, which are denoted as routing path Πi is a link set, and Πi ⊆ L. A path is from an end {τi, j,1,τi, j,2,...,τi, j,U }. U is given by users. Three decision system to another end system. Hence, the number of hops, variables wi, j,g, si, j,g and ui, j,g are introduced. wi, j,g denotes the denoted ri and ri = |Πi|, is not less than 2. Since routing is injection time of packet τ , , . The size of packet τ , , is s , , already well-studied, in the proposed model all routing paths i j g i j g i j g bytes. If s , , = 0, τ , , is defined as invalid; otherwise, it is are the shortest and generated based on existing algorithms, i j g i j g e.g., [35], [36]. valid. The objective of this work is to minimize the number of From Fig. 1, it can be seen that message fragmentation valid packets. To calculate the number of valid packets, we τ based on the fixed MSS is not suitable for improving real-time use ui, j,g to denote whether a packet is valid or not. If i, j,g is > = = performance. To address this problem, first, the sizes of valid, i.e., si, j,g 0, then ui, j,g 1; otherwise, ui, j,g 0. Thus, deterministic packets are changed based on the requirements the objective function is of no-wait scheduling, and then the packets are injected at a ∑ min ui, j,g. specified time that is determined by the proposed real-time H (1) ∀ fi∈F, ∀ j∈[0, ), ∀g∈[0,U) scheduling algorithm. However, the number of deterministic pi packets significantly affects network performance because the The minimizing problem respects the following constraints. fragmented deterministic packets not only introduce extra 1) Range Constraint: The ranges of the variables used in the packet headers but also interrupt best-effort packets. proposed model are the following:

Therefore, given a network and a flow set, our proposed joint H algorithm of message fragmentation and scheduling is to ∀ fi ∈ F, ∀ j ∈ [0, ), ∀g ∈ [0,U) minimize the number of packets such that the following two pi ≤ ≤ , ∈ { , } requirements can be satisfied. 0 si, j,g MSS ui, j,g 0 1 1) Real-Time Requirement: All messages must be delivered j × pi ≤ wi, j,g ≤ j × pi + di to destinations before their absolute deadlines. wi, j,g ≤ wi, j,g+1, ui, j,g ≥ ui, j,g+1. (2) 2) No-Conflict Requirement: A unidirectional link cannot If the size of a packet is greater than MSS, the packet has to serve more than one packet at the same time. be fragmented again on the IP layer. Thus, the upper bound of A flow set is called schedulable if it has a feasible schedule the packet size is MSS. The injection time of a packet must be that meets all the requirements. A solution is called optimal if between its generation time and deadline. To reduce the there are no other solutions with better objective values. When search space, we specify the order of packets: the packets are only linear networks are considered, and messages are too injected in increasing order of ID, and valid packets have small to be fragmented, our problem is the same as the scheduling problem studied in [37]. Since the problem in [37] smaller IDs than invalid packets. is NP-hard, our problem is also NP-hard. Therefore, to solve 2) Size Constraint: A message is fragmented into several the problem, we propose two methods as shown in Fig. 4. packets, and the sum of the size of these packets is equal to First, the problem is formulated as an OMT specification [38] the size of the message. For each packet, based on the (in Section IV). Thus, if there exists a feasible solution, OMT definition of ui, j,g, if the size of the packet is 0, the solvers can find it. However, for some complex networks, the corresponding ui, j,g is also 0; otherwise, it is 1. The size solvers cannot find any feasible solution in an acceptable time. constraints are as follows: ∑ Then, to improve the scalability of our algorithms, a heuristic H algorithm is proposed (in Section V). ∀ f ∈ F, ∀ j ∈ [0, ), s , , = s , i p i j g i i ∀g∈[0,U) Section IV Specification ∀g ∈ [0,U), ((si, j,g > 0) ∧ (ui, j,g = 1))∨ OMT specification ((1)−(6)) + Solver Our ((si, j,g = 0) ∧ (ui, j,g = 0)). (3) problem Section V-A 3) Real-Time Constraint: If a packet is invalid, i.e., ¬ui, j,g is Heuristic Basic scheduling algorithm (Algorithm 1) true, its real-time constraints do not need to be considered; algorithm Section V-B otherwise, its arrival time cannot be later than its deadline. Delay analysis based on Algorithm 1 The arrival time of a packet is equal to the injection time plus (Theorem 1, 2 and Corollary 1, 2) the transmission time ((si, j,g + e)/v × ri), where e is the size of Reduce end-to-end delays Section V-C a packet header. Thus, the real-time constraints are as follows:

Joint algorithm (Algorithms 2 and 3) H ∀ fi ∈ F, ∀ j ∈ [0, ), ∀g ∈ [0,U) Fig. 4. Overview. ( pi ) si, j,g + e ¬ui, j,g ∨ wi, j,g + × ri ≤ j × pi + di . (4) IV. OMT Specification v Satisfiability modulo theories (SMT) have been widely used 4) No-Conflict Constraint: For any two packets, under the 482 IEEE/CAA JOURNAL OF AUTOMATICA SINICA, VOL. 8, NO. 2, FEBRUARY 2021

following three conditions, we do not need to consider the the packets belonging to message mi, j first. Note that in conflict between them. First, the two packets are the same Algorithm 1 it is assumed that all priorities ρi, j and message packet, i.e., for τi, j,g and τx,y,z, (i = x) ∧ ( j = y) ∧ (g = z) is true. fragmentation {τi, j,1,τi, j,2,...} have been obtained. The method Second, not both packets are valid, i.e., ¬ui, j,g ∨ ¬ux,y,z is true. of obtaining these is introduced in Section V-C. Third, the two packets do not pass through the same directed link, i.e., (la,b < Πi) ∨ (la,b < Πx) is true. If the two packets do Algorithm 1 Initial scheduling algorithm not satisfy any of the above conditions, the time intervals Input: < N, L >, F, ∀ρ , , ∀s , , , ∀u , , when they occupy the same link cannot overlap; otherwise, i j i j g i j g Output: ∀wi, j,g they conflict with each other. The constraints are as follows: H 1: ∀ fi ∈ F, ∀ j ∈ [0, ),Γ = Γ + {mi, j}; pi

H 2: while Γ , ∅ do ∀l , ∈ L, ∀ f , f ∈ F, ∀ j ∈ , a b i x [0 ) 3: mi, j with the smallest ρi, j is the highest-priority message in Γ ; pi 4: Γ = Γ − {mi, j}; ∀ ∈ , H , ∀ , ∈ , y [0 ) g z [0 U) 5: for each τi, j,g ∈ Mi, j do px si, j,g + e 6: for t = ( j × pi) to ( j × pi + di − × ri) do ((i = x) ∧ ( j = y) ∧ (g = z))∨ v 7: if NoCon f lict(t,τi, j,g) then = ¬ui, j,g ∨ ¬ux,y,z ∨ (la,b < Πi) ∨ (la,b < Πx)∨ 8: wi, j,g t; break; τ sx,y,z + e 9: if i, j,g is not scheduled then (A , , (l , ) + < A , , (l , ))∨ x y z a b v i j g a b 10: return FAIL; si, j,g + e 11: return ∀wi, j,g; (Ai, j,g(la,b) + < Ax,y,z(la,b)) (5) v Γ contains all of the unscheduled messages (Lines 1 and 4). where Ai, j,g(la,b) denotes the time of starting to transmit the In each iteration (Lines 2–10), Algorithm 1 calculates the packet on link la,b, i.e., injection time of the highest-priority message in Γ until there , , are no unscheduled messages. For message mi j, the set Mi j s , , + e i j g includes all of its valid packets. The packets in Mi, j are Ai, j,g(la,b) = wi, j,g + × (qi,a,b − 1). (6) v assigned injection times (Lines 5–8) separately. If at time t a packet can be scheduled without conflict (the function where qi,a,b denotes that la,b is the qi,a,b-th hop in path Πi. NoCon f lict() in Line 7), its injection time is t (Line 8). The Then, Ai, j,g(la,b) + (si, j,g + e)/v is the finish time of the packet function NoCon f lict(t,τi, j,g) returns true if, when τi, j,g is on link la,b. OMT solvers can find the solution of the above specifica- injected into the network at time t, there is no conflict between τ tion. However, for complex problems, the execution time of i, j,g and the other scheduled packets. off-the-shelf solvers is unacceptable. Therefore, in the next Then, we analyze the time complexity of Algorithm 1. Lines 4 and 8–10 take constant time. The number of iterations of the section, a fast algorithm is proposed to solve the problem. while loop∑ in Line 2 and the for loop in Lines 5 and 6 are, at / V. Proposed Algorithm most, ∀ fi∈F(H pi), U, and pi, respectively. The time Our algorithms schedule messages based on fixed priority complexity of NoCon f lict() is O(|N|). Line 1 is less complex scheduling [40], which is effective and has been widely used than the other lines and is ignored. Therefore, the time in real-time systems. Under fixed-priority scheduling, each complexity of Algorithm∑ 1 is determined by Lines 2, 5, 6 and / × × × | | = | |× message has an unique priority, and all the messages are NoCon f lict(), i.e., O( ∀ fi∈F(H pi) U pi N ) O( F × × | | scheduled in decreasing order of their priorities. For easy H U N ), where U is given by users. understanding, first, in Section V-A, we assume that priorities and message fragmentation have been determined. Based on B. Delay Analysis the precondition, a basic fixed-priority scheduling algorithm The end-to-end delay of a message is defined as the time (Algorithm 1) can calculate the injection times for all packets. duration between its generation time and its received time. In Then, in Section V-B, based on the basic scheduling the following, analyzing the end-to-end delay of the message algorithm, the worst-case delays of messages are analyzed in mi, j is considered. Based on Algorithm 1, it is known that the order to determine what strategy can improve the real-time messages in the conflict set Ω(mi, j) (Definition 1) determine performance. Finally, in Section V-C, based on the analysis, the delay of mi, j. When all the messages of Ω(mi, j) are our joint algorithm to fragment messages and schedule transmitted in the active interval of mi, j and before the transmissions of m , , the worst-case delay of m , occurs. packets is proposed (as shown in Algorithm 3). i j i j S (mi, j) is used to denote the packets that belong to the A. Basic Scheduling messages of Ω(mi, j), and S (mi, j) = {τa,b,g|∀τa,b,g ∈ Ma,b,∀ma,b ∈ To find the main factors that affect transmission delay, how Ω(mi, j)}. packets are scheduled is presented in this section. Definition 1: Ω(mi, j) includes all of the messages ma,b that Algorithm 1 is used to calculate the injection time for each satisfy the following conditions: packet. Each message mi, j has a priority ρi, j. If ρi, j < ρx,y, then 1) ma,b has a higher priority than mi, j, i.e., ρa,b < ρi, j. the scheduling algorithm calculates the injection time wi, j,g of 2) ma,b and mi, j use the same links, i.e., Πa ∩ Πi , ∅. JIN et al.: JOINT ALGORITHM OF MESSAGE FRAGMENTATION AND NO-WAIT SCHEDULING FOR TIME-SENSITIVE NETWORKS 483

(s +e)/v 3) the active intervals of ma,b and mi, j overlap each other, a, b Time t i.e., [b × p ,b × p + d ) ∩ [ j × p , j × p + d ) , ∅. Πi a a a i i i l Based on Definition 1, it is known that the higher-priority 1 l lxy−1 l l2 x1 xy−1 messages in Ω(mi, j) pass through the links of Πi, but maybe l3 l l not all of them. However, as long as one of the hops of mi, j is l x2 xy lxy ... 4 delayed by the higher-priority messages, the entire mi, j is ...... delayed. This is the same as transmitting higher-priority (a) Path extension (b) Assumption 1 (c) Assumption 2 messages over the entire path. Hence, to bound the worst case, Fig. 5. Illustration of higher-priority message m , . all of the higher-priority messages are extended to the entire a b path Πi, i.e., when the delay of mi, j is considered, the routing ⃗ , path of a higher-priority message ma,b is also Πi. This path of packets. Set S (mi j) includes two parts: the ordered packets extension does not result in an overestimation of the worst of S (mi, j) denoted by {τ1,τ2,...,τσ}, where σ = |S (mi, j)|, and case or an underestimation of the conflicts from the higher- the packets of mi, j denoted by {τσ+1,...,τσ+ς}, where priority messages (as shown in Theorem 1). Intuitively, ς = |Mi, j|. For any two packets τk and τk+1, τk is transmitted changing the path will cause more delay, or some conflicts before τk+1. Then, the worst case delay of mi, j can be may be neglected. However, in Theorem 1, we proved that the calculated based on Theorem 2. above situations do not exist. Theorem 2: The generation time of mi, j is set to time 0, and Theorem 1: In the proposed model, the path extension for the worst-case delay of mi, j is equal to the relative finish time the messages in set Ω(mi, j) does not exacerbate the worst-case of the last packet τσ+ς bounded by a recursive function delay and does not lose the conflicts from the higher-priority Delay(m , ) = Finish(τσ+ς) messages. i j = Finish(τσ+ς− ) + δ(τσ+ς) (7) Proof: First, the worst case is discussed. The path extension 1 does not allow messages to be transmitted in parallel since all where of them occupy the same links. Owing to the conflicts s + e Finish(τ ) = r × 1 (8) introduced by other messages, when the delay of mi, j is 1 i v analyzed, the start times of the higher-priority messages in and Ω (m , ) are unknown and can be any time. Thus, even though i j  + +  sk e sk e the paths are not extended, it is possible that the transmission  + ( − ε), if k ≤ σ and s ≤ s −  v v k k 1 times of the messages in Ω(mi, j) do not overlap each other,   s + e s − + e s − + e i.e., they are serial. Therefore, the path extension does not r × k − (r − 1) × k 1 + ( k 1 − ε),  i v i v v exacerbate the worst case.   if k ≤ σ and sk > sk−1 Then, the conflicts from the higher-priority message ma,b δ(τk) =   sk + e Π = { ,..., }  , > σ ≤ − are discussed. It is assumed that i l1 lri , and  if k and sk sk 1 Π ∩ Π = { , ,..., } ≤ < < ··· < ≤  v i a lx1 lx2 lxφ (1 x1 x2 xφ ri). After  + +  sk e sk−1 e extending the path, the transmissions of m , are shown in ri × − (ri − 1) × , a b  v v Fig. 5(a). To simplify the description, a message includes only  if k > σ and sk > sk−1. one packet. If a message includes multiple packets, our proof (9) is still available. It is assumed, to the contrary, that the Proof: The finish time of a packet is equal to the finish time extended path leads to reducing the conflicts, i.e., part of the of its previous packet plus its own delay δ. original transmissions of m , that cannot be covered by the a b {τ ,...,τ } extended transmissions introduces more conflicts. When there First, the calculation in the subset 1 σ is explained. τ exists only one link l in Π ∩ Π , only one transmission must For the first packet 1, its finish time is equal to its x1 i a τ ≤ ≤ σ be covered by the extended transmission since the link is in transmission time (see (8)). Then, for packet k (2 k ), the extended path and the message size is unchanged. When the following two conditions correspond to the first two lines δ τ multiple links are in Πi ∩ Πa, the transmission on lx can be of ( k). 1 ≤ σ ≤ covered by the extended transmissions. It is assumed that the Condition 1: k and sk sk−1. If there is no other < ≤ φ conflict outside Π , τ should be transmitted after τ − in transmission on lxy (1 y ) is the first transmission that i k k 1 cannot be covered (as shown in Figs. 5(b) and 5(c)). Note that succession (as shown in Fig. 6(a)). However, some external the proposed model adopts the shortest routing paths. The factors, such as the path outside Πi and the unknown Π τ number of the hops between lxy−1 and lxy in a is the same as relationship between the generation time of k and τ τ that in Πi, i.e., xy − xy−1 − 1. Then, the start time of the transmission time of k−1, may make k more delayed. This + − − × + / transmission on lxy is t (xy xy−1 1) (sa,b e) v, where t is additional delay is called a gap (as shown in Fig. 6(b)). When + / τ the end time of the transmission on lxy−1. This time is equal to the gap length is less than (sk e) v, the time duration of k the start time of the transmission on the extended path, which occupying l2 overlaps the time duration of τk−1 occupying l3. contradicts our assumption. Therefore, our path extension Thus, between τk and τk−1, no packet can be transmitted does not underestimate the conflicts from higher-priority continuously on l2 and l3. The gap can only be idle. In the messages. ■ worst case, the gap length is ((sk + e)/v) − ε, where ε ⃗ A packet set S (mi, j) is used to denote the transmitting order approaches 0. If the gap length is greater ((sk + e)/v) − ε, then 484 IEEE/CAA JOURNAL OF AUTOMATICA SINICA, VOL. 8, NO. 2, FEBRUARY 2021

Overlap Link Link ? then consider the relationship between them. (s +e)/v τk−1 τ k (s +e)/v k l1 τk−1 τk k For S (mi, j), the first two conditions of (9) are rewritten as Πi Πi l 2 follows.   , ≤ σ ≤ l3 0 if k and sk sk−1  sk + e  sk + e sk−1 + e Gap=(sk+e)/v−ε δ τ = − ε + − × − , ( k) 2 (ri 2) ( ) (11) Time Time v  v v Finish (τk−1) Finish (τk−1)  if k ≤ σ and sk > sk−1. (a) Best case (b) Worst case Condition 2 is not less than zero because ri ≥ 2 and > Fig. 6. Illustration of Condition 1. sk sk−1. Therefore, when all packets satisfy Condition 1, i.e., the packets are transmitted in a monotonic non-increasing τ l2 and l3 can be used continuously, and a subsequent packet order of sizes, the value of Finish( σ) is the minimum as can be inserted. Although it is possible that due to the delays follows ∑ introduced by external factors the subsequent packets in sk + e ⃗ Finish(τσ) = Finish(τ1) + 2 − ε. (12) S (mi, j) cannot be inserted, the packets of mi, j can be inserted v ∀τk∈S (mi, j)−{s1} because they do not have the path outside Πi. Therefore, the delay of δ(τk) is its transmission delay plus the worst-case For Mi, j, when all of its packets are∑ transmitted without + / gap. gaps, the minimum delay occurs, i.e., ∀τk∈Mi, j (sk e) v. This Condition 2: k ≤ σ and sk > sk−1. If the two packets can be is the same as Condition 3 of (9) in which the packets are in a transmitted in succession, the finish time of τk should be monotonic non-increasing order of sizes. τ Finish(τk−1) + ri × (sk + e)/v − (ri − 1) × (sk−1 + e)/v (as shown Then, we consider the relationship between σ (the last in Fig. 7(a)). However, due to unknown conflict, there may be packet of S (mi, j)) and τσ+1 (the first packet of Mi, j). − + / − ε If sσ+1 ≤ sσ, then a gap (sk 1 e) v between the two packets (as shown in ∑ s + e Fig. 7(b)). Hence, the finish time of τk is = τ + k Delaysσ+1≤sσ (mi, j) Finish( σ) ( ) ∀τ ∈ v sk + e sk−1 + e sk−1 + e k Mi, j Finish(τk−1) + ri × − (ri − 1) × + − ε . ∑ v v v s1 + e sk + e = (ri − 2) × + 2 × Then, for the packets of mi, j, since all of their conflicts have v ∀τ ∈ v ⃗ k S (mi, j) been included in S (mi, j), no unknown conflict results in gaps. ∑ + + sk e − ε. Thus, in Conditions 3 and 4, δk does not contain gap delays, (13) ∀τ ∈ v and the others are the same as Conditions 1 and 2. ■ k Mi, j If sσ+1 > sσ, then

σ+ + (Pi−1) ·(sk−1+e)/v s 1 e (Pi−1) ·(sk−1+e)/v Delay > (m , ) = Finish(τσ) + r × sσ+1 sσ i j i v + ∑ + P ·(s +e)/v Gap=(s +e)/v P ·(s +e)/v sσ e sk e i k k−1 −ε i k − r − × + Link ( i 1) Link v ∀τ ∈ −{τ } v τ τ k Mi, j σ+1 k−1 k τk−1 τk ∑ Π sk + e i Πi = Finish(τσ) + v ∀τk∈Mi, j + + Finish ( ) Time Finish ( ) Time sσ+1 e sσ e τk−1 τk−1 + (r − 1) × − (r − 1) × i v i v (a) Best case (b) Worst case + = + − × sσ+1 e Delaysσ+1≤sσ (mi, j) (ri 1) Fig. 7. Illustration of Condition 2. v sσ + e − (r − 1) × . (14) In Theorem 2, the impact of fragmentation and scheduling i v on the worst-case delay is formulated. Based on the theorem, < − × Hence, Delaysσ+1≤sσ (mi, j) Delaysσ+1>sσ (mi, j) since (ri 1) we can prove Corollary 1 and Corollary 2. Then, the worst- (sσ+1 + e)/v > (ri − 1) × (sσ + e)/v. Therefore, when all the case delay can be reduced by constructing a proper solution packets are in a monotonic non-increasing order of sizes, (13) based on the two corollaries. can be used to calculate the minimum delay, where s1 denotes ⃗ Corollary 1: When the packets of S (mi, j) are in a monotonic the maximum size. ■ non-increasing order of sizes, the minimum value of Corollary 2: Given a fixed number of packets, the value of Delay(mi, j) can be obtained as follows: Delay(mi, j) can be further decreased by minimizing smax. Proof: For the right-hand side of (10), the fourth term smax + e Delay(m , ) = (r − 2) × + approaches zero and can be ignored. The second term is i j i v ∑ ∑ rewritten as sk + e sk + e 2 × + − ε (10) ∑ ∀τ ∈ v ∀τ ∈ v 2 k S (mi, j) k Mi, j × (s , + |M , | × e). v a b a b ⃗ ∀m , ∈Ω(m , ) where smax is the maximum size in S (mi, j). a b i j Proof: First, we discuss S (mi, j) and Mi, j, respectively, and Since |Ma,b| is given, the second term is a fixed value. JIN et al.: JOINT ALGORITHM OF MESSAGE FRAGMENTATION AND NO-WAIT SCHEDULING FOR TIME-SENSITIVE NETWORKS 485

Similarly, the third term is also a fixed value. Therefore, only of JA and JA-EN also demonstrates this property. smax in the first term can be reduced to decrease the value of

Delay(mi, j). ■ Algorithm 2 PriorityAssignment (F)

C. Joint Algorithm Input: F Output: Ω⃗ The proposed joint algorithm is based on Corollary 1 and H 1: ∀ fi ∈ F, ∀ j ∈ [0, ),Γ = Γ + {mi, j}; Corollary 2. Packets are scheduled in non-increasing order of ∑ pi H 2: for each ρ = ∀ f ∈F to 1 do their sizes as described in Corollary 1. Thus, only when i pi ∈ Γ messages cannot be scheduled, the size of packets is reduced 3: for each mi, j do Γ to improve parallelism, and the number of packets is 4: assume that except mi, j, all of the others in have higher increased. This also conforms to our objective of minimizing priorities than mi, j; ′ ≤ × + the number of packets (see (1)). Then, messages are 5: if Delay (mi, j) j pi di then ρ = ρ Γ = Γ − { } fragmented into packets of s¯ bytes, where s¯ is a variable and 6: i, j ; mi, j ; break; 7: else , corresponds to smax of Corollary 1. If message mi j is ′ 8: χi, j = Delay (mi, j) − ( j × pi + di); unschedulable, the schedules of S (mi, j) are rolled back, and s¯ ρ is reduced such that the delay can be decreased as described in 9: if no message is assigned then ρ χ Corollary 2. In our heuristic algorithms, we do not need to 10: assign to the message with the minimum i, j ; Γ = Γ − { } consider invalid packets, which are only used in the OMT 11: mi, j ; Ω⃗ specification. 12: store messages in decreasing order of priority in set ; Ω⃗ If the last packet of a message is less than s¯ bytes, it is 13: return ; enlarged to s¯ bytes to satisfy Corollary 1. Intuitively, The proposed joint algorithm is shown in Algorithm 3. First, enlarging packets will increase delay. However, if the packet function PriorityAssignment() (Algorithm 2) is invoked to is not enlarged, it may introduce more delay to the subsequent assign priorities. PriorityAssignment() is an extension of the ′ packets. For example, s bytes are padded into a packet of sk classical priority assignment in [41]. The classical priority bytes. Then, based on (9), the delay of packet sk+1 is assignment traverses priorities from lowest to highest (Line 2) ′ and assigns a priority to a task if the task satisfies the sk + s + e sk+1 + e D = Finish(τk−1) + 2 × − ε + following condition: when all other unassigned tasks have v v s′ higher priorities than the task, the delay of the task is less than = B+ 2 × (15) its deadline (Lines 3–6). Under this assignment policy, the v initial order of all messages does not affect schedulability, where which has been proved in [41], [42]. The classical priority + + sk e sk+1 e assignment can reach an optimal solution only if the delay is B = Finish(τk−1) + 2 × − ε + . (16) v v exact. Our analyzed delay is the upper bound of the actual If the packet is not enlarged, the delay is delay. Therefore, if the analyzed message delay is less than

′ sk + e the deadline, then the assignment policy must make the D = Finish(τ − ) + 2 × − ε k 1 v message schedulable. s + + e s + e The prerequisite of calculating Delay() is the known message + r × k 1 − (r − 1) × k i v i v fragmentation. However, before PriorityAssignment(), s + − s message fragmentation is not determined. Thus, Delay() is = B+ (r − 1) × k 1 k . (17) i v changed as follows: ′ − The upper bounds of s and sk+1 sk are both s¯. Then, ′ MSS + e < + × / ′ < + − × / Delay (mi, j) = (ri − 2) × D B 2 s¯ v, and D B (ri 1) s¯ v. The upper bound v ′ ∑ of D is fixed, while D increases with path length ri. s , + 2 × + ⌈ a b ⌉ × Therefore, when a path includes more than three hops, (sa,b λ e) v ∀ ∈Ω enlarging the packet to s¯ can reduce the delay of the ma,b (mi, j) , subsequent packets. As shown in Fig. 8, when τk is enlarged, 1 si j + × (si, j + ⌈ ⌉ × e) − ε (18) the delay of τk+1 is reduced. In Section VI-B, the comparison v λ where λ is the minimum packet size and given by users. ′ τk−1 τk τk+1 Delay () is the upper bound of Delay() because MSS and Original packets ⌈sa,b/λ⌉ is the upper bound of smax and |Ma,b|, respectively. Thus, if Delay′() is less than the deadline, the current priority Enlarged τ k ρ , Enlarging packets can make message mi j schedulable (Lines 4–6); otherwise, the deviation, denoted as χi, j, between delay and deadline is calculated (Lines 7–8). Then, if the current priority is not Reduced delay assigned to any message, it is assigned to the message with the

minimum χi, j (Lines 7–11) because the message is most likely Fig. 8. Enlarging τ to reduce the delay of τ + . k k 1 schedulable at the current priority level. Algorithm 2 returns a 486 IEEE/CAA JOURNAL OF AUTOMATICA SINICA, VOL. 8, NO. 2, FEBRUARY 2021 message set Ω⃗ that includes all of the messages sorted in number of packets is our objective as shown in 1); and 3) decreasing order of priorities (Lines 12–13). execution time is the time required to generate a feasible Lines 4, 6, 9, 11 and 13 take constant time. ∑The number of solution. iterations of the for loop in Lines 2 and 3 are ∀ f ∈F H/pi. In The proposed joint algorithm (JA) and OMT-based method ′ i ∑Delay () and Delay∑(), the running time of the summations are are compared with the following five methods: ME, ME+AD, / / × ∀ fi∈F H pi and ∀ fi∈F H pi U , respectively. In addition, ME+EN, JA-EN and BL. the∑ time complexities of Lines 1, 10 and 12 are 1) ME adopts the traditional Ethernet fragmentation and the / | | O( ∀ fi∈F H pi), O( F ), and O(nlogn), respectively. However, earliest deadline first (EDF) scheduling algorithm [40]. they are less complex than the other lines and are ignored. 2) ME+AD is an extension of ME. If ME cannot schedule a Therefore, the time complexity of this function is determined test case under the initial MSS, ME+AD decreases the MSS 3 × ∆ by Lines 2, 3 and Delay(), i.e., O(∑M U), where M is the by the same parameter as JA and re-schedules the test case. = / Repeat this process until the test case can be scheduled. total number of messages, and M ∀ fi∈F H pi. 3) ME+EN is also an extension of ME. The only difference Algorithm 3 Joint algorithm between ME+EN and ME is that ME+EN enlarges the last packets of messages, while ME does not. Input: F, ∆ 4) JA-EN is the same as JA except that it does not enlarge Output: ∀M , , ∀w , , i j i j g the last packets. 1: Ω⃗ = PriorityAssignment(F); 5) BL is a baseline for comparisons. To illustrate the 2: s¯ = MSS; k = 1; efficiency of JA, an optimal result is needed. However, in an 3: while k ≤ |Ω⃗ | do acceptable time, off-the-shelf solvers can only find optimal 4: fragment the k-th message, denoted mi, j, into multiple packets results for simple problems. Thus, when optimal results of s¯ bytes; cannot be found, an approximately optimal result is needed, 5: for each τi, j,g ∈ Mi, j do + = × × + − si, j,g e × which is BL in our evaluations. When a schedulable ratio is 6: for t ( j pi) to ( j pi di v ri) do considered, BL is defined as the percentage of test cases that 7: if NoCon f lict(t,τi, j,g) then satisfy the following necessary condition: if all port 8: wi, j,g = t; break;

9: if ∃τi, j,g ∈ Mi, j is not scheduled then utilizations are not larger than 100%, the test case is

10: k = the smallest ID in Ω(mi, j); schedulable. When the comparison metric is the number of 11: ∀g ∈ [k,|Ω⃗ |], clear the injection times of the g-th message; packets, BL is the sum of packets that are fragmented by the //*schedules roll back*// traditional Ethernet fragmentation. Thus, BL is better than or 12: s¯− = ∆; equal to the optimal result. If the proposed algorithm is close 13: if s¯ < λ then to BL, it must be close to the optimal results. 14: return FAIL; All algorithms are written in C and run on a Windows 15: else machine with a 2.5 GHz CPU and 8 GB of memory. 16: k + +; All test cases are generated based on the parameters shown 17: return ∀Mi, j, ∀wi, j,g; in Table I. Each test case includes n/2 switches and n/2 end systems. Each switch has 4 ports, one of which is connected to Then, Algorithm 3 fragments and schedules messages in the an end system, and the remaining ports can be connected to order they are in set Ω⃗ (Lines 3–16). Messages are fragmented other switches. The switch-end system pairs are randomly into packets of s¯ bytes (Line 4), and the range of s¯ is from located on a two-dimensional coordinate system. Then, the MSS to λ (Lines 2 and 12–14). If a packet cannot be switches are traversed in increasing order of ID. For each scheduled under the current s¯ (Line 9), s¯ is reduced by a given switch, the free ports are connected to the nearest switches step value ∆ (Line 12), and the messages that directly effect that still have free ports. To illustrate the universality of our the packet are re-fragmented and re-scheduled (Lines 10 and algorithms, we do not limit the network topology. There are f 11). The process of generating schedules (Lines 5–8) is the flows. The flows randomly select their source and destination same as that in Algorithm 1. end systems. Routing paths are generated using the Dijkstra Owing to the roll-back operation (Line 11), the time algorithm. To restrict the length of the hyperperiod, we adopt complexity of Algorithm 3 is pseudo-polynomial. The number harmonic periods. Each period is a random number that is in of rollbacks is up to MSS /∆, and the time complexity of the period range p and conforms to the expression 400 μs × 2n, lines 3–7 has been calculated in Algorithm 1. Therefore, the where n is a non-negative integer. Message sizes are random time complexity of Algorithm 3 is O(|F| × H × U × |N| × MSS / ∆ ∆ numbers in the message size range s. The deadline of a flow is ), where U and are given by users. also a random number in the range [pi/2, pi]. The transmission VI. Evaluation speed v = 31 bytes/μs is measured in our simple prototype In this section, we evaluate the proposed algorithms based network such that the logical time of test cases can be mapped on extensive test cases. Three metrics are used in our to the physical time. In the following comparisons, these evaluation: 1) schedulable ratio is the percentage of test cases parameters are changed and extensive test cases are generated for which an algorithm can find a feasible solution; 2) the to evaluate the efficiency and universality of the proposed JIN et al.: JOINT ALGORITHM OF MESSAGE FRAGMENTATION AND NO-WAIT SCHEDULING FOR TIME-SENSITIVE NETWORKS 487

TABLE I 300 s, it is difficult to solve these test cases in 600 s. The test Parameters cases in Fig. 10 (b) have short execution times because they n Number of network nodes include the minimum number of packets, i.e., the minimum f Number of flows number of variables. Therefore, if a problem includes a small p Period range number of variables, OMT can be adopted to find optimal

s Message size range results; otherwise, JA is the best choice. ∆ Step value to reduce packet size B. Evaluations Without OMT algorithm. In Section VI-A, we compare OMT with JA, ME To fully evaluate the proposed algorithm, in the following, and BL. Then, due to the long execution time of the OMT each parameter is changed over a larger range to generate test solver, in Section VI-B, except OMT, the other algorithms are cases. The basic setting is shown in the caption of Fig. 11.

fully evaluated. Then, Figs. 12–15 change the period parameter, size parameter, step value, and number of flows, respectively, to A. Evaluations With OMT illustrate the efficiency and universality of JA. For each The Microsoft solver Z3 [43] is invoked to solve the OMT parameter setting, 2000 test cases are generated. specification. The time limit of Z3 is set as 600 s. To make

n = f = p = { , } test cases solvable, 4, 4, and 400 μs 800 μs are 1.4 7 set. The other parameters are shown in Fig. 9. For each JA ME JA 1.2 BL ME+EN 6 BL parameter setting, 200 test cases are generated. Figs. 9 (a) and 1.0 ME+AD JA-EN 5 ME+AD 9(b) show schedulable ratios and the number of packets 0.8 4 JA-EN normalized with BL, respectively. Since ME adopts traditional 0.6 3 fragmentation, it does not introduce extra packets. However, 0.4 2 Schedulable ratio the traditional algorithm has the worst schedulable ratio. OMT 0.2 Number of packets 1 is not better than JA because Z3 cannot find optimal results 0 0 10 20 30 40 50 60 10 20 30 40 50 60 for some test cases within the time limit. The schedulable ratio n n (a) Schedulable ratio (b) Number of packets of JA is close to BL. This illustrates that JA can significantly improve the schedulability. Other than OMT, the other Fig. 11. Comparisons under varying n, p=[800 μs,6.4 ms], s=[1461,5480], algorithms can finish execution within 10 ms. The execution ∆ = 146, and f = n. time of OMT is shown in Fig. 10. Most of the test cases can be solved by Z3 in 300 s. Once the execution time exceeds 1.4 12 JA ME JA

1.2 BL ME+EN 10 BL 1.4 1.8 1.0 ME+AD JA-EN ME+AD 1.2 JA ME OMT BL JA ME OMT BL 8 1.6 0.8 JA-EN 1.0 6 0.8 1.4 0.6 4 0.6 0.4

1.2 Schedulable ratio Number of packets 2 0.4 0.2

Schedulable ratio 1.0 0.2 Number of packets 0 0 10 20 30 40 50 60 10 20 30 40 50 60 0 0.8 n n s = [1461, s = [1461, s = [1461, s = [1461, s = [1461, s = [1461, (a) Schedulable ratio (b) Number of packets 2920] 2920] 5840] 2920] 2920] 5840] Δ = 146 Δ = 292 Δ = 146 Δ = 146 Δ = 292 Δ = 292 = , . = , (a) Schedulable ratio (b) Number of packets Fig. 12. Comparisons under varying n, p [800 μs 12 8 ms], s [1461 , ∆ = , and f = n. 5480] 146 Fig. 9. Evaluations of OMT.

1.4 7 JA ME 600 600 600 JA 1.2 BL ME+EN 6 BL 500 500 500 1.0 ME+AD JA-EN 5 ME+AD 400 400 400 0.8 4 JA-EN 300 300 300 0.6 3 200 200 200 0.4 2 Schedulable ratio Number of packets

Execution time (s) 100 Execution time (s) 100 Execution time (s) 100 0.2 1 0 0 0 0 0 0 100 200 0 100 200 0 100 200 10 20 30 40 50 60 10 20 30 40 50 60 Test cases Test cases Test cases n n (a) Schedulable ratio (b) Number of packets (a) 1461≤ s ≤2920 (b) 1461≤ s ≤2920 (c) 1461≤ s ≤5840 Δ = 146 Δ = 292 Δ = 146 Fig. 13. Comparisons under varying n, p = [800 μss,6.4 ms], s = [1461, Fig. 10. Execution time of OMT. 8760], ∆ = 146, and f = n.

488 IEEE/CAA JOURNAL OF AUTOMATICA SINICA, VOL. 8, NO. 2, FEBRUARY 2021

1.4 7 JA ME JA schedulable ratios and the number of packets, as shown in 1.2 BL ME+EN 6 BL Fig. 12. Compared to Fig. 11(a), Fig. 12(a) shows higher 1.0 ME+AD JA-EN 5 ME+AD schedulable ratios because more time resources can be utilized JA-EN 0.8 4 to make flows satisfy real-time constraints. Then, since the 0.6 3 real-time constraints are easily satisfied, messages do not need 0.4 2 Schedulable ratio

Number of packets to be fragmented into small packets to reduce the delay. Thus, 0.2 1 in Fig. 12(b), the number of packets of JA is less than that in 0 0 10 20 30 40 50 60 10 20 30 40 50 60 Fig. 11(b). ME+AD makes full use of the time resources to n n (a) Schedulable ratio (b) Number of packets improve the schedulability but results in too many packets. Compared to ME+AD, JA not only guarantees the Fig. 14. Comparisons under varying n, p = [800 μs,6.4 ms], s = [1461, requirements of deterministic communications but also 5480], ∆ = 438, and f = n. reserves more resources for best-effort communications. In Fig. 13, the message size is larger than that of other

1.4 7 figures. Since large messages are difficult to schedule, the ME JA BL 1.2 JA 6 schedulable ratios in Fig. 13(a) are low. Compared to JA, BL ME+EN ME+AD JA-EN although ME+AD improves the schedulable ratio by 3%, it 1.0 ME+AD JA-EN 5 0.8 4 consumes more than twice the resources. JA-EN is close to JA 0.6 3 because large messages can be fragmented into more packets 0.4 2 so that JA-EN searchs fesabile solutions in a larger solution Schedulable ratio 0.2 Number of packets 1 space. When n ≥ 50, no algorithm can schedule test cases, not 0 0 even BL. Therefore, in real applications, the periods and 15 20 25 30 35 40 15 20 25 30 35 40 f f deadlines of large messages must be extended to make them (a) Schedulable ratio (b) Number of packets schedulable, as illustrated in Fig. 12.

In Fig. 14, ∆ is increased. Messages cannot be fragmented = , . = , Fig. 15. Comparison under varying f , p [800 μs 6 4 ms], s [1461 into small pieces. Both the schedulable ratio and the number 5480], ∆ = 146, and n = 20. of packets are reduced. Other than ∆, if the other parameters are changed, test cases are changed accordingly. Only ∆ is Fig. 11 shows the comparison with a varying the number of independent of test cases. Therefore, when given a real network nodes. The schedulable ratio decreases as the number network, we can adjust ∆ to trade off between the schedulable of nodes increases. This is because the greater the number of ratio and the number of packets. nodes, the more difficult the scheduling becomes. ME+EN is Fig. 15 shows the comparison under varying f . As f better than ME. This is because ME makes more resource increases, the increased packets make more test cases fragmentations unusable and cannot provide a long enough unschedulable, and thus the schedulable ratio decreases. The time duration to transmit no-wait packets. Fig. 11 (b) shows number of packets increases slowly, because the test cases that the number of packets normalized with BL. ME and ME+EN contain more packets are unschedulable, and these are not shown since they both adopt the traditional unschedulable test cases are not included in the figure. When fragmentation and have the same number of packets with BL. there are not many flows, the schedulable ratio of ME+EN is The larger the network, the longer the routing path. Thus, significantly higher than that of ME. Therefore, ME+EN is an when n = 60, to make flows schedulable, smaller packets are efficient method for low-load test cases. needed to improve the parallelism such that the delay introduced by the long path can be reduced. Therefore, the VII. Conclusions number of packets increases as n increases. Compared to ME, In this paper, to improve the real-time performance of although JA introduces more packets, it still can find feasible networked control systems, the joint problem of message solutions. This is because, in ME, different packet sizes cause fragmentation and no-wait scheduling is explored. First, an excessive resource fragmentations. ME+AD uses many small OMT specification is proposed and then an off-the-shelf packets to improve the parallelism. Thus, the schedulable ratio solver utilized to find its optimal solution. However, the of ME+AD is slightly better than JA, but the number of OMT-based algorithm is limited by the execution time of the packets of ME+AD is more than double that of JA. Therefore, off-the-shelf solver. Then, to improve the scalability of our in a network, if there are sufficient resources, the smaller MSS algorithm, a heuristic algorithm is proposed. The algorithm should be adopted to improve the real-time performance. constructs low-delay fragmentation and schedules based on Compared to JA, JA-EN causes more resource the message delay analysis. Finally, extensive test cases are fragmentations. Hence, even though part of messages are conducted to evaluate the proposed algorithms. The results fragmented into smaller packets to improve the parallelism, indicate that the proposed joint algorithm outperforms existing the schedulable ratio of JA-EN is still less than that of JA. ones. In the future, we will implement a TSN switch based on The period range is extended to illustrate its effect on FPGA and evaluate our algorithms in a real TSN. JIN et al.: JOINT ALGORITHM OF MESSAGE FRAGMENTATION AND NO-WAIT SCHEDULING FOR TIME-SENSITIVE NETWORKS 489

[2] J. Lee and S. Park, “Time-sensitive network (TSN) experiment in

sensorbased integrated environment for autonomous driving,” Sensors,

Appendix vol. 19, no. 5, pp. 1–11, May 2019.

Symbol [3] L. Huang, Y. Liang, Y. Zhang, Y. Wang, and Q. Wang, “Time-sensitive network technology and its application in energy internet, ” in Proc. N Node set IEEE Int. Conf. Energy Internet, 2019, pp. 211–216.

[4] J. Jiang, Y. Li, S. H. Hong, A. Xu, and K. Wang, “A time-sensitive L Link set networking (TSN) simulation model based on OMNET++, ” in Proc. ni The i-th node IEEE Int. Conf. Mechatronics and Automation, 2018, pp. 643–648.

[5] S. S. Craciunas, R. S. Oliver, M. Chmelík, and W. Steiner, “Scheduling li, j The link between n and n i j real-time communication in IEEE 802.1 Qbv time sensitive networks, ” F Flow set in Proc. 24th Int. Conf. Real-Time Networks and Systems, 2016, pp. 183–192. fi The i-th flow

[6] Cisco Systems, Inc. (2020) Cisco industrial Ethernet 4000, 4010 and pi Period of fi 5000 switch software configuration guide. [Online]. Available: di Relative deadline of fi https://www.cisco.com. Accessed on: May 20, 2020.

[7] N. Semiconductors. (2016) SJA1105 product data sheet. [Online]. si Message size of fi Available: https://www.nxp.com/docs/en/data-sheet/SJA1105.pdf. Acce- Π i Path of fi ssed on: May 20, 2020. , mi j The j-th message of fi [8] F. Dürr and N. G. Nayak, “No-wait packet scheduling for IEEE time- sensitive networks (TSN)” in Proc. 24th Int. Conf. Real-Time Networks H Hyperperiod and Systems, 2016, pp. 203–212. v Transmission speed

[9] X. Jin, F. Kong, L. Kong, W. Liu, and P. Zeng, “Reliability and ri Number of hops in path Πi temporality optimization for multiple coexisting WirelessHART networks in industrial environments,” IEEE Trans. Ind. Elect., vol. 64, U A message is fragmented into, at most, U packets. no. 8, pp. 6591–6602, 2017. τ , , i j g The g-th packet of mi, j [10] J. Tang, B. Shim, and T. Q. Quek, “Service multiplexing and revenue , , wi j g Injection time of τi, j,g maximization in sliced C-RAN incorporated with URLLC and multicast eMBB,” IEEE J. Sel. Area. Commun., vol. 37, no. 4, pp. 881–895, 2019. , , si j g Size of τi, j,g

[11] L. Kong, M. Xia, X. Y. Liu, G. Chen, Y. Gu, M. Y. Wu, and X. Liu, ui, j,g If ui, j,g = 1, τi, j,g is valid. “Data loss and reconstruction in wireless sensor networks,” IEEE Trans. e Size of a packet header Parall. and Distrib. Sys., vol. 25, no. 11, pp. 2818–2828, 2013.

[12] R. S. Oliver, S. S. Craciunas, and W. Steiner, “IEEE 802.1 Qbv gate Ai, j,g(la,b) Time of starting to transmit τ , , on l , i j g a b control list synthesis using array theory encoding, ” in Proc. IEEE Real- qi,a,b Location of la,b in Πi Time and Embedded Technology and Applications Symp., 2018, pp. ρ , 13–24. i j Priority of mi, j

[13] X. Jin, C. Xia, N. Guan, C. Xu, D. Li, Y. Yin, and P. Zeng, “Real-time Γ Unscheduled message set scheduling of massive data in time sensitive networks with a limited Mi, j Valid packet set of mi, j number of schedule entries,” IEEE Access, vol. 8, no. 1, pp. 6751–6767, 2020. t Time variable

[14] N. G. Nayak, F. Dürr, and K. Rothermel, “Incremental flow scheduling Ω(m , ) i j Conflict set (Definition 1) and routing in time-sensitive software-defined networks,” IEEE Trans. , S (mi j) Set of packets that belong to Ω(mi, j) Ind. Informat., vol. 14, no. 5, pp. 2066–2075, 2017.

{ ,..., } [15] M. L. Raagaard, P. Pop, M. Gutiérrez, and W. Steiner, “Runtime l1 lri Links in Π i reconfiguration of time-sensitive networking (TSN) schedules for fog φ Number of links in Πi ∩ Πa computing, ” in Proc. IEEE Fog World Congress, 2017, pp. 1–6. lx y The xy-th link in Πi ∩ Πa, and y ∈ (1,φ] [16] V. Gavriluţ, L. Zhao, M. L. Raagaard, and P. Pop, “AVB-aware routing ⃗ and scheduling of time-triggered traffic for TSN,” IEEE Access, vol. 6, S (mi, j) Set of ordered packet no. 11, pp. 75 229–75 243, 2018. σ Number of packets in S (mi, j) [17] J. Falk, F. Dürr, and K. Rothermel, “Exploring practical limitations of ς Number of packet in Mi, j joint routing and scheduling for TSN with ILP, ” in Proc. 24th Int. Conf. on Embedded and Real-Time Computing Systems and {τ ,...,τ } ⃗ 1 σ+ς Packets in S (mi, j) Applications, 2018, pp. 136–146.

ε A value close to 0 [18] A. Nasrallah, A. S. Thyagaturu, Z. Alharbi, C. Wang, X. Shao, M. Reisslein, and H. Elbakoury, “Performance comparison of IEEE 802.1 smax The maximum size in S (mi, j) TSN time aware shaper (TAS) and asynchronous traffic shaper (ATS),” ′ s¯, s Size variables IEEE Access, vol. 7, no. 4, pp. 44 165–44 181, 2019.

[19] L. Zhao, P. Pop, Z. Zheng, and Q. Li, “Timing analysis of AVB traffic λ The minimum packet size in TSN networks using network calculus, ” in Proc. IEEE Real-Time χ i, j The deviation between delay and deadline and Embedded Technology and Applications Symp., 2018, pp. 25–36.

⃗ Ω Set of sorted messages [20] I. Ferretti and L. Zavanella, “Batch energy scheduling problem with nowait/blocking constraints for the general flow-shop problem,” ∆ A step value used to adjust packet sizes Procedia Manufacturing, vol. 42, no. 4, pp. 273–280, 2020.

[21] F. Della Croce, A. Grosso, and F. Salassa, “Minimizing total completion References time in the two-machine no-idle no-wait flow shop problem,” J. Heuristics, vol. 19, no. 10, pp. 1–15, 2019.

[1] P. Pop, M. L. Raagaard, M. Gutierrez, and W. Steiner, “Enabling fog

[22] X. Wang, K. Xing, Y. Feng, and Y. Wu, “Scheduling of flexible computing for industrial automation through time-sensitive networking manufacturing systems subject to no-wait constraints via Petri nets and (TSN),” IEEE Commun. Stand. Mag., vol. 2, no. 2, pp. 55–61, Jul. 2018. heuristic search,” IEEE Trans. Syst., Man, and Cybern.: Syst., vol. 49, 490 IEEE/CAA JOURNAL OF AUTOMATICA SINICA, VOL. 8, NO. 2, FEBRUARY 2021

no. 12, pp. 1–12, 2019. [40] J. Liu, Real-time Systems. Upper Saddle River, USA: Prentice Hall,

[23] M. Behnam, R. Marau, and P. Pedreiras, “Analysis and optimization of 2000. the MTU in real-time communications over switched Ethernet, ” in

[41] N. C. Audsley, “On priority asignment in fixed priority scheduling,” Inf. Proc. 16th Conf. Emerging Technologies and Factory Automation, 2011, pp. 1–7. Process. Lett., vol. 79, no. 1, pp. 39–44, 2001.

[24] M. Ashjaei, M. Behnam, L. Almeida, and T. Nolte, “MTU configuration [42] A. Saifullah, Y. Xu, C. Lu, and Y. Chen, “End-to-end delay analysis for for real-time switched Ethernet networks,” J. Syst. Architect., vol. 70, fixed priority scheduling in WirelessHART networks, ” in Proc. 17th no. 4, pp. 15–25, 2016. IEEE Real-Time and Embedded Technology and Applications Symp.,

[25] K. B. Gemlau, J. Peeck, N. Sperling, P. Hertha, and R. Ernst, “A new 2011, pp. 13–22. design for data-centric Ethernet communication with tight synchronization requirements for automated vehicles, ” in Proc. 45th [43] N. Bjorner and A. D. Phan, “vZ – maximal satisfaction with Z3, ” in Ann. Conf. IEEE Industrial Electronics Society, 2019, pp. 4489–4494. Proc. 6th Int. Symp. Symbolic Computation in Software Science, 2014, pp. 1–9. [26] N. Chaudhari, K. C. Ananthoju, and S. Isac, “Comparative analysis of large data transfer in automotive applications using Ethernet switched networks, ” in Proc. Symp. Int. Automotive Technology, 2019. Xi Jin (M’17) received the Ph.D. degree in computer

[27] B. Shin, J. Abdullayev, and D. Lee, “An efficient MAC layer packet science from Northeastern University, China in 2013. fragmentation scheme with priority queuing for real-time video She is currently an Associate Professor with the streaming, ” in Proc. IEEE 41st Conf. Local Computer Networks, 2016, Shenyang Institute of Automation, Chinese Academy pp. 69–77. of Sciences, China. She is also a committee member of the China Computer Federation Technical [28] J. Abdullayev, B. Shin, and D. Lee, “A dynamic packet fragmentation extension to high throughput WLANs for real-time H264/AVC video Committee on Embedded Systems (CCF TCEBS). streaming, ” in Proc. 10th Int. Conf. Future Internet, 2015, pp. 1–4. Her research interests include industrial networks, real-time systems, and internet of things.

[29] I. Suciu, X. Vilajosana, and F. Adelantado, “An analysis of packet fragmentation impact in LPWAN, ” in Proc. IEEE Wireless Communications and Networking Conf., 2018, pp. 1–6. Changqing Xia (M’17) received the Ph. D. degree [30] S. A. Awwad, N. K. Noordin, B. M. Ali, F. Hashim, and N. H. A. Ismail, “6LoWPAN route-over with end-to-end fragmentation and from Northeastern University, China in 2015. He is reassembly using cross-layer adaptive backoff exponent,” Wirel. Pers. currently an Associate Professor at Shenyang Commun., vol. 98, no. 1, pp. 1029–1053, 2018. Institute of Automation, Chinese Academy of Sciences, China. His research interests include [31] X. Bao, Y. Zhang, D. Guo, and M. Song, “An optimization model for wireless sensor networks, edge computing and real- fragmentation-based routing in delay tolerant networks,” Science China time systems, especially the real-time scheduling Information Sciences, vol. 59, no. 1, pp. 1–16, 2016. algorithms, and smart energy systems.

[32] N. Naaman and R. Rom, “Packet scheduling with fragmentation, ” in Proc. 21st Ann. Joint Conf. IEEE Computer and Communications Societies, 2002, pp. 427–436.

[33] E. Suethanuwong, “Message fragmentation of event-triggered traffic in Nan Guan received the B.E. and M.S. degrees from TTEthernet systems using the timely block method, ” in Proc. Int. Conf. Northeastern University, China in 2003 and 2006, Computational Techniques in Information and Communication respectively, and the Ph.D. degree from Uppsala Technologies, 2016, pp. 450–458. University, Sweden in 2013. He is currently an

[34] N. Semiconductors. (2017) Software user manual for SJA1105TEL. Assistant Professor at the Department of Computing, [Online]. Available: https://www.nxp.com/docs/en/user-guide/UM10944. the Hong Kong Polytechnic University, China. His pdf. Accessed on: May 20, 2020. research interests include real-time embedded systems and cyber-physical systems. He received the

[35] S. M. Laursen, P. Pop, and W. Steiner, “Routing optimization of AVB European Design and Automation Association streams in TSN networks,” ACM Sigbed Review, vol. 13, no. 4, (EDAA) Outstanding Dissertation Award in 2014, pp. 43–48, 2016. the Best Paper Award of Real-Time Systems Symposium (RTSS) 2009 and [36] V. Gavrilut, B. Zarrin, P. Pop, and S. Samii, “Fault-tolerant topology Design, Automation, and Test in Europe (DATE) 2013. and routing synthesis for IEEE time-sensitive networking, ” in Proc. 25th Int. Conf. Real-Time Networks and Systems, 2017, pp. 267–276.

[37] P. Jayachandran and T. Abdelzaher, “A delay composition theorem for Peng Zeng received the Ph.D. degree from the real-time pipelines, ” in Proc. 19th Eur. Conf. Real-Time Systems, 2007, Shenyang Institute of Automation, Chinese Academy pp. 29–38. of Sciences, China in 2005. He is currently a Professor at the Shenyang Institute of Automation, [38] A. Cimatti, A. Franzen, A. Griggio, R. Sebastiani, and C. Stenico, Chinese Academy of Sciences, China. His research “Satisfiability modulo the theory of costs: foundations and interests include industrial communication and applications,” in Proc. Int. Conf. Tools and Algorithms for the wireless sensor networks. Construction and Analysis of Systems, 2010, pp. 99–113.

[39] E. B. Clark, T. A. Henzinger, H. Veith, and R. Bloem, “Satisfiability modulo theories, ” Handbook of Model Checking, pp. 305–343, 2018.