Performance Analysis of Networks on Chips

Total Page:16

File Type:pdf, Size:1020Kb

Performance Analysis of Networks on Chips Performance analysis of networks on chips Citation for published version (APA): Beekhuizen, P. (2010). Performance analysis of networks on chips. Technische Universiteit Eindhoven. https://doi.org/10.6100/IR657033 DOI: 10.6100/IR657033 Document status and date: Published: 01/01/2010 Document Version: Publisher’s PDF, also known as Version of Record (includes final page, issue and volume numbers) Please check the document version of this publication: • A submitted manuscript is the version of the article upon submission and before peer-review. There can be important differences between the submitted version and the official published version of record. People interested in the research are advised to contact the author for the final version of the publication, or visit the DOI to the publisher's website. • The final author version and the galley proof are versions of the publication after peer review. • The final published version features the final layout of the paper including the volume, issue and page numbers. Link to publication General rights Copyright and moral rights for the publications made accessible in the public portal are retained by the authors and/or other copyright owners and it is a condition of accessing publications that users recognise and abide by the legal requirements associated with these rights. • Users may download and print one copy of any publication from the public portal for the purpose of private study or research. • You may not further distribute the material or use it for any profit-making activity or commercial gain • You may freely distribute the URL identifying the publication in the public portal. If the publication is distributed under the terms of Article 25fa of the Dutch Copyright Act, indicated by the “Taverne” license above, please follow below link for the End User Agreement: www.tue.nl/taverne Take down policy If you believe that this document breaches copyright please contact us at: [email protected] providing details and we will investigate your claim. Download date: 29. Sep. 2021 Performance Analysis of Networks on Chips THOMAS STIELTJES INSTITUTE FOR MATHEMATICS CIP-DATA LIBRARY TECHNISCHE UNIVERSITEIT EINDHOVEN Beekhuizen, Paul Performance analysis of networks on chips / by Paul Beekhuizen – Eindhoven : Technische Universiteit Eindhoven, 2009. Proefschrift. – ISBN 978-90-386-2144-9 NUR 919 Subject headings : Queueing Theory / Networks on Chips 2010 Mathematics Subject Classification : 60K25, 68M20, 90B18 Printed by Eindhoven University of Technology Printservice. Cover design by Paul Verspaget. Performance Analysis of Networks on Chips proefschrift ter verkrijging van de graad van doctor aan de Technische Universiteit Eindhoven, op gezag van de rector magnificus, prof.dr.ir. C.J. van Duijn, voor een commissie aangewezen door het College voor Promoties in het openbaar te verdedigen op donderdag 4 februari 2010 om 16.00 uur door Paul Beekhuizen geboren te Rotterdam Dit proefschrift is goedgekeurd door de promotor: prof.dr.ir. O.J. Boxma Copromotor: dr. J.A.C. Resing Acknowledgements There are many people who, in one way or another, made it possible for me to complete this thesis, and I would like to thank a few of them in particular. First of all, I would like to thank my daily supervisor and co-promotor Jacques Resing. We worked together on most of the research in this thesis, and I enjoyed our cooperation very much. I have especially benefitted a lot from Jacques’ continuous search for the simplest non-trivial model, which proved insightful many times. Next, I would like to thank Dee Denteneer from Philips Research for his guidance and supervision during my period as a PhD and an MSc student. Dee has had a prominent influence on my research over the last five years and I doubt that, without his efforts, I would have pursued a PhD. I also wish to thank Onno Boxma for his supervision. Onno somehow manages to find time for anyone who can benefit from his help and he provided me with valuable feedback on multiple occasions, for which I am very grateful. I am very grateful as well for the funding I received from Philips Research and for the freedom that was given to me. Thanks go out to my fellow PhD students at Philips for making my stay there a pleasant one. During my period as a PhD student, I had an additional affiliation with Eu- random, and I would like to thank everyone at Eurandom for the great time I had there. Eurandom has been a very lively and socially active environment, and working in such an environment seems almost indispensable for the successful completion of a PhD thesis. I will greatly miss the countless diners, poker nights, football and foosball matches, coffee breaks, social excursions, Eurandom lunches, and all other events which I cannot think of at the moment. As there cannot be a PhD defense without a defense committee a word of thanks is due to its president and its members: Ivo Adan, Harm Dorren, Rob van der Mei, Sindo N´u˜nez Queija, and my three supervisors. Ivo is also one of the co-authors of [13] and I wish to thank him for our cooperation on that paper as well. On a more personal note, I owe thanks to (all) my parents, the rest of my family, and my friends for their support and interest. Finally, I thank my wife Ivonne for her unconditional trust in my abilities and for supporting me in whatever I choose to do. v Contents 1 Introduction 1 1.1 Networksonchips ............................ 2 1.1.1 Qualityofservice......................... 3 1.1.2 Flowcontrol ........................... 4 1.1.3 Networktopologies.. .. .. .. .. .. .. .. .. .. 5 1.2 Switches.................................. 6 1.2.1 Bufferingstrategies. 7 1.2.2 Throughput............................ 9 1.2.3 Switchesinnetworksonchips . 11 1.3 Queueingtheory ............................. 11 1.3.1 Generalqueueingsystems . 11 1.3.2 Arrival models in discrete-time queueing systems . .... 13 1.4 Models................................... 16 1.4.1 Single-switchmodels . 16 1.4.2 Concentrating tree networks of polling stations . .... 19 1.5 Keyresultsandorganisationofthethesis . ... 22 2 Uniform switches 25 2.1 Model ................................... 26 2.2 Approximations for K =1........................ 27 2.2.1 TheKHMapproximation . 28 2.2.2 TheKKLapproximation . 29 2.2.3 Geometricapproximation . 29 2.3 Service time approximation for K > 1 ................. 31 2.4 Networkanalysis ............................. 32 2.4.1 Arrivalsattheswitch . 33 2.4.2 ArrivalsattheNI ........................ 33 2.5 Approximationcomparison . 34 2.6 Validation of the geometric distribution . .... 38 2.7 Conclusion ................................ 40 3 Non-uniform switches 41 3.1 Model ................................... 42 vii viii Table of contents 3.2 Saturatedswitch ............................. 42 3.3 Stabilityconditions............................ 43 3.4 Throughput................................ 47 3.5 Waitingtimeandservicerate . 50 3.5.1 Waitingtime ........................... 51 3.5.2 Servicerate............................ 51 3.6 Analysisoftherunningexample . 55 3.6.1 Throughput and stability conditions . 56 3.6.2 Servicerateandmeanwaitingtime . 57 3.7 Correlatedtraffic ............................. 60 3.7.1 Correlatedarrivals . 60 3.7.2 Correlateddestinations . 61 3.8 Numericalanalysis ............................ 62 3.9 Conclusion ................................ 67 4 Reduction of polling tree networks 69 4.1 Introduction................................ 70 4.2 Formalisation ............................... 72 4.2.1 Theoriginalsystem . .. .. .. .. .. .. .. .. .. 72 4.2.2 Thereducedsystem . .. .. .. .. .. .. .. .. .. 75 4.3 Proofofthemainresults. .. .. .. .. .. .. .. .. .. .. 76 4.3.1 Thecoupledsystem . .. .. .. .. .. .. .. .. .. 76 4.3.2 Theoriginalandthecoupledsystem . 77 4.3.3 Thereducedandthecoupledsystem . 80 4.3.4 Waitingtimes........................... 82 4.4 Discussion................................. 83 4.5 Conclusion ................................ 84 5 End-to-end delays in polling tree networks 85 5.1 Model ................................... 86 5.2 Analysisofthetree............................ 88 5.2.1 Overallend-to-enddelay. 88 5.2.2 End-to-enddelaypertype. 89 5.3 AccuracyofApproximation5.2.2 . 91 5.4 Applicationtonetworksonchips . 93 5.4.1 Description ............................ 93 5.4.2 Analysis.............................. 94 5.4.3 Numericalresults. .. .. .. .. .. .. .. .. .. .. 96 5.5 Exactresults ............................... 98 5.5.1 Symmetricstations. 98 5.5.2 Symmetrictrees ......................... 99 5.6 Conclusion ................................ 100 6 Polling systems with Bernoulli service and Markovian routing 103 6.1 Background................................ 104 Table of contents ix 6.2 Modeldescription............................. 105 6.3 Relevantliterature . .. .. .. .. .. .. .. .. .. .. .. 106 6.4 Theapproximation. .. .. .. .. .. .. .. .. .. .. .. 107 6.5 Numericalresults............................. 112 6.6 Large-scalenumericalstudy . 115 6.7 Implementation with Kronecker products . 118 6.8 Conclusion ................................ 121 7 Polling tree networks with flow control 123 7.1 Model ................................... 124 7.2 Markovchainanalysis . 125 7.3 Throughputcomputation . 129 7.4 Bernoulliserviceand Markovianrouting . 130 7.5 Numericalanalysis ............................ 134 7.6 Conclusion ................................ 136 References 139 Summary 151 About the author 153 x Table of contents Chapter 1 Introduction Modules on a chip, such as processors and memories,
Recommended publications
  • Instability Phenomena in Underloaded Packet Networks with Qos Schedulers M.Ajmone Marsan, M.Franceschinis, E.Leonardi, F.Neri, A.Tarello
    Instability Phenomena in Underloaded Packet Networks with QoS Schedulers M.Ajmone Marsan, M.Franceschinis, E.Leonardi, F.Neri, A.Tarello Abstract— Instability in packet-switching networks is normally The first results showing that instability can happen also in associated with overload conditions, since queueing network mod- underloaded queueing networks1 started to appear less than a els show that, in simple configurations, only overload generates decade ago [9], [10], when some classes of queueing networks instability. However, some results showing that instability can happen also in underloaded queueing networks appeared in the were identified, for which underload does not automatically recent literature. Underload instabilities can be produced by guarantee stability, i.e., for which the backlog at some queue complex scheduling algorithms, that bear significant resemblance in the network can indefinitely grow also when such queue is to the Quality of Service (QoS) schedulers considered today for not overloaded. It is important to observe that these underload packet networks. In this paper, we study with fluid models and instabilities are often produced either by customer routes that with adversarial queueing theory possible underload instabilities due to strict-priority schedulers and to Generalized Processor visit several times the same queues, or by variations of the Sharing (GPS) schedulers. customer service times at the different queues, or by complex scheduling algorithms. The first hints to the possible connections between underload
    [Show full text]
  • Scalable Multi-Module Packet Switches with Quality of Service
    Scalable Multi-module Packet Switches with Quality of Service Santosh Krishnan Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the Graduate School of Arts and Sciences COLUMBIA UNIVERSITY 2006 c 2006 Santosh Krishnan All Rights Reserved ABSTRACT Scalable Multi-module Packet Switches with Quality of Service Santosh Krishnan The rapid growth in packet-based network traffic has resulted in a growing demand for network switches that can scale in capacity with increasing interface transmission rates and higher port counts. Furthermore, the continuing migration of legacy circuit-switched services to a shared IP/MPLS packet-based network requires such network switches to provide an adequate Quality of Service (QoS) in terms of traffic prioritization, as well as bandwidth and delay guarantees. While technology advances, such as the usage of faster silicon and optical switching components, pro- vide one dimension to address this demand, architectural improvements provide the other. This dissertation addresses the latter topic. Specifically, we address the subject of constructing and analyzing high-capacity QoS-capable packet switches using multiple lower-capacity modules. Switches with the output-queueing (OQ) discipline, in theory, provide the best perfor- mance in terms of throughput as well as QoS, but do not scale in capacity with increasing rates and port counts. Input-queued (IQ) switches, on the other hand, scale better but require complex arbitration procedures, sometimes impractical, to achieve the same level of performance. We leverage the state-of-the-art in OQ and IQ switching systems and establish a new taxonomy for a class of three-stage packet switches, which we call Buffered Clos Switches.
    [Show full text]
  • Improving the Scalability of High Performance Computer Systems
    IMPROVING THE SCALABILITY OF HIGH PERFORMANCE COMPUTER SYSTEMS Inauguraldissertation zur Erlangung des akademischen Grades eines Doktors der Naturwissenschaften der Universität Mannheim vorgelegt von Heiner Hannes Litz aus Heidelberg (Diplom-Informatiker der Technischen Informatik) Mannheim, 2010 Dekan: Professor Dr. W. Effelsberg, Universität Mannheim Referent: Professor Dr. U. Brüning, Universität Heidelberg Korreferent: Professor Dr. R. Männer, Universität Heidelberg Tag der mündlichen Prüfung: 17. März 2011 Für Regine Abstract Improving the performance of future computing systems will be based upon the ability of increasing the scalability of current technology. New paths need to be explored, as operating principles that were applied up to now are becoming irrelevant for upcoming computer architectures. It appears that scaling the number of cores, processors and nodes within an system represents the only feasible alternative to achieve Exascale performance. To accomplish this goal, we propose three novel techniques addressing different layers of computer systems. The Tightly Coupled Cluster technique significantly improves the communication for inter node communication within compute clusters. By improving the latency by an order of magnitude over existing solutions the cost of communication is considerably reduced. This enables to exploit fine grain parallelism within applications, thereby, extending the scalability considerably. The mechanism virtually moves the network interconnect into the processor, bypassing the latency of the I/O interface and rendering protocol conversions unnecessary. The technique is implemented entirely through firmware and kernel layer software utilizing off-the-shelf AMD processors. We present a proof-of-concept implementation and real world benchmarks to demonstrate the superior performance of our technique. In particular, our approach achieves a software-to- software communication latency of 240 ns between two remote compute nodes.
    [Show full text]
  • Fair Scheduling in Input-Queued Switches Under Inadmissible Traffic
    Fair Scheduling in Input-Queued Switches under Inadmissible Traffic Neha Kumar, Rong Pan, Devavrat Shah Departments of EE & CS Stanford University {nehak, rong, devavrat}@stanford.edu Abstract—In recent years, several high-throughput low-delay scheduling achieve 100% throughput. These results focus on admissible algorithms have been designed for input-queued (IQ) switches, assuming traffic conditions however, when in practice traffic is frequently admissible traffic. In this paper, we focus on queueing systems that violate admissibility criteria. inadmissible. Herein lies our motivation for this paper, where We show that in a single-server system with multiple queues, the Longest we study scheduling policies for inadmissible traffic conditions Queue First (LQF) policy disallows a fair allocation of service rates 1.We in IQ switches. also describe the duality shared by LQF’s rate allocation and a fair rate allocation. In general, we demonstrate that the rate allocation performed Under admissible traffic, stable scheduling algorithms grant by the Maximum Weight Matching (MWM) scheduling algorithm in over- every flow its desired service, and there does not arise a need loaded IQ switches is unfair. We attribute this to the lack of coordination for fairness in rate allocation 4. Under inadmissible traffic, not between admission control and scheduling, and propose fair scheduling al- all flows can receive desired service. We observe the rate allo- gorithms that minimize delay for non-overloaded queues. Keywords— Congestion Control, Quality of Service and Scheduling, cations performed by LQF and MWM in such a scenario, and Stochastic Processes and Queueing Theory, Switches and Switching, Re- prove that they lack fairness.
    [Show full text]
  • Scheduling Algorithm and Evaluating Performance of a Novel 3D-VOQ Switch
    IJCSNS International Journal of Computer Science and Network Security, VOL.6 No.3A, March 2006 25 Scheduling Algorithm and Evaluating Performance of a Novel 3D-VOQ switch Ding-Jyh Tsaur†, Hsuan-Kuei Cheng††, Chia-Lung Liu††, and Woei Lin †† †Chin Min Institute of Technology, Mioali, 351 Taiwan, R.O.C. ††National Chung-Hsing University, Taichung, 402 Taiwan, R.O.C. Summary IQ switching is head-of-line (HOL) blocking, which can This paper studies scheduling algorithms and evaluates the severely affect the throughput. If each input maintains a performance of high-speed switching systems. A novel single FIFO, then HOL blocking can reduce the architecture for three-dimensional Virtual Output Queue (3D- throughput to only 58.6% [10]. Restated, HOL blocking VOQ) switches is proposed with a suitable scheduling algorithm can be eliminated entirely using a method known as virtual to improve the competitive transfer of service. This 3D-VOQ output queueing in which each input maintains a separate switch, which exactly emulates an output-queued switch with a broad class of service scheduling algorithms, requires no speedup, queue for each output. The throughput of an IQ switch can independently of its incoming traffic pattern and switch size. be increased up to 100% for independent arrivals [9]. First, an N×N 3D-VOQ switch is proposed. In this contention- Input buffered switches with VOQ can achieve 100% free architecture, the head-of-line problems are eliminated using throughput [9,11], thus specifying the relationship of a few virtual output queues (VOQ) from input ports and the proper scheduling with high speed.
    [Show full text]
  • On Scheduling Input Queued Cell Switches
    New Jersey Institute of Technology Digital Commons @ NJIT Dissertations Electronic Theses and Dissertations Spring 5-31-1999 On scheduling input queued cell switches Shizhao Li New Jersey Institute of Technology Follow this and additional works at: https://digitalcommons.njit.edu/dissertations Part of the Electrical and Electronics Commons Recommended Citation Li, Shizhao, "On scheduling input queued cell switches" (1999). Dissertations. 986. https://digitalcommons.njit.edu/dissertations/986 This Dissertation is brought to you for free and open access by the Electronic Theses and Dissertations at Digital Commons @ NJIT. It has been accepted for inclusion in Dissertations by an authorized administrator of Digital Commons @ NJIT. For more information, please contact [email protected]. Copyright Warning & Restrictions The copyright law of the United States (Title 17, United States Code) governs the making of photocopies or other reproductions of copyrighted material. Under certain conditions specified in the law, libraries and archives are authorized to furnish a photocopy or other reproduction. One of these specified conditions is that the photocopy or reproduction is not to be “used for any purpose other than private study, scholarship, or research.” If a, user makes a request for, or later uses, a photocopy or reproduction for purposes in excess of “fair use” that user may be liable for copyright infringement, This institution reserves the right to refuse to accept a copying order if, in its judgment, fulfillment of the order would
    [Show full text]
  • Switch Kenji Yoshigoe University of South Florida
    University of South Florida Scholar Commons Graduate Theses and Dissertations Graduate School 8-9-2004 Design and Evaluation of the Combined Input and Crossbar Queued (CICQ) Switch Kenji Yoshigoe University of South Florida Follow this and additional works at: https://scholarcommons.usf.edu/etd Part of the American Studies Commons Scholar Commons Citation Yoshigoe, Kenji, "Design and Evaluation of the Combined Input and Crossbar Queued (CICQ) Switch" (2004). Graduate Theses and Dissertations. https://scholarcommons.usf.edu/etd/1313 This Dissertation is brought to you for free and open access by the Graduate School at Scholar Commons. It has been accepted for inclusion in Graduate Theses and Dissertations by an authorized administrator of Scholar Commons. For more information, please contact [email protected]. Design and Evaluation of the Combined Input and Crossbar Queued (CICQ) Switch by Kenji Yoshigoe A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Computer Science and Engineering Department of Computer Science and Engineering College of Engineering University of South Florida Major Professor: Kenneth J. Christensen, Ph.D. Tapas K. Das, Ph.D. Miguel A. Labrador, Ph.D. Rafael A. Perez, Ph.D. Stephen W. Suen, Ph.D. Date of Approval: August 9, 2004 Keywords: Performance Evaluation, Packet switches, Variable-length packets, Stability, Scalability © Copyright 2004, Kenji Yoshigoe Acknowledgements I would like to express my gratitude to my advisor Dr. Kenneth J. Christensen for providing me tremendous opportunities and support. He has opened the door for me to pursue an academic career, and has mentored me to a great extent.
    [Show full text]
  • Quantifiable Service Differentiation for Packet Networks
    Quantifiable Service Differentiation for Packet Networks A Dissertation Presented to the Faculty of the School of Engineering and Applied Science University of Virginia In Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy Computer Science by Nicolas Christin August 2003 c Copyright 2003 Nicolas Christin All rights reserved Approvals This dissertation is submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy Computer Science Nicolas Christin Approved: Jorg¨ Liebeherr (Advisor) John A. Stankovic (Chair) Tarek F. Abdelzaher Stephen D. Patek (Minor Representative) Victor Firoiu Alfred C. Weaver Accepted by the School of Engineering and Applied Science: Richard W. Miksad (Dean) August 2003 Abstract In this dissertation, we present a novel service architecture for the Internet, which reconciles ap- plication demand for strong service guarantees with the need for low computational overhead in network routers. The main contribution of this dissertation is the definition and realization of a new service, called Quantitative Assured Forwarding, which can offer absolute and relative differentia- tion of loss, service rates, and packet delays to classes of traffic. We devise and analyze mechanisms that implement the proposed service, and demonstrate the effectiveness of the approach through analysis, simulation and measurement experiments in a testbed network. To enable the new service, we introduce a set of new traffic control algorithms for network routers. The main mechanism proposed in this dissertation uses a novel technique that performs active buffer management (through dropping of traffic) and rate allocation (for scheduling) in a single step. This is different from prior work which views dropping and scheduling as orthogonal tasks.
    [Show full text]
  • Fabric-On-A-Chip: Toward Consolidating Packet Switching Functions on Silicon
    University of Tennessee, Knoxville TRACE: Tennessee Research and Creative Exchange Doctoral Dissertations Graduate School 12-2007 Fabric-on-a-Chip: Toward Consolidating Packet Switching Functions on Silicon William B. Matthews University of Tennessee - Knoxville Follow this and additional works at: https://trace.tennessee.edu/utk_graddiss Part of the Computer Engineering Commons Recommended Citation Matthews, William B., "Fabric-on-a-Chip: Toward Consolidating Packet Switching Functions on Silicon. " PhD diss., University of Tennessee, 2007. https://trace.tennessee.edu/utk_graddiss/239 This Dissertation is brought to you for free and open access by the Graduate School at TRACE: Tennessee Research and Creative Exchange. It has been accepted for inclusion in Doctoral Dissertations by an authorized administrator of TRACE: Tennessee Research and Creative Exchange. For more information, please contact [email protected]. To the Graduate Council: I am submitting herewith a dissertation written by William B. Matthews entitled "Fabric-on-a- Chip: Toward Consolidating Packet Switching Functions on Silicon." I have examined the final electronic copy of this dissertation for form and content and recommend that it be accepted in partial fulfillment of the equirr ements for the degree of Doctor of Philosophy, with a major in Computer Engineering. Itamar Elhanany, Major Professor We have read this dissertation and recommend its acceptance: Gregory Peterson, Donald W. Bouldin, Bradley Vander Zanden Accepted for the Council: Carolyn R. Hodges Vice Provost and Dean of the Graduate School (Original signatures are on file with official studentecor r ds.) To the Graduate Council: I am submitting herewith a dissertation written by William B. Matthews entitled "Fabric- on-a-Chip: Toward Consolidating Packet Switching Functions on Silicon".
    [Show full text]
  • Data Path Processing in Fast Programmable Routers
    Data Path Processing in Fast Programmable Routers Pradipta De Computer Science Department Stony Brook University Stony Brook, NY 11794-4400 [email protected] Abstract Internet is growing at a fast pace. The link speeds are surging toward 40 Gbps with the emergence of faster link technologies. New applications are coming up which require intelligent processing at the intermediate routers. Switches and routers are becoming the bottlenecks in fast communication. On one hand faster links deliver more packets every second and on the other hand intelligent processing consumes more CPU cycles at the router. The conflicting goals of providing faster but computationally expensive processing call for new approaches in designing routers. This survey takes a look at the core functionalities, like packet classification, buffer memory manage- ment, switch scheduling and output link scheduling performed by a router in its data path processing and discusses the algorithms that aim to reduce the performance bound for these operations. An important requirement for the routers is to provide Quality of Service guarantees. We propose an algorithm to guarantee QoS in Input Queued Routers. The hardware solution to speed up router operation was Application Specific Integrated Circuits (ASICs). But the inherent inflexibility of the method is a demerit as network standards and application requirements are constantly evolving, which seek a faster turnaround time to keep up with the changes. The promise of Network Processors (NP) is the flexibility of general- purpose processors together with the speed of ASICs. We will study the architectural choices for the design of Network Processors and focus on some of the commercially available NPs.
    [Show full text]
  • Msc THESIS Design of a High-Performance Buffered Crossbar Switch Fabric Using Network on Chip
    2008 Computer Engineering Mekelweg 4, 2628 CD Delft The Netherlands http://ce.et.tudelft.nl/ MSc THESIS Design of a High-Performance Buffered Crossbar Switch Fabric Using Network on Chip Iria Varela Sen´ın Abstract High-performance routers constitute the basic building blocks of the Internet. The wide majority of today’s high-performance routers are designed using a crossbar fabric switch as interconnect topology. The buffered crossbar (CICQ) switching architecture is known to be the best crossbar-based architecture for routers design. However, CICQs require expensive on-chip buffers whose cost grows quadratically with CE-MS-2008-19 the router port count. Additionally, they use long wires to connect router inputs to outputs, resulting in non-negligible delays. In this thesis, we propose a novel design for the CICQ switch architecture. We design the whole buffered crossbar fabric as a Network on Chip (NoC). We propose two architectural variants. The first is named the Unidirectional NoC (UDN), where the crossbar core is built using a NoC with input and output ports placed on two opposite sides of the fabric chip. Because of the chip pins layout, we improved the UDN architecture using a Multidirectional NoC (MDN) architecture, by placing the inputs and output around the perimeter (four sides) of the NoC-based crossbar fabric. Both proposed architectures have been analyzed with appropriate routing algorithms for both unicast and multicast traffic conditions. Our results show that the proposed NoC based crossbar switching design outperforms the conventional CICQ architecture. Our designs offers several advantages when compared to traditional CICQ design: 1)Speedup, because short wires allow reliable high-speed signalling, and simple local arbitration per on- chip router.
    [Show full text]
  • High Performance Queueing and Scheduling in Support of Multicasting in Input-Queued Switches
    Old Dominion University ODU Digital Commons Electrical & Computer Engineering Theses & Dissertations Electrical & Computer Engineering Summer 2006 High Performance Queueing and Scheduling in Support of Multicasting in Input-Queued Switches Weiying Zhu Old Dominion University Follow this and additional works at: https://digitalcommons.odu.edu/ece_etds Part of the Computer Engineering Commons, and the Electrical and Computer Engineering Commons Recommended Citation Zhu, Weiying. "High Performance Queueing and Scheduling in Support of Multicasting in Input-Queued Switches" (2006). Doctor of Philosophy (PhD), Dissertation, Electrical & Computer Engineering, Old Dominion University, DOI: 10.25777/3hyz-5p33 https://digitalcommons.odu.edu/ece_etds/148 This Dissertation is brought to you for free and open access by the Electrical & Computer Engineering at ODU Digital Commons. It has been accepted for inclusion in Electrical & Computer Engineering Theses & Dissertations by an authorized administrator of ODU Digital Commons. For more information, please contact [email protected]. HIGH PERFORMANCE QUEUEING AND SCHEDULING IN SUPPORT OF MULTICASTING IN INPUT-QUEUED SWITCHES by Weiying Zhu B.S. July 1996, Xi’an Jiaotong University M.S. June 1999, Huazhong University of Science and Technology A Dissertation Submitted to the Faculty of Old Dominion University in Partial Fulfillment of the Requirement for the Degree of DOCTOR OF PHILOSOPHY ELECTRICAL AND COMPUTER ENGINEERING OLD DOMINION UNIVERSITY August 2006 Approved by: Min Song (Director) James FL Leathrum, Jr. (Member) Hussein Abdel-Wahab (Member) Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. ABSTRACT HIGH PERFORMANCE QUEUEING AND SCHEDULING IN SUPPORT OF MULTICASTING IN INPUT-QUEUED SWITCHES Weiying Zhu Old Dominion University, 2006 Director: Dr.
    [Show full text]