(19) United States
(12) Patent Application Publication (10) Pub. No.: US 2011/0142067 A1
JEHL et al. (43) Pub. Date: Jun. 16, 2011

(54) DYNAMIC LINK CREDIT SHARING IN QPI

(76) Inventors: Timothy J. JEHL, Gilbert, AZ (US); Pradeepsunder Ganesh, Chandler, AZ (US); Aimee Wood, Tigard, OR (US); Robert Safranek, Portland, OR (US); John A. Miller, Portland, OR (US); Selim Bilgin, Hillsboro, OR (US); Osama Neiroukh, Jerusalem (IL)

(21) Appl. No.: 12/639,556

(22) Filed: Dec. 16, 2009

Publication Classification

(51) Int. Cl. H04J 3/02 (2006.01)
(52) U.S. Cl. ........ 370/462

(57) ABSTRACT

A method and system for dynamic credit sharing in a quick path interconnect link. The method including dividing incoming credit into a first credit pool and a second credit pool; and allocating the first credit pool for a first data traffic queue and allocating the second credit pool for a second data traffic queue in a manner so as to preferentially transmit the first data traffic queue or the second data traffic queue through a link.

[Drawing sheets 1 and 2 of 2: FIG. 1 (transmitter side and receiver side of a link); FIG. 2 (device with inbound link 0 and outbound link 1); FIG. 3 (incoming credit divided by S/W controlled bias registers and credit sharing logic into local traffic credits 26A and RTTH traffic credits 26B, feeding local traffic queue 28A and RTTH traffic queue 28B)]


DYNAMIC LINK CREDIT SHARING IN QPI

BACKGROUND OF THE INVENTION

[0001] 1. Field of the Invention

[0002] The present invention pertains to data management, and in particular to a dynamic link credit sharing method and system in quick path interconnect.

[0003] 2. Discussion of Related Art

[0004] The QuickPath Interconnect (QPI) protocol is a credit-based protocol. In its simplest form, on a single-processor architecture, a single QPI link is used to connect the processor to the Input-Output (IO) hub. The IO hub can in turn be connected to peripheral devices such as graphics cards, etc. The IO hub can further communicate with an Input-Output Controller Hub (e.g., Intel's ICH10) for connecting and controlling peripheral devices.

[0005] For example, QPI can be used to connect an Intel Core i7 processor (a 64-bit x86-64 processor) to an Intel X58 IO hub. In more complex instances of the architecture, separate QPI link pairs connect one or more processors and one or more IO hubs (or routing hubs) in a network on the motherboard, allowing all of the components to access other components via the network. As with HyperTransport (a bidirectional serial/parallel high-bandwidth point-to-point link), the QuickPath Interconnect (QPI) architecture allows for memory controller integration, and enables a non-uniform memory architecture (NUMA).

BRIEF SUMMARY OF THE INVENTION

[0006] An aspect of the present invention is to provide a method including dividing incoming credit into a first credit pool and a second credit pool; and allocating the first credit pool for a first data traffic queue and allocating the second credit pool for a second data traffic queue in a manner so as to preferentially transmit the first data traffic queue or the second data traffic queue through a link.

[0007] Another aspect of the present invention is to provide a system including a link having a transmitter side and a receiver side; and a controlled bias register configured to divide incoming credit into a first credit pool and a second credit pool. The first credit pool is allocated for a first data traffic queue and the second credit pool is allocated for a second data traffic queue such that the transmitter side preferentially transmits the first data traffic queue or the second data traffic queue through the link.

[0008] Although the various steps of the method are described in the above paragraphs as occurring in a certain order, the present application is not bound by the order in which the various steps occur. In fact, in alternative embodiments, the various steps can be executed in an order different from the order described above or otherwise herein.

[0009] These and other objects, features, and characteristics of the present invention, as well as the methods of operation and functions of the related elements of structure and the combination of parts and economies of manufacture, will become more apparent upon consideration of the following description and the appended claims with reference to the accompanying drawings, all of which form a part of this specification, wherein like reference numerals designate corresponding parts in the various figures. In one embodiment of the invention, the structural components illustrated herein are drawn to scale. It is to be expressly understood, however, that the drawings are for the purpose of illustration and description only and are not intended as a definition of the limits of the invention. As used in the specification and in the claims, the singular forms of "a", "an", and "the" include plural referents unless the context clearly dictates otherwise.

BRIEF DESCRIPTION OF THE DRAWINGS

[0010] In the accompanying drawings:

[0011] FIG. 1 is a schematic diagram showing a transmitter side and a receiver side of a link, according to an embodiment of the present invention;

[0012] FIG. 2 is a schematic diagram depicting the local and route-through traffic queues to and from a device, according to an embodiment of the present invention; and

[0013] FIG. 3 is a schematic diagram depicting an implementation of a credit sharing mechanism between a local data traffic queue and a route-through data traffic queue at the transmitter side of the link shown in FIG. 1, according to an embodiment of the present invention.

DETAILED DESCRIPTION OF EMBODIMENTS OF THE INVENTION

[0014] FIG. 1 is a schematic diagram showing a transmitter side and a receiver side of a link, according to an embodiment of the present invention. The link 10 has a transmitter side (TS) 12 on one side and a receiver side (RS) 14 on the opposite side. For example, the link 10 can use the QuickPath Interconnect protocol to connect the transmitter side 12 (e.g., an Intel Core i7 processor) and a receiver side 14 (e.g., an Intel X58 IO hub or another Intel Core i7 processor). The transmitter side (TS) 12 of link 10 must "know" in advance that adequate space is available on the receiver side (RS) 14 of the link 10 before the transmitter side 12 can start a given transaction with the receiver side 14. To achieve a seamless and substantially error-free transmission between the transmitter side 12 and the receiver side 14 of the link 10, the receiver side 14 of the link 10 "informs" the transmitter side 12 of the availability of "credit." In other words, the receiver side 14 advertises credits to the transmitter side 12. In order to inform the transmitter side 12 of the availability of credit on the receiver side 14, in one embodiment, a communication channel or link 16, independent from link 10, is established between the receiver side 14 and the transmitter side 12 of the link 10. The receiver side 14 can then communicate with the transmitter side 12 via communication path or link 16 to inform the transmitter side 12 of the availability of credit on the receiver side 14. In this way, the transmitter side 12 "knows" how much room or credit, in terms of data size, is available on one or more channels on the receiver side 14. The term data size is used herein to mean, in general, either flits (80-bit data portions) or data packets for individual known transmission packet types. Although, in this embodiment, the link 16 is depicted as being independent from link 10, as can be appreciated, the link 16 can be a sideband of the link 10 used to inform the transmitter side 12 of the availability of credit at the receiver side 14. It is noted that the components 12 and 14 are respectively referred to as transmitter side 12 and receiver side 14 when referring to transmitting data through link 10 from component 12 to component 14. As can be appreciated, the components 12 and 14 can act, respectively, as "receiver side" 12 and "transmitter side" 14 when data is sent from the component 14 to component 12, for example, when sending information through link 16.

[0015] For example, in a device, such as for example a server part optimized for embedded systems, the transmitter side 12 within the device services two request queues, one of which is a local data traffic queue and the other is a route-through data traffic queue. Local data traffic is data traffic that is generated within the device, such as, for example, data generated by a processor or processors within the device. Route-through data traffic is data traffic that is generated externally to the device and is simply passing through the device, as explained further in detail in the following paragraphs.

[0016] FIG. 2 is a schematic diagram depicting the local and route-through traffic queues to and from a device 20, according to an embodiment of the present invention. As depicted in FIG. 2, device 20 receives data inbound on link 0 and transmits data outbound on link 1. Therefore, from the point of view of outbound link 1 from the device 20, local data traffic is generated within the device 20 by the one or more processors on the device 20, for example processors P1 and P2 (or possibly returns of memory within the device 20 requested through link 1). Route-through data traffic is not generated within the device 20, but corresponds to incoming data packets through link 0. The route-through data traffic is not destined for the device 20, but instead is data traffic incoming through link 0 being routed through to go out link 1.

[0017] Due to architectural limitations, the local data traffic queue originating from the device 20 and the route-through data traffic queue routed through the device 20 must be informed that a data transaction between the device 20 and other devices (e.g., peripheral devices or an IO hub) can be completed prior to pulling the data transaction from the queue. In other words, the local data traffic queue originating from the transmitter side 12 within the device 20 and the route-through data traffic queue routed through the transmitter side 12 within the device 20 should be informed that a data transaction between the transmitter side 12 within the device 20 and the receiver side 14 within an IO hub, for example, can be completed prior to pulling the transaction from the local data traffic queue or the route-through data traffic queue. Prior to sending "data," the transmitter side 12 should have credits that guarantee that there is space at the receiver side 14 for receiving the data (e.g., storage space).

[0018] Credit consumption is managed at the transmitter side 12. Therefore, credits sent by the receiver side 14 of the link 10 (corresponding to link 1 in FIG. 2) to the transmitter side 12 of the link 10 (the transmitter side 12 residing within the device 20) via link 16 should be divided appropriately by the transmitter side 12 into two separate credit pools (a first credit pool and a second credit pool). For example, the first credit pool can be allocated to the local data traffic and the second credit pool can be allocated to the route-through traffic.

[0019] By judiciously dividing, at the transmitter side 12, the credit available at the receiver side 14 and advertised by the receiver side 14 to the transmitter side 12 into a first credit pool allocated to the local data traffic and a second credit pool allocated to the route-through traffic, for example, performance of data transmission through link 10 (corresponding to link 1) can be improved.

[0020] FIG. 3 is a schematic diagram depicting an implementation of a credit sharing or division mechanism between the local data traffic queue and the route-through data traffic queue at the transmitter side 12 of link 10, according to an embodiment of the present invention. As shown in FIG. 3, the transmitter side 12 of link 10 (corresponding to link 1 in FIG. 2) includes a software (S/W) controlled bias register or registers 22, credit sharing or division logic 24, and a data traffic management engine 26. The data traffic management engine 26 includes a local data traffic credit repository 26A, a route-through (RTTH) data traffic credit repository 26B, and a data multiplexer (MUX) 26C. Local data traffic queue 28A originating from the transmitter side 12 within the device 20 and route-through (RTTH) data traffic queue 28B routed through the transmitter side 12 within the device 20 are directed towards the data traffic management engine 26.

[0021] Specifically, local data traffic queue 28A is routed via the local traffic credit repository 26A and route-through (RTTH) data traffic queue 28B is routed via the route-through (RTTH) traffic credit repository 26B. The respective amounts of local data traffic 28A and RTTH data traffic 28B that pass through the data traffic management engine 26 are determined by the respective local traffic credit repository 26A and RTTH traffic credit repository 26B. These repositories 26A and 26B, respectively, store the local data traffic credits and RTTH traffic credits which are communicated by the receiver side 14 (shown in FIG. 1) of the link 10 to the transmitter side 12. The data multiplexer 26C multiplexes the local data traffic 28A and the RTTH data traffic, and the resulting multiplexed data is transmitted through outbound link 10.

[0022] The local traffic credit repository 26A and the RTTH traffic repository 26B are controlled by the credit sharing or division logic 24. The credit sharing or division logic 24 receives inputs from the bias register(s) 22 and from the receiver side 14, which communicates the available credit (as incoming credit) to the transmitter side 12 via communication path or link 16. The S/W controlled bias register(s) 22 determine how much credit is available in terms of local traffic credits for the local data traffic queue 28A and RTTH traffic credits for the RTTH data traffic queue 28B.

[0023] The bias register(s) 22 input bias values to the credit sharing or division logic 24 so that the credit sharing or division logic 24 divides or controls the available credit incoming through link 16 appropriately into an amount or pool of local traffic credit stored in the local traffic credit repository 26A and into an amount or pool of RTTH traffic credit stored in the RTTH traffic credit repository 26B. By dividing the incoming credit into a local traffic credit and an RTTH traffic credit and allocating more credit to the local data traffic queue 28A or to the RTTH data traffic queue 28B, the local data traffic queue 28A or the RTTH data traffic queue 28B is preferentially transmitted or is given preferential bandwidth through link 10. For instance, if the RTTH data traffic queue is biased with 16 credits, and both queues (i.e., the local data traffic queue and the RTTH data traffic queue) are initially empty, the system interprets this as if the route-through (RTTH) data traffic queue already possesses 16 credits, and the system would not "think" (i.e., conclude) that the two data traffic queues are "equal" until the local data traffic queue reaches 16 credits as well. As a result, the system can transmit preferentially the local data traffic queue 28A. Although the incoming credit is described herein as being divided into two credit pools, it must be appreciated that the available credit or incoming credit can be divided into two, three or more credit pools. Each of the two, three or more credit pools can be allocated to a specific queue, and the queue that is allocated more credit is preferentially transmitted.
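The bias-driven division of incoming credit described in paragraph [0023] can be illustrated with a small sketch. This is a hedged model, not the patent's logic 24 as implemented in hardware: the function name and the one-credit-at-a-time steering loop are assumptions made for clarity.

```python
# Hypothetical model of the credit sharing/division logic 24: each
# returning credit is steered into the local pool (repository 26A) or
# the route-through pool (repository 26B).  A software-programmed bias
# makes the RTTH pool "look" ahead by `rtth_bias` credits, so the local
# pool is topped up first -- e.g. biasing RTTH by 16 favors local
# traffic until the local pool also reaches 16 credits.
def divide_incoming_credit(returned, local_pool, rtth_pool, rtth_bias=0):
    """Steer `returned` incoming credits, one at a time, into two pools.

    The RTTH pool is treated as already holding `rtth_bias` credits, so
    the logic fills the local pool until the effective levels match,
    then alternates between the pools to keep them matched.
    """
    for _ in range(returned):
        if local_pool < rtth_pool + rtth_bias:
            local_pool += 1          # local pool is behind its target level
        else:
            rtth_pool += 1           # RTTH pool is behind (or levels match)
    return local_pool, rtth_pool

# With a 16-credit RTTH bias and both pools empty, the first 16
# returning credits all go to the local pool:
assert divide_incoming_credit(16, 0, 0, rtth_bias=16) == (16, 0)
# With no bias, incoming credits are split evenly between the pools:
assert divide_incoming_credit(10, 0, 0) == (5, 5)
```

Under this model, a nonzero bias maintains a level difference between the two pools equal to the bias, which is the behavior the following paragraphs describe.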

[0024] If no bias value is applied in the S/W controlled register(s) 22, the system attempts to fill each queue (i.e., the local data traffic queue and the RTTH data traffic queue) evenly or equally. Thus, if any queue (i.e., any one of the local data traffic queue or the RTTH data traffic queue) begins using credits, the used credits are returned to the same queue to attempt once again to match the levels between the two queues.

[0025] If a bias value is applied in the S/W controlled register(s) 22, the system instead attempts to maintain a difference in levels of the two queues equal to the bias. Hence, in an environment with few credits, one queue receives the majority of credits. As a result, the performance of the queue that receives the majority of credits is favored. The overall system performance can be improved by providing an asymmetric or unbalanced credit configuration when using the bias.

[0026] When the transmitter side 12 transmits packets to the receiver side 14 through the link 10, the transmitter side 12 consumes credits. For example, when the transmitter side 12 initially has 10 credits and the transmitter side uses 3 credits to transmit data packets to the receiver side 14, the remaining useable credit for the transmitter side 12 to use to transmit data packets is 7 credits. As the transmitted packets get processed on the receiver side 14, the receiver side 14 frees up space to accept new packets. The availability of freed-up space is communicated by the receiver side 14 to the transmitter side 12 via link 16.

[0027] In an embodiment, the QPI protocol uses two different types of credits. These two types of credits are a direct indication of available buffers on the receiver side 14. The credits used by the QPI protocol can guarantee that the receiver side 14 has buffers to store or buffer the packet transmitted by the transmitter side 12. One type of credit is the VN0 credit and another type is the VNA credit. The VN0 credits are transaction based and are allocated to individual packet classes. There are six such classes. In an embodiment, there are two credits for each of the six classes, with one being allocated for the route-through traffic and one for the local traffic. The VNA credits (miscellaneous credits) are allocated to any virtual channel but depend on the size of the transaction's packet. For VN0 credits, the RTTH data traffic queue will only support one credit for any virtual channel, if that channel has a credit already allocated to it. Although the QPI protocol is described herein as using two types of credits, VN0 and VNA, it must be appreciated that, in other embodiments, the QPI protocol can use three types of credits: VN0, VN1 and VNA.

[0028] Management of VNA credit consumption is implemented as described in the above paragraphs. For VNA credits, the transmitter side 12 monitors the size of each queue and, as credits come back from the receiver side 14 in quanta of 2/8/16 (equivalent to the approximately 80-bit flit), the credits are returned to the queue which has the lesser number of credits. In an embodiment, the basic unit of data is the 80-bit flit. This is generally an encoded 64 bits. The variety of packet types available will typically use from 1 to 11 of these flits. For example, in an embodiment, the inbound storage buffer on the device 20 will hold up to 128 of these flits. VNA credits are not packet specific, and are therefore encoded in flits. If there is a 3-flit packet to send, there must be at least 3 VNA credits available to do so, etc. VN0 credits, on the other hand, are based solely on packets for particular message classes. These packets can be of varying size, but the VN0 credit is allocated assuming the largest possible packet size for this message class. Therefore, a 3-flit packet would only take up one VN0 credit. VNA credits are far more versatile. As an example, in a high-activity system, VNA credits could be reduced to where they are being consumed so fast that a message class carrying an 11-flit message would never have enough VNA credits to transmit. This is because message classes don't get priority simply because they've been sitting for a longer period of time. However, because VN0 credits are message class specific, when a VN0 credit is available for that message class, the size of the message is irrelevant and the packet can be transmitted.

[0029] In the case of VN0 packets, because VN0 credits are allocated to each message class, they can prevent lockups. For example, if the local data traffic queue 28A is empty, and the local data traffic queue 28A has available credit (any available credit different from 0), the returning credit from the receiver side 14 is returned to the RTTH traffic credit repository 26B to be used by the route-through data traffic queue 28B. By doing so, the possibility of a deadlock condition can be prevented. As can be appreciated, a deadlock condition is a condition in which, for example, in order to do A, B must be done first, but in order to do B, A must happen first. As a result, nothing gets done. By returning the available incoming credit from the receiver side 14 to the RTTH traffic credit repository 26B instead of the local traffic credit repository 26A, the returned credits can be used by the RTTH data traffic queue 28B. If the available credits were to be returned to the local traffic credit repository 26A and there is no local traffic, the returned credit will not be used because there is no local traffic. As a result, the RTTH traffic queue 28B, which may need credit, will be "starved" and the RTTH traffic flow will be blocked, creating a deadlock situation.

[0030] In the case of VN0 credits, if there are no credits available for either queue, i.e., no credit in either the local traffic credit repository 26A for use by the local data traffic queue 28A and no credit in the RTTH traffic credit repository 26B for use by the RTTH data traffic queue 28B, the credits allocated are returned via link 16, preventing possible livelock scenarios. As can be appreciated, a livelock scenario is a scenario in which a particular channel or queue gets starved for lack of resources. For instance, if one assumes that both credits (i.e., the local traffic credit and the RTTH traffic credit) get used, and one of the credits returns from the receiver side 14 via link 16 as incoming credit, both queues (local data traffic queue 28A and RTTH data traffic queue 28B) have something to transmit. Hence, arbitrarily, the credit can be assigned to the local traffic credit repository 26A again to be used by local data traffic queue 28A. The local data traffic queue 28A uses the credit, and when this credit returns via incoming link 16, the credit may be arbitrarily assigned to the local traffic credit repository 26A again. Hence, the local data traffic queue 28A may arbitrarily use the credit again. If this is repeated numerous times, the route-through data traffic queue 28B may not be able to transmit data and may remain inactive for a long time. This situation is a livelock where the RTTH data traffic queue 28B is starved. However, it can be assumed that at some point, there will be an instance where both credits (instead of only one credit) get returned, and eventually the RTTH data traffic queue or path 28B can transmit again.
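The credit-return behavior described above lends itself to a short sketch: a returning credit normally goes to the pool with the lesser credit count, but a pool whose queue has nothing to send is skipped so that an idle queue cannot hoard credits and starve the other. This is a hedged model of the policy, not the disclosed hardware; the function name and arguments are illustrative assumptions.

```python
# Hypothetical model of the credit-return decision: steer one returning
# credit toward the pool with fewer credits (to re-balance the queues),
# but never toward a pool whose queue is idle -- an unused credit parked
# in the local pool could block (deadlock) the route-through queue as
# described in paragraph [0029].
def return_credit(local_credits, rtth_credits,
                  local_queue_busy, rtth_queue_busy):
    """Decide which pool receives one returning credit."""
    if local_queue_busy and not rtth_queue_busy:
        return "local"
    if rtth_queue_busy and not local_queue_busy:
        return "rtth"                # idle local queue: credit must not
                                     # sit unused in the local pool
    # Both queues busy (or both idle): re-balance toward the lesser pool.
    return "local" if local_credits <= rtth_credits else "rtth"

# Local queue empty, RTTH queue waiting: the credit goes route-through,
# avoiding the deadlock scenario.
assert return_credit(5, 0, local_queue_busy=False, rtth_queue_busy=True) == "rtth"
# Both queues busy: the pool with fewer credits is topped up first.
assert return_credit(4, 2, local_queue_busy=True, rtth_queue_busy=True) == "rtth"
```

A bias, as in the earlier division sketch, would simply offset the comparison in the last line rather than change its structure.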

[0031] As can be appreciated from the above paragraphs, the S/W controlled bias register(s) can optimally control or program the sharing of the VNA credits. For example, in one embodiment, software can be implemented to program the bias register based on whether the application running on the embedded processor is local-traffic intensive or route-through-traffic intensive. Hence, the above-described system and method can improve performance with a route-through mechanism for a given application, allowing the biasing of available resources in a way that is optimal for that application. As a result, available QPI bandwidth is used judiciously and not wasted, by dividing the bandwidth (i.e., credit) and allocating more bandwidth (i.e., credit) to the queue that needs more resources for a given application.

[0032] For example, a system using credit division or sharing logic on VNA may display a relatively large QPI bandwidth. In a dual-processor route-through (DPRTTH) enabled system, for example, a heavy local traffic application can be implemented to access the memory (RAM) of the second processor across the link between the first processor and the second processor, with transmitters and receivers on both sides of the link. For example, if a relatively high QPI bandwidth is detected, this may suggest that the local data traffic queue is using almost all the advertised VNA credits. A route-through heavy traffic application can be run to access memory across the link. If the QPI bandwidth being used is high enough, this may suggest that the route-through traffic is using almost all the communicated or advertised VNA credits.

[0033] In one embodiment, in a QPI link using the credit sharing or division logic described herein, approximately all VNA credits are used. Hence, there are fewer VNA credits available than would be necessary to allow maximum theoretical bandwidth from both local and route-through traffic from the transmitter side. This means that, on occasion, traffic is held up on the transmitter side for lack of credits to send across the link (e.g., waiting for "returning credit"). This could happen to either or both paths. By tuning the bias to the application, it is possible, for instance, to prevent one path from ever getting backed up due to credit starvation, while making this a more likely possibility on the other path. For instance, if it is known in advance that there will be plenty of local traffic and relatively little route-through traffic, it can be possible to bias against route-through traffic to ensure that local traffic is provided with as much bandwidth as desired.

[0034] Although the various steps of the method are described in the above paragraphs as occurring in a certain order, the present application is not bound by the order in which the various steps occur. In fact, in alternative embodiments, the various steps can be executed in an order different from the order described above.

[0035] Although the invention has been described in detail for the purpose of illustration based on what is currently considered to be the most practical and preferred embodiments, it is to be understood that such detail is solely for that purpose and that the invention is not limited to the disclosed embodiments, but, on the contrary, is intended to cover modifications and equivalent arrangements that are within the spirit and scope of the appended claims. For example, it is to be understood that the present invention contemplates that, to the extent possible, one or more features of any embodiment can be combined with one or more features of any other embodiment.

[0036] Furthermore, since numerous modifications and changes will readily occur to those of skill in the art, it is not desired to limit the invention to the exact construction and operation described herein. Accordingly, all suitable modifications and equivalents should be considered as falling within the spirit and scope of the invention.

What is claimed:

1. A method comprising:
dividing incoming credit into a first credit pool and a second credit pool; and
allocating the first credit pool for a first data traffic queue and allocating the second credit pool for a second data traffic queue in a manner so as to preferentially transmit the first data traffic queue or the second data traffic queue through a link.

2. The method according to claim 1, further comprising:
receiving the first data traffic queue and the second data traffic queue, the first data traffic queue originating from a transmitter side within a device and the second data traffic queue being route-through data passing through the transmitter side within the device.

3. The method according to claim 2, wherein receiving the second data traffic queue comprises receiving the second data traffic queue from another device different from the device including the transmitter side of the link.

4. The method according to claim 2, further comprising:
receiving incoming credit from a receiver side, the incoming credit informing the transmitter side of data space available at the receiver side.

5. The method according to claim 4, wherein receiving incoming credit from the receiver side comprises receiving the incoming credit through another link different from the above-mentioned link.

6. The method according to claim 1, wherein dividing the incoming credit into the first credit pool and the second credit pool comprises dividing unequally the incoming credit into the first credit pool and into the second credit pool.

7. The method according to claim 1, further comprising storing the first credit pool in a first credit repository and storing the second credit pool in a second credit repository.

8. The method according to claim 1, further comprising biasing the second credit pool relative to the first credit pool so as to preferentially transmit the first data traffic queue.

9. The method according to claim 1, wherein the incoming credit comprises VN0 credits and VNA credits.

10. The method according to claim 1, wherein dividing the incoming credit into the first credit pool and the second credit pool comprises dividing the VNA credits in the incoming credit.

11. A system comprising:
a link having a transmitter side and a receiver side; and
a controlled bias register configured to divide incoming credit into a first credit pool and a second credit pool,
wherein the first credit pool is allocated for a first data traffic queue and the second credit pool is allocated for a second data traffic queue such that the transmitter side preferentially transmits the first data traffic queue or the second data traffic queue through the link.

12. The system according to claim 11, wherein the transmitter side of the link is configured to receive the first data traffic queue and the second data traffic queue, the first data traffic queue originating from the transmitter side within a device and the second data traffic queue being route-through data passing through the transmitter side within the device.

13. The system according to claim 11, wherein the transmitter side is further configured to receive the incoming credit through a credit link from the receiver side of the link, the incoming credit informing the transmitter side of data space available at the receiver side.

14. The system according to claim 13, wherein the credit link is distinct from the link.

15. The system according to claim 11, wherein the controlled bias register is configured to divide unequally the incoming credit into the first credit pool and into the second credit pool.

16. The system according to claim 11, further comprising a first credit repository and a second credit repository, the first credit repository configured to store the first credit pool and the second credit repository configured to store the second credit pool.

17. The system according to claim 11, further comprising a credit sharing logic controlled by the bias register.

18. The system according to claim 11, wherein the incoming credit comprises VN0 credit and VNA credit.

19. The system according to claim 18, wherein the controlled bias register is configured to divide the incoming credit into the first credit pool and the second credit pool by dividing the VNA credit in the incoming credit.