Trisha's Ramblings My Goal? Only to Change the World
WEDNESDAY, 22 JUNE 2011

Dissecting the Disruptor: What's so special about a ring buffer?

Recently we open sourced the LMAX Disruptor, the key to what makes our exchange so fast. Why did we open source it? Well, we've realised that conventional wisdom around high performance programming is... a bit wrong. We've come up with a better, faster way to share data between threads, and it would be selfish not to share it with the world. Plus it makes us look dead clever.

On the site you can download a technical article explaining what the Disruptor is and why it's so clever and fast. I even get a writing credit on it, which is gratifying when all I really did is insert commas and re-phrase sentences I didn't understand.

However I find the whole thing a bit much to digest all at once, so I'm going to explain it in smaller pieces, as suits my NADD audience.

First up - the ring buffer. Initially I was under the impression the Disruptor was just the ring buffer. But I've come to realise that while this data structure is at the heart of the pattern, the clever bit about the Disruptor is controlling access to it.

What on earth is a ring buffer?
Well, it does what it says on the tin - it's a ring (it's circular and wraps), and you use it as a buffer to pass stuff from one context (one thread) to another:

[Image: sketch of a ring buffer]

(OK, I drew it in Paint. I'm experimenting with sketch styles and hoping my OCD doesn't kick in and demand perfect circles and straight lines at precise angles).

So basically it's an array with a pointer to the next available slot.

As you keep filling up the buffer (and presumably reading from it too), the sequence keeps incrementing, wrapping around the ring:

[Image: sketch of the sequence wrapping around the ring buffer]

To find the slot in the array that the current sequence points to you use a mod operation:

sequence mod array length = array index

So for the above ring buffer (using Java mod syntax): 12 % 10 = 2. Easy.

Actually it was a total accident that the picture had ten slots. Powers of two work better because computers think in binary.

So what?

If you look at Wikipedia's entry on Circular Buffers, you'll see one major difference to the way we've implemented ours - we don't have a pointer to the end. We only have the next available sequence number. This is deliberate - the original reason we chose a ring buffer was so we could support reliable messaging. We needed a store of the messages the service had sent, so when another service sent a nak (negative acknowledgement) to say they hadn't received some messages, it would be able to resend them.

The ring buffer seems ideal for this.
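A quick aside on the index calculation from earlier: here is a minimal Java sketch of the two ways to map a sequence to a slot. The class and method names (RingIndex, indexByMod, indexByMask) are made up for illustration and are not taken from the Disruptor code. The mod version works for any positive length, like the ten-slot picture. The mask version is one common reading of "powers of two work better": when the length is a power of two, the remainder can be computed with a single bitwise AND.

```java
// Hypothetical helper for illustration only, not part of the Disruptor API.
final class RingIndex {

    // General mapping: works for any positive array length.
    static int indexByMod(long sequence, int length) {
        return (int) (sequence % length);
    }

    // Mask mapping: only valid when length is a power of two,
    // because then (length - 1) is a run of low 1-bits.
    static int indexByMask(long sequence, int length) {
        if (length <= 0 || Integer.bitCount(length) != 1) {
            throw new IllegalArgumentException("length must be a power of two");
        }
        return (int) (sequence & (length - 1));
    }

    public static void main(String[] args) {
        System.out.println(indexByMod(12, 10));  // 2, the example from the post
        System.out.println(indexByMask(12, 8));  // 4, same as 12 % 8
    }
}
```

Both methods assume a non-negative sequence. For a power-of-two length they give the same answer, so the mask is purely a cheaper way of doing the same mod.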
It stores the sequence to show where the end of the buffer is, and if it gets a nak it can replay everything from that point to the current sequence:

[Image: sketch of the ring buffer]

The difference between the ring buffer as we've implemented it, and the queues we had traditionally been using, is that we don't consume the items in the buffer - they stay there until they get over-written. Which is why we don't need the "end" pointer you see in the Wikipedia version. Deciding whether it's OK to wrap or not is managed outside of the data structure itself (this is part of the producer and consumer behaviour - if you can't wait for me to get round to blogging about it, check out the Disruptor site).

And it's so great because...?

So we use this data structure because it gives us some nice behaviour for reliable messaging. It turns out though that it has some other nice characteristics.

Firstly, it's faster than something like a linked list because it's an array, and has a predictable pattern of access. This is nice and CPU-cache-friendly - at the hardware level the entries can be pre-loaded, so the machine is not constantly going back to main memory to load the next item in the ring.

Secondly, it's an array and you can pre-allocate it up front, making the objects effectively immortal. This means the garbage collector has pretty much nothing to do here. Again, unlike a linked list which creates objects for every item added to the list - these then all need to be cleaned up when the item is no longer in the list.

The missing pieces

I haven't talked about how to prevent the ring wrapping, or specifics around how to write stuff to and read things from the ring buffer.
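With those gaps in mind, here is a rough Java sketch of only the pieces covered so far: a pre-allocated array of re-usable entries, a single "next sequence" counter with no end pointer, and replay from a nak'd sequence. Everything in it (ReplayRing, Entry, publish, replayFrom) is a hypothetical name chosen for illustration and is not the Disruptor's actual code or API. It is single-threaded, it happily over-writes old entries, and it uses String payloads for readability even though those are themselves allocated per message.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical single-threaded sketch; not the Disruptor's RingBuffer class.
final class ReplayRing {

    // Mutable slot, allocated once and re-used on every lap of the ring.
    static final class Entry {
        String payload;
    }

    private final Entry[] entries;
    private final int mask;
    private long next = 0;  // next available sequence; there is no "end" pointer

    ReplayRing(int size) {
        if (size <= 0 || Integer.bitCount(size) != 1) {
            throw new IllegalArgumentException("size must be a power of two");
        }
        entries = new Entry[size];
        for (int i = 0; i < size; i++) {
            entries[i] = new Entry();  // pre-allocate every slot up front
        }
        mask = size - 1;
    }

    // Writes into the next slot, over-writing whatever was there before.
    long publish(String payload) {
        long sequence = next++;
        entries[(int) (sequence & mask)].payload = payload;
        return sequence;
    }

    // Replays everything from the nak'd sequence up to the latest one written.
    List<String> replayFrom(long from) {
        if (from < 0 || from > next) {
            throw new IllegalArgumentException("unknown sequence " + from);
        }
        long oldest = Math.max(0, next - entries.length);
        if (from < oldest) {
            throw new IllegalStateException("sequence " + from + " has been over-written");
        }
        List<String> out = new ArrayList<>();
        for (long s = from; s < next; s++) {
            out.add(entries[(int) (s & mask)].payload);
        }
        return out;
    }

    public static void main(String[] args) {
        ReplayRing ring = new ReplayRing(8);
        for (int i = 0; i < 12; i++) {
            ring.publish("msg-" + i);
        }
        System.out.println(ring.replayFrom(9));  // [msg-9, msg-10, msg-11]
    }
}
```

Notice that nothing is ever removed: entries simply sit in their slots until a later lap over-writes them, which is why a nak for a sequence older than one lap has to be rejected here. Deciding when it is safe to wrap is exactly the part this sketch leaves out.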
You'll also notice I've been comparing it to a data structure like a linked list, which I don't think anyone believes is the answer to the world's problems. The interesting part comes when you compare the Disruptor with an implementation like a queue. Queues usually take care of all the stuff like the start and end of the queue, adding and consuming items, and so forth. All the stuff I haven't really touched on with the ring buffer. That's because the ring buffer itself isn't responsible for these things - we've moved these concerns outside of the data structure.

For more details you're just going to have to read the paper or check out the code. Or watch Mike and Martin at QCon San Francisco last year. Or wait for me to have a spare five minutes to get my head around the rest of it.

Posted by Trisha Gee at 16:01
Labels: data structures, disruptor, disruptor-docs, java, lmax
Location: London W11, UK

21 comments:

Flying Frog Consultancy Ltd. 4 July 2011 at 10:11
If you don't consume elements from your ring buffer then you're keeping them reachable and preventing them from being deallocated. This can obviously have an adverse effect on the throughput and latency of the garbage collector. Writing references into different locations in your ring buffer incurs the write barrier, which can also adversely affect throughput and latency. I wonder what the trade-offs are concerning these disadvantages and when they come into play.

Michael Barker 4 July 2011 at 21:36
With regards to use of memory, no real trade-offs are made by the Disruptor. Unlike a queue, you have a choice about how to make use of memory. If the solution is a soft real-time system, reducing GC pauses is paramount. Therefore you can re-use the entries in the ring buffer, e.g. copying byte arrays to and from network I/O buffers in and out of the ring buffer (our most common usage pattern). As the amount of memory used by the system remains static, it reduces the frequency of garbage collection.
It is also possible to implement an Entry that contains a reference to an immutable object. However in that situation it may be necessary for the consumer to null out the message object to reduce the amount of memory that needs to be promoted from Eden. So a little more effort is required from the programmer to build the most appropriate solution. We believe that the flexibility provided justifies this small bit of extra effort.

Considering the write barrier, the primary goal of the Disruptor is to pass messages between threads. We make no trade-offs regarding ordering or consistency, therefore it is necessary to use memory barriers in the appropriate places. We've done our utmost to keep this to a minimum. However, we are many times faster than the popular alternatives as most of them use locks to provide consistency.

Mike.

sauron 26 July 2011 at 08:45
How does this approach compare to the Pool approach and other approaches used here: http://cacm.acm.org/magazines/2011/3/105308-data-structures-in-the-multicore-age/fulltext
Why not use a Pool instead of a queue? Is the LIFO requirement essential?

Trisha 26 July 2011 at 09:34
Unfortunately I can't read that article because I don't have an account at that site. FIFO (not LIFO) is absolutely essential - our exchange depends upon predictable ordering, and if you play the same events into it you will always get the same outcome.