UNIVERSITAT POLITÈCNICA DE VALÈNCIA
Dpto. de Informática de Sistemas y Computadores

Design of Efficient TLB-based Data Classification Mechanisms in Chip Multiprocessors

A dissertation submitted in fulfillment of the requirements for the degree of

Doctor of Philosophy (Computer Science)

Author: ALBERT ESTEVE GARCIA
Advisors: Prof. ANTONIO ROBLES MARTÍNEZ, Prof. MARIA ENGRACIA GÓMEZ REQUENA, Prof. ALBERTO ROS BARDISA

Valencia, July 2017

Acknowledgment

Cuatro años de esfuerzo dan para mucho. Echando la vista atrás, muchas son las personas que me han apoyado, muchos los que me han escuchado. Gracias a todos.

A Ramón y Lola. Sin ellos, sin su apoyo y su amor, no habría llegado hasta aquí. Sin su inspiración no habría siquiera empezado el camino. Ellos plantaron la semilla de la curiosidad, tan necesaria para superar la frustración y las dificultades a las que todo investigador tiene que enfrentarse en su camino.

A Víctor. Hermano y amigo. El espejo en el que me miro. Me has dado fuerza en cada etapa de mi vida. Gracias por todo. Habrá que pensar una buena ruta en bici para celebrarlo.

A Carla. Por su paciencia, por su ayuda, por respaldarme cuando lo he necesitado. Siempre me has animado a seguir adelante y has permanecido a mi lado en cada etapa del doctorado. Más que eso, eres uno de los pilares principales de mi vida.

A Salomé, Eloy y Sergio. Por quererme y aceptarme como uno más de la familia. Realmente yo lo siento del mismo modo. Gracias.

A mis amigos. Samuel, Eduardo, Manolo. Ya sea jugando a algún juego, organizando viajes juntos, o saliendo de fiesta o a tomar algo, siempre habéis estado ahí, siendo partícipes de mis logros y mis frustraciones. Tanto como yo de las vuestras. Y espero que siga siendo así por muchos, muchos años.

Allá por 2007 empezaba mi andadura por la universidad, hace ya diez años. Ésta ha sido mi segunda casa, y le debo mucho. Y tengo mucho que agradecer también a toda la gente que he conocido en esta etapa. Incluyendo esta última fase en el Grupo de Arquitecturas Paralelas.

Por supuesto y ante todo a mis directores, Alberto, Antonio y Maria Engracia. Me habéis guiado e inspirado, desafiado y ayudado en cada etapa de mi doctorado. Vuestra paciencia y esfuerzo han sido claves para la consecución de este trabajo. He aprendido mucho durante estos años. No sólo en materia de arquitectura de computadores, sino también en investigación científica; a amarla, a observar los pequeños detalles, a analizar los datos y entender qué está ocurriendo para así poder seguir avanzando. Habéis cambiado mi vida, profesional y personal, para siempre.

A Núria, Salva y Eduardo. No nos vemos tanto como antes; nuestras vidas, sus obligaciones, nos alejan. Pero aún nos quedarán esas tardes de cervezas para ponernos al día. Siempre.

A mis compañeros de laboratorio. José Vicente, José María, Migue, Vicent, Joan, Fran, Javi, Roberto, Carlos, Santi, y un largo etcétera. A los que se fueron, Knut, Mario, Crispín. Tantos y tantos. A todos os debo algo. Lo que ha unido el BoardGameArena que no lo separen nuestras exitosas carreras en el extranjero, como un tren de mercancías desbocado y sin frenos. Y por supuesto a Ricardo. El verdadero pilar del laboratorio, en lo técnico y en lo personal. Tu increíble paciencia y dedicación te hacen pieza central del trabajo de todos nosotros en el grupo.

A Stefanos Kaxiras y a toda la gente que conocí en Uppsala. Trevor, David, Alexandra, Andra, etc. As tough as it was, it has been one of the most enriching experiences of my life. I cannot thank you enough. Observing first hand how another research group works really changed my

perspective towards research. Especially Kaxiras, your wisdom was inspirational. Thank you all. Already missing the snow.

A.E.

Contents

Acknowledgment...... i

Preface...... vii

List of Acronyms...... viii

List of Figures...... xii

List of Tables...... xvi

Abstract...... xix

Resum...... xx

Resumen...... xxi

1 Introduction 1

1.1 Context and Motivation...... 1

1.2 Objectives...... 3

1.3 Thesis Contributions...... 3

1.4 Thesis Outline...... 6

2 Concepts and Background 7

2.1 Introduction...... 7

2.2 Classification-based Optimizations...... 7

2.3 Data Classification Mechanisms...... 13

2.4 Architecting and Managing TLBs...... 17

3 Simulation Environment 23

3.1 Introduction...... 23

3.2 Simulation Tools...... 24

3.3 Simulated System...... 26

3.4 Metrics...... 27


3.5 Benchmarks...... 28

4 TLB-based Classification Mechanisms 37

4.1 Introduction...... 37

4.2 TLB Miss Resolution through TLB-to-TLB Transfers...... 38

4.3 Snooping TLB-based Private-Shared Classification...... 38

4.4 TLB-based Classification with Distributed Shared TLBs...... 42

4.5 Experimental Results...... 46

4.6 Discussion...... 54

4.7 Conclusions...... 55

5 Token-counting TLB-based Classification Mechanism 57

5.1 Introduction...... 57

5.2 TokenTLB...... 58

5.3 Read-only Data Optimizations and Full-adaptivity...... 63

5.4 Experimental Results...... 64

5.5 Conclusions...... 71

6 Prediction-based Classification Mechanisms 73

6.1 Introduction...... 73

6.2 Usage Predictor (UP)...... 74

6.3 Shared Usage Predictor (SUP)...... 78

6.4 Approaching to the Ideal Scheme...... 81

6.5 Experimental Results...... 83

6.6 Conclusions...... 91

7 Cooperative TLB Page-Usage Prediction Mechanism 93

7.1 Introduction...... 93

7.2 Cooperative Usage Predictor (CUP)...... 95

7.3 Experimental Results...... 100

7.4 Discussion...... 105

7.5 Conclusions...... 105

8 Conclusions 107

8.1 Contributions and Conclusions...... 107

8.2 Scientific Publications...... 111

8.3 Future Work...... 112


Bibliography 113



Preface

This document has been elaborated with the aim of obtaining a PhD degree in Computer Science. This work has been carried out under the advice and guidance of professors Antonio Robles, Alberto Ros, and Maria Engracia Gómez.

This document can be divided into three parts. The first part introduces the state of the art in classification approaches and private-based data optimizations.

The second part presents all the research work and its evaluation. This research activity has been developed within the Parallel Architectures Group (GAP) of the Department of Computer Engineering (DISCA) at the Universitat Politècnica de València.

In the last part, the contributions and conclusions are summarized.


List of Acronyms

TLB Translation Lookaside Buffer.

CS-TLB Complete Subblock TLB.

PS-TLB Partial Subblock TLB.

CMP Chip Multiprocessor.

NUCA Non-Uniform Cache Architecture.

LLC Last-Level Cache.

SC Sequential Consistency.

ROB Reorder Buffers.

TSO Total Store Ordering.

RAWR Read-After-Write Races.

NSRT Not Shared Region Table.

CRH Cached Region Hash.

TC Temporal Coherence.

DRF Data Race Free.

HRF Heterogeneous Race Free.

nDRF Non Data Race Free.

xDRF Extended Data Race Free.

hLRC Heterogeneous Lazy Release Consistency.


PR Private Read-only.

PW Private Written.

SR Shared Read-only.

SW Shared Written.

OS Operating System.

VIPS Valid/Invalid Private/Shared (set of cache coherence states).

MOESI Modified Owned Exclusive Shared Invalid (set of cache coherence states).

MESI Modified Exclusive Shared Invalid (set of cache coherence states).

SWEL Shared Written Exclusivity-Level (set of cache coherence states).

UNITD Unified Instruction/Translation/Data Coherence.

DiDi Dictionary Directory.

FAC First Accessing Core.

TI Thread Identifying.

VM Virtual Memory.

MMU Memory Management Unit.

PTE Page Table Entry.

PPN Physical Page Number.

IPI Inter-Processor Interrupts.

PI/VI Physically/Virtually Indexed.

PT/VT Physically/Virtually Tagged.

ASID Address Space Identifier.

NoC Network On Chip.

SLL Shared Last-Level.

MSHR Miss Status Holding Register.

GEMS General Execution-driven Multiprocessor Simulator.

SLICC Specification Language for Implementing Cache Coherence.

CACTI Cache Access and Cycle Time Information.


IPC Instructions Per Cycle.

PID Process Identifier.


List of Figures

2.1 State transition diagram for OS-based classification status...... 14

2.2 State transition diagram for the SWEL protocol...... 15

2.3 Example of a translation process (page table walk) in the 64...... 17

2.4 Diagrams for different cache addressing options...... 18

2.5 An illustration of (a) conventional TLB (b) CoLT (c) CS-TLB (d) PS-TLB and (e) cTLB. For each approach, the structure of a single entry and a page table with the PTEs that can be exploited is shown...... 20

3.1 Relationship between the simulation tools employed...... 24

3.2 Baseline tiled CMP architecture with a two-level TLB structure...... 26

3.3 Private and shared L2 TLB organizations...... 26

4.1 L1/L2 TLB entry with the extra fields in gray...... 38

4.2 Different outcomes for TLB-to-TLB requests from C0...... 40

4.3 TLB state transition diagram ...... 41

4.4 Shared L2 TLB basic working scheme...... 43

4.5 Block diagram of the general working scheme under a memory operation with DirectoryTLB...... 44

4.6 DirectoryTLB entries, with the extra field required for classification in gray. . . 44

4.7 Coherence recovery mechanism resolved to Private. Page A in the keeper (C0 ) is evicted prior to receiving the Recovery message and thus, the Recovery is resolved to Private and the keeper is updated...... 45


4.8 Distribution of TLB misses resolved by either other TLBs or the page table . . 46

4.9 Improvements in execution time when using TLB-to-TLB transfers ...... 47

4.10 Variations in network traffic when using TLB-to-TLB transfers ...... 47

4.11 TLB misses ending up as page table accesses per 1000 instructions. No classification...... 48

4.12 Execution time normalized to baseline without classification support...... 48

4.13 Normalized traffic attributable to the TLB communication...... 49

4.14 Private/Shared page classification with private TLBs...... 50

4.15 Average directory entries required per cycle normalized to baseline...... 51

4.16 Normalized data L1 misses classified by its cause...... 51

4.17 Network flits injected, classified into cache or TLB-traffic...... 52

4.18 Execution time normalized to baseline when applied to coherence deactivation. 52

4.19 Average normalized execution time depending on the directory size...... 53

4.20 Scalability analysis of TLB-based classification approaches...... 53

5.1 Adaptivity versus Full-adaptivity in TLB-based classification...... 58

5.2 Token Request example in a 4-core CMP...... 60

5.3 TLB and page table entry format. Shaded fields represent the additional fields required by TokenTLB...... 60

5.4 Path that token messages follow after TLB evictions for a 16-core (4x4) mesh interconnect...... 61

5.5 Logical TPB placement with a private two-level TLB organization...... 62

5.6 Private, Shared, and Written page proportion...... 64

5.7 Data L1 Misses proportion classified as Private, Shared-Read-Only and Shared-Written...... 65

5.8 Proportion of TLB Responses issued after an L2 TLB miss...... 66

5.9 Success rate for TPB predictions...... 66

5.10 Relative TLB network traffic issued...... 67

5.11 Average directory entries allocated per cycle ...... 68


5.12 Data L1 Misses classified by its cause...... 68

5.13 Network flits injected, classified into cache- or TLB-traffic...... 69

5.14 Execution time normalized to baseline...... 69

5.15 Dynamic energy consumption normalized to the base system...... 70

5.16 Scalability analysis of classification approaches when applied to coherence deactivation...... 70

6.1 TLB size analysis overview...... 73

6.2 Page idealization: generation and live times...... 75

6.3 Idealized page classification comparison...... 76

6.4 Page lives classification of shared pages...... 77

6.5 TLB state transition diagram with forced-sharing UP ...... 78

6.6 L1 and L2 TLB entries, with the SUP fields in gray...... 78

6.7 Block diagram of the general working scheme with SUP under a memory operation. 79

6.8 L2 TLB classification state diagram with SUP...... 80

6.9 Average time (cycles) from last access to a page in a core to its eviction in the TLB...... 81

6.10 Average time (cycles) between TLB accesses ...... 82

6.11 Cycles spent as private on a global live ...... 82

6.12 Comparative analysis: UP on top of SnoopingTLB or TokenTLB...... 83

6.13 Normalized execution time when increasing core count...... 84

6.14 Network flits issued when increasing core count...... 84

6.15 Private-shared and read-only page classification...... 85

6.16 Proportion of prediction-induced TLB misses...... 86

6.17 Base UP versus Forced-sharing UP...... 87

6.18 Average directory entries per cycle for Base- and Forced-UP...... 87

6.19 Execution time normalized to baseline...... 88

6.20 Private/shared page classification with distributed shared last-level TLB. . . . 89


6.21 Total flits injected with SUP applied to coherence deactivation...... 89

6.22 Average directory entries stored per cycle with SUP...... 90

6.23 Execution time under coherence deactivation normalized to a shared TLB baseline with SUP...... 91

7.1 TLB misses considering its cause...... 94

7.2 Prediction-based classification examples...... 95

7.3 TLB entry format for the cooperative usage predictor (extra fields in grey). . . 96

7.4 Different outcomes for cruise-missile reclassifications initiated by the TLB in core 5. The grey dashed arrows depict the route followed by CMR messages across the virtual ring...... 98

7.5 TLB size and usage prediction analysis...... 100

7.6 Miss-prediction overhead analysis: UP against CUP...... 101

7.7 CMR tryouts that successfully transition the page to private...... 102

7.8 Average number of steps for CMR messages...... 102

7.9 CMR messages issued per TLB miss...... 103

7.10 Proportion of L1 cache flushing misses...... 103

7.11 Execution time normalized to baseline without coherence deactivation. . . . . 104

7.12 Normalized dynamic energy consumption in the cache hierarchy...... 104

List of Tables

2.1 Properties of classification schemes ...... 16

3.1 System parameters for the baseline system...... 25

3.2 Benchmarks and input sizes ...... 29

4.1 Response messages and state transitions for TLB requests ...... 39

4.2 Actions due to TLB-cache inclusion and recovery ...... 41



Abstract

Most of the data referenced by sequential and parallel applications running in current chip multiprocessors are referenced by a single thread, i.e., they are private. Recent proposals leverage this observation to improve many aspects of chip multiprocessors, such as reducing coherence overhead or the access latency to distributed caches. The effectiveness of those proposals depends to a large extent on the amount of detected private data. However, the mechanisms proposed so far either do not consider thread migration and the private use of data within different application phases, or entail high overhead. As a result, a considerable amount of private data is not detected. In order to increase the detection of private data, this thesis proposes a TLB-based mechanism that is able to account for both thread migration and private application phases with low overhead. The classification status in the proposed TLB-based classification mechanisms is determined by the presence of the page translation in other cores' TLBs. The classification schemes are analyzed in multilevel TLB hierarchies, for systems with both private and distributed shared last-level TLBs.

This thesis introduces a page classification approach based on inspecting other cores' TLBs upon every TLB miss. In particular, the proposed classification approach is based on the exchange and counting of tokens. Token counting on TLBs is a natural and efficient way of classifying memory pages. It does not require the use of complex and undesirable persistent requests or arbitration, since when two or more TLBs race for accessing a page, tokens are appropriately distributed, classifying the page as shared.

However, the ability of TLB-based mechanisms to classify private pages is strongly dependent on the TLB size, as it relies on the presence of a page translation in the system TLBs. To overcome this, different TLB usage predictors (UP) have been proposed, which allow a page classification unaffected by the TLB size. Specifically, this thesis introduces a predictor that obtains system-wide page usage information by either employing a shared last-level TLB structure (SUP) or cooperative TLBs working together (CUP).


Resum

La major part de les dades referenciades per aplicacions paral·leles i seqüencials que s'executen en CMPs actuals són referenciades per un sol fil, és a dir, són privades. Recentment, algunes propostes aprofiten aquesta observació per a millorar molts aspectes dels CMPs, com és reduir el sobrecost de la coherència o la latència d'accés a memòries cau distribuïdes. L'efectivitat d'aquestes propostes depèn en gran mesura de la quantitat de dades detectades com a privades. No obstant això, els mecanismes proposats fins a la data no consideren la migració de fils d'execució ni les fases d'una aplicació. Per tant, una quantitat considerable de dades privades no es detecta apropiadament. A fi d'augmentar la detecció de dades privades, aquesta tesi proposa un mecanisme basat en les TLBs, capaç de reclassificar les dades com a privades, i que detecta la migració dels fils d'execució sense afegir complexitat al sistema. Els mecanismes de classificació en les TLBs s'han analitzat en estructures de diversos nivells, incloent-hi sistemes amb TLBs d'últim nivell compartides i distribuïdes.

Aquesta tesi presenta un mecanisme de classificació de pàgines basat en inspeccionar les TLBs d'altres nuclis després de cada fallada de TLB. Concretament, el mecanisme proposat es basa en l'intercanvi i el compte de tokens. Comptar tokens en les TLBs suposa una forma natural i eficient per a la classificació de pàgines de memòria. A més, evita l'ús de sol·licituds persistents o arbitratge, ja que si dues o més TLBs competeixen per a accedir a una pàgina, els tokens es distribueixen apropiadament i la classifiquen com a compartida.

No obstant això, l'habilitat dels mecanismes basats en TLB per a classificar pàgines privades depèn de la grandària de les TLBs, ja que la classificació basada en les TLBs es basa en la presència d'una traducció en les TLBs del sistema. Per a evitar-ho, s'han proposat diversos predictors d'ús en les TLBs (UP), els quals permeten una classificació independent de la grandària de les TLBs. Específicament, aquesta tesi introdueix un predictor que obté informació d'ús de la pàgina a escala de sistema mitjançant un nivell de TLB compartida (SUP) o mitjançant TLBs cooperant juntes (CUP).


Resumen

La mayor parte de los datos referenciados por aplicaciones paralelas y secuenciales que se ejecutan en CMPs actuales son referenciados por un único hilo, es decir, son privados. Recientemente, algunas propuestas aprovechan esta observación para mejorar muchos aspectos de los CMPs, como por ejemplo reducir el sobrecoste de la coherencia o la latencia de los accesos a cachés distribuidas. La efectividad de estas propuestas depende en gran medida de la cantidad de datos que son considerados privados. Sin embargo, los mecanismos propuestos hasta la fecha no consideran la migración de hilos de ejecución ni las fases de una aplicación. Por tanto, una cantidad considerable de datos privados no se detecta apropiadamente. Con el fin de aumentar la detección de datos privados, proponemos un mecanismo basado en las TLBs, capaz de reclasificar los datos a privado, y que detecta la migración de los hilos de ejecución sin añadir complejidad al sistema. Los mecanismos de clasificación en las TLBs se han analizado en estructuras de varios niveles, incluyendo TLBs privadas y con un último nivel de TLB compartido y distribuido.

Esta tesis también presenta un mecanismo de clasificación de páginas basado en la inspección de las TLBs de otros núcleos tras cada fallo de TLB. De forma particular, el mecanismo propuesto se basa en el intercambio y el recuento de tokens (testigos). Contar tokens en las TLBs supone una forma natural y eficiente para la clasificación de páginas de memoria. Además, evita el uso de solicitudes persistentes o arbitraje alguno, ya que si dos o más TLBs compiten para acceder a una página, los tokens se distribuyen apropiadamente y la clasifican como compartida.

Sin embargo, la habilidad de los mecanismos basados en TLB para clasificar páginas privadas depende del tamaño de las TLBs. La clasificación basada en las TLBs se basa en la presencia de una traducción en las TLBs del sistema. Para evitarlo, se han propuesto diversos predictores de uso en las TLBs (UP), los cuales permiten una clasificación independiente del tamaño de las TLBs. En concreto, esta tesis presenta un sistema mediante el que se obtiene información de uso de página a nivel de sistema con la ayuda de un nivel de TLB compartida (SUP) o mediante TLBs cooperando juntas (CUP).


Chapter 1

Introduction

This chapter briefly describes the context in which this dissertation is set and the reasons that have motivated it (Section 1.1). Then, we define the objectives pursued by this dissertation (Section 1.2). Next, we present the contributions that such objectives have originated (Section 1.3). Finally, we outline the structure of the remaining chapters of this dissertation (Section 1.4).

1.1 Context and Motivation

Gordon E. Moore stated in 1965 that the number of transistors in a dense integrated circuit doubles every two years, as transistors get smaller with every successive technology generation. This prediction, referred to as Moore's Law, proved accurate for many decades, during which computers have rapidly evolved.

Currently, most high-performance processors comprise billions of transistors, commonly organized by integrating multi- and many-core systems into a single chip (i.e., chip multiprocessors or CMPs). The core count of these CMPs is growing rapidly to keep pace with Moore's Law. Most architectures already offer quad-core processors, and some reach dozens of cores. For instance, a 72-core CMP has recently been presented [34], and Kalray's MPPA2 integrates up to 256 cores [38]. These large-scale CMPs are organized as tiled architectures, designed as arrays of identical or nearly-identical blocks, namely tiles. CMP tiles comprise one or several cores, one or several levels of caches, and a network interface (router) connecting all tiles in a point-to-point interconnect. The cores of a tiled CMP usually share one or more memory modules, forming a shared-memory multiprocessor. In this context, a cache coherence protocol is required so that memory is accessed consistently from different processors. Hence, as the number of cores rapidly grows, implementing low-latency and scalable coherence protocols in shared-memory CMPs leads to new challenges for future CMP architectures.

On the one hand, snooping-based protocols send broadcast requests to all other processors in the system through a bus or bus-like interconnect. Thus, even though snooping-based protocols provide low-latency cache-to-cache misses, their bandwidth requirements grow exponentially with the core count. Consequently, snooping protocols are only suitable for small core-count CMPs. Nonetheless, many approaches try to reduce the traffic generated [19, 9, 43, 44], but scaling to larger systems is still troublesome.

On the other hand, directory-based protocols send requests to the home directory, which tracks the sharers of each memory block. Then, the directory responds with data or forwards the request to the appropriate processor(s). Therefore, directory-based approaches require less network bandwidth, representing a more scalable solution. However, consulting the directory implies an indirection that is placed in the critical path. Furthermore, the directory structure employed increases on-chip area and leakage power as the core count grows. Many proposals try to address the directory scalability problem [89, 27, 74, 22, 88, 11, 26] in order to support large-scale CMPs.

Another important design aspect of CMPs is the organization of the last-level cache (LLC). In a multi-banked shared LLC, the wire delay of reaching each particular cache bank prevails over the access latency of the bank itself. This design is referred to as a NUCA (Non-Uniform Cache Architecture) organization. NUCA caches are designed to reduce the number of off-chip accesses [41]. However, NUCA access latency is strongly dependent on the cache bank where a particular block is mapped. Consequently, the average access latency to NUCA caches increases with the number of cores, jeopardizing their scalability. Some works have addressed this inefficiency [21, 49, 31, 32].

Among the approaches tackling the scalability problem for future CMPs, some increasingly appealing solutions try to discern the private (i.e., accessed by only one thread) or shared (i.e., accessed by two or more threads) nature of the accessed data, since this knowledge enables a wide range of data optimizations to improve coherence protocols and the organization of the LLC. For instance, Cuesta et al. propose Coherence Deactivation, which identifies private [22] and read-only [23] (non-coherent) blocks and avoids storing those blocks in the directory cache, since they do not require coherence maintenance. Therefore, directories exploit their limited storage capacity more efficiently and their access latency is reduced. Alternatively, Hardavellas et al. [31] and Li et al. [46, 48] keep private blocks in the NUCA bank of the requesting processor to reduce the access latency to NUCA caches. Ros and Kaxiras [71] propose an efficient and simple cache coherence protocol by implementing a write-back policy for private blocks and a write-through policy for shared blocks. Finally, End-to-End SC [79] allows instruction reordering and out-of-order commits of private accesses from the write buffers, since they do not affect the consistency model enforced by the system.

The observation behind these proposals is that most data referenced by both sequential and parallel workloads are private. Therefore, data optimizations are commonly applied to private blocks (i.e., private-based optimizations). As a consequence, the effectiveness of most classification-based data optimizations relies on the amount of private data detected. However, although this is a decisive factor, some previously proposed classification mechanisms do not provide runtime classification (e.g., the classification status is obtained at compile time), which limits the precision of the mechanism. Moreover, some classification approaches do not detect shared-to-private transitions (i.e., non-adaptive classification) [22, 31, 43]. Non-adaptive approaches may misclassify most accessed data as shared at some point in applications running for a long time, ultimately negating the advantages of the classification. Discerning store memory operations allows detecting read-only data, which represent 82 percent (on average) of the memory blocks accessed [23]. Therefore, detecting read-only data significantly improves the potential of a classification mechanism. Finally, some classification approaches store the data sharing status in the cache or the directory structures [11, 25, 33, 62, 88], limiting the applicability of the classification, since the private-shared information is obtained only after the cache miss.


In order to cope with these drawbacks, this thesis introduces a novel family of page classification mechanisms relying on the presence of the page translation in the system translation lookaside buffers (TLBs), namely TLB-based classification. TLB-based classification exploits the proximity of cores in a CMP, scrutinizing the other TLBs in the system through fast core-to-core communication upon every TLB miss and retrieving the page sharing information alongside the page translation entry. This way, TLB-based classification mechanisms accelerate the page table walk process while performing a page-level data classification.

1.2 Objectives

This section presents the objectives of this dissertation. Classification approaches improve system performance and scalability mainly through private-based data optimizations. Thus, the higher the amount of private data detected, the greater the benefits obtained from the optimization applied. The main goal is to avoid all classification overheads while pursuing all desirable properties of classification approaches. In order to achieve this goal, some specific objectives need to be accomplished:

• Design new classification alternatives in order to efficiently perform a runtime classification based on the presence of translation entries in the system TLBs, while accelerating the page translation process through TLB-to-TLB transfers.

• Detect shared-to-private transitions in order to improve private detection according to the current page access pattern, resulting in what we call a temporality-aware classification scheme.

• Thoroughly analyze page lifetimes on TLBs in order to determine an ideal classification, settled by the concurrence of accesses to a page.

• Design a prediction mechanism in order to bring TLB-based classification accuracy closer to the ideal.

• Test the benefits of our classification mechanisms through different data optimization techniques, providing evidence that confirms TLB-based classification as the best suited approach for data classification.

1.3 Thesis Contributions

The aforementioned objectives have originated several contributions, which are briefly summarized in this section.


1.3.1 TLB-based Classification Mechanisms

The TLB-based classification mechanisms proposed in this thesis are a new family of protocols designed to dynamically discern the sharing status of the data based on the presence of page translations in the system TLBs. Therefore, the classification is at page granularity. TLBs are employed to store the classification information. Different TLB-based classification approaches have been proposed:

Snooping TLB-based classification

This thesis rests on a Snooping TLB-based classification, or SnoopingTLB, as a starting point. The goal of SnoopingTLB is to achieve an adaptive classification that accounts for temporarily private pages and tolerates thread migration. The TLB-based classification mechanism is based on inquiring the other cores' TLBs in the system (through TLB-to-TLB requests) on every TLB miss. TLBs reply to the requester indicating whether or not they are caching the page translation, which is included in the response if they do, thus accelerating the page table walk process. This way, the TLB suffering the miss naturally discovers whether blocks belonging to a page may be currently stored in a remote cache, and therefore the page is shared, or, on the contrary, no TLB is currently storing the translation and the page is private. TLB-to-TLB transfers are based upon the observation that core-to-core communication in CMPs is much faster than in traditional multiprocessors. Other works also benefit from this observation with different aims [63, 81]. Snooping TLB-based classification targets purely-private, single or multilevel TLB structures.
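To make the flow concrete, the following C++ sketch shows how a SnoopingTLB miss could be resolved under simplifying assumptions (a single TLB level per core, page-granularity state, no timing or network modeling). The names (CoreTLB, handle_miss, page_table_walk) are illustrative and do not correspond to the actual implementation evaluated in this thesis.

```cpp
#include <cstdint>
#include <optional>
#include <unordered_map>
#include <vector>

enum class PageState { Private, Shared };

struct TlbEntry {
    uint64_t  ppn = 0;                  // physical page number
    PageState state = PageState::Private;
};

// One (single-level) TLB per core; page-granularity classification state.
struct CoreTLB {
    std::unordered_map<uint64_t, TlbEntry> entries;  // vpn -> entry
};

// Fallback page table walk (assumed identity mapping for the sketch).
static uint64_t page_table_walk(uint64_t vpn) { return vpn; }

// Resolve a TLB miss on core 'me' by probing all other TLBs (TLB-to-TLB
// requests). A remote hit both supplies the translation and classifies the
// page as shared in the responder and in the requester.
TlbEntry& handle_miss(std::vector<CoreTLB>& tlbs, int me, uint64_t vpn) {
    std::optional<uint64_t> remote_ppn;
    for (int c = 0; c < static_cast<int>(tlbs.size()); ++c) {
        if (c == me) continue;
        auto it = tlbs[c].entries.find(vpn);
        if (it != tlbs[c].entries.end()) {
            it->second.state = PageState::Shared;   // responder learns it is shared
            remote_ppn = it->second.ppn;            // translation piggybacked in reply
        }
    }
    TlbEntry entry;
    if (remote_ppn) {                               // at least one TLB holds the page
        entry = {*remote_ppn, PageState::Shared};
    } else {                                        // nobody else caches it: private
        entry = {page_table_walk(vpn), PageState::Private};
    }
    return tlbs[me].entries[vpn] = entry;
}
```

In the real mechanism the responses travel over the on-chip network and the page table walk may proceed in parallel, but the classification outcome is the same: the page is private exactly when no other TLB answers with the translation.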

TLB-based classification with a shared last-level TLB

We exploit the use of a distributed shared last-level TLB structure (DirectoryTLB), similar to a NUCA cache organization [42], performing a page-level classification while avoiding most of the TLB traffic overheads of the snooping TLB-based classification approach. Such a distributed TLB organization has been suggested before [47]; however, it has not been extensively explored. The shared TLB structure is accessed after every TLB miss and eviction, offering up-to-date sharing information. Note that this scheme has some similarities to a cache directory in the context of cache coherence, hence its name.

Token-counting TLB-based classification

Token-counting classification, namely TokenTLB, is a novel classification mechanism implemented directly in the TLB structure and inspired by Token Coherence protocols [51, 53, 55]. TokenTLB is based on the observation that, unlike Token Coherence, applying tokens to classification does not require issuing persistent requests or complex arbitration mechanisms. Persistent requests are a special type of request meant to solve races occasionally caused when several cores want to write data at the same time. These requests are a source of complexity for the coherence protocol, possibly being one of the main reasons why Token Coherence has not been implemented in commodity systems. However, TokenTLB avoids these races by only aiming at classifying data. When two or more TLBs race for accessing the same page, tokens are naturally distributed among the TLBs, and the page is consequently classified as shared.


TokenTLB reduces network consumption compared to snooping TLB-based proposals. Only TLBs holding extra tokens provide them along with the page translation, which leads to about one response per TLB miss. Furthermore, token-based classification makes it possible to naturally and immediately identify a shared TLB page entry transitioning to private, improving private detection over any other TLB-based classification approach in this dissertation.
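The sketch below captures the token-counting idea under the simplifying assumption of one token per core and page, ignoring evictions, the token fields added to the page table entries, and the interconnect; all names (TokenTLB, token_miss, kNumCores) are illustrative.

```cpp
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical sketch of token-counting classification: every page owns as
// many tokens as cores; an entry holding ALL tokens is private, an entry
// holding only some of them is shared.
struct TokenEntry {
    int tokens = 0;
};

struct TokenTLB {
    std::unordered_map<uint64_t, TokenEntry> entries;   // vpn -> tokens held
};

constexpr int kNumCores = 4;                             // assumption for the sketch

bool is_private(const TokenTLB& tlb, uint64_t vpn) {
    auto it = tlb.entries.find(vpn);
    return it != tlb.entries.end() && it->second.tokens == kNumCores;
}

// On a TLB miss, core 'me' requests tokens. Each holder keeps one token for
// itself and hands over any spare tokens; 'free_tokens' models tokens that no
// TLB currently holds (in the real scheme, kept in the page table). If the
// requester gathers every token, the page is (or becomes again) private;
// otherwise it is shared.
void token_miss(std::vector<TokenTLB>& tlbs,
                std::unordered_map<uint64_t, int>& free_tokens,
                int me, uint64_t vpn) {
    if (!free_tokens.count(vpn)) free_tokens[vpn] = kNumCores;  // first touch
    int gathered = free_tokens[vpn];
    free_tokens[vpn] = 0;
    for (int c = 0; c < static_cast<int>(tlbs.size()); ++c) {
        if (c == me) continue;
        auto it = tlbs[c].entries.find(vpn);
        if (it != tlbs[c].entries.end() && it->second.tokens > 1) {
            gathered += it->second.tokens - 1;   // holder keeps a single token
            it->second.tokens = 1;
        }
    }
    tlbs[me].entries[vpn].tokens += gathered;
}
```

Because every active holder keeps at least one token, two racing requesters simply split the tokens and both observe the page as shared; only when a single TLB gathers all tokens again does the page transition back to private, with no persistent requests or arbitration involved.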

1.3.2 Prediction-based Classification Mechanisms

TLB-based classification is determined by the presence of page entries in TLBs, despite the fact that they may have ceased to be accessed. This makes classification accuracy sensitive to the TLB size. Ideally, the sharing condition of a page should be settled by the concurrence of accesses to that page. Therefore, decoupling classification from TLB size in TLB-based classification mechanisms requires a mechanism able to predict whether a page is going to be accessed or not in the near future. To this aim, different prediction approaches have been proposed.

Usage predictor for TLBs (UP)

This prediction strategy represents another starting point of this thesis, resembling the one employed in the Cache Decay approach [40]. UP determines whether or not a page is going to be accessed in the near future. In particular, the predictor uses one saturating counter per TLB entry that is periodically increased according to an internal period (i.e., the predictor period) and reset on every memory access to the page. When a TLB entry is predicted not to be in use (its counter is saturated) and it is probed by a TLB-to-TLB request, the entry is invalidated. When a TLB entry is invalidated, the core is disqualified as a potential sharer of the page. This way, the usage predictor effectively increases the amount of private data detected for large or multilevel TLBs.
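A minimal sketch of such a decay-style usage predictor is shown below, assuming a 2-bit saturating counter per entry; the structure and function names are illustrative rather than the thesis implementation.

```cpp
#include <cstdint>
#include <unordered_map>

// Sketch of a per-entry usage (decay-style) predictor, in the spirit of
// Cache Decay [40]. Field and function names are illustrative.
struct PredictedEntry {
    uint64_t ppn     = 0;
    uint8_t  counter = 0;               // 2-bit saturating counter (0..3)
};

constexpr uint8_t kSaturated = 3;

struct UsagePredictorTLB {
    std::unordered_map<uint64_t, PredictedEntry> entries;  // vpn -> entry

    // Every memory access to the page resets its counter: the page is in use.
    void on_access(uint64_t vpn) {
        auto it = entries.find(vpn);
        if (it != entries.end()) it->second.counter = 0;
    }

    // Invoked once per predictor period: age every resident entry.
    void on_period_tick() {
        for (auto& kv : entries)
            if (kv.second.counter < kSaturated) ++kv.second.counter;
    }

    // A remote TLB-to-TLB probe reaches an entry predicted unused: invalidate
    // it and report "not a sharer", so the requester may classify the page as
    // private even though the translation was still resident here.
    bool on_remote_probe(uint64_t vpn) {
        auto it = entries.find(vpn);
        if (it == entries.end()) return false;          // not present: not a sharer
        if (it->second.counter >= kSaturated) {
            entries.erase(it);                          // predicted dead: drop it
            return false;
        }
        return true;                                    // still in use: sharer
    }
};
```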

Forced-sharing usage predictor

Employing shorter periods for usage prediction may prematurely invalidate more translations, which in turn causes the TLB miss rate to soar, producing more network inquiries and, ultimately, invalidating even more TLB entries. It can be seen as a positive-feedback situation, or as ping-pong invalidations when only two nodes are involved.

In order to avoid ping-pong invalidations, which could make the TLB usage predictor unattractive, a slight modification of the prediction strategy is introduced, namely Forced Sharing. The forced-sharing strategy avoids prediction-induced invalidations after the first prematurely invalidated entry is detected. A translation entry is considered prematurely invalidated when the page is reaccessed soon after being invalidated (i.e., the invalidated entry is still in the TLB). In order to avoid further mispredicted page invalidations in other cores' TLBs, a special forced-sharing TLB miss request is sent. This request overrides the prediction and grants a classification based solely on the presence of page translation entries in other TLBs.


Shared usage predictor (SUP)

As part of our proposed DirectoryTLB classification scheme for shared last-level TLBs, we attune UP to the new environment, namely the Shared Usage Predictor (SUP). The key observation in this work is that invalidations caused by the previous predictors do not necessarily lead to a higher detection of private pages. In fact, many TLB entry invalidations just reduce the number of sharers, indiscriminately increasing the TLB miss rate without the certainty of improving the amount of private data. Hence, UP is applied blindly, without considering how the classification status will evolve, despite being supposedly at its service.

SUP relies on the shared second-level TLB in order to track the sharer count, and announces a page sharer falling into disuse to the home TLB tile as soon as it is predicted to be so. This way, translation invalidations are only performed when the shared second-level TLB discerns a reclassification opportunity, improving prediction accuracy, acting as a natural filter for blind TLB invalidations, and avoiding most prediction overheads.
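The following sketch illustrates, under assumptions of our own (a single home slice, no timing, illustrative names such as SharedTlbHome and on_disuse_notice), how the home tile of the shared last-level TLB could filter disuse notices and trigger invalidations only when they can actually turn a page private.

```cpp
#include <cstdint>
#include <set>
#include <unordered_map>

// Sketch of the SUP idea: local predictors only *announce* disuse to the home
// slice of the shared last-level TLB, which invalidates translations only when
// doing so can reclassify the page as private. Names are illustrative.
struct SharedTlbHome {
    // vpn -> cores currently holding the translation / cores announced as idle
    std::unordered_map<uint64_t, std::set<int>> sharers;
    std::unordered_map<uint64_t, std::set<int>> idle;

    void on_l1_fill(uint64_t vpn, int core) {
        sharers[vpn].insert(core);
        idle[vpn].erase(core);                   // (re)use cancels a disuse notice
    }

    // A core's local usage predictor saturated for this page. Returns the set
    // of cores whose TLB entries should be invalidated, which is non-empty
    // only when the page can become private afterwards.
    std::set<int> on_disuse_notice(uint64_t vpn, int core) {
        idle[vpn].insert(core);
        std::set<int> active;
        for (int c : sharers[vpn])
            if (!idle[vpn].count(c)) active.insert(c);
        if (active.size() > 1) return {};        // still truly shared: do nothing
        std::set<int> to_invalidate = idle[vpn]; // reclassification opportunity
        for (int c : to_invalidate) sharers[vpn].erase(c);
        idle[vpn].clear();
        return to_invalidate;
    }
};
```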

Cooperative usage predictor (CUP)

SUP averts invalidating translations based only on their local usage prediction by relying on the information stored in the classification directory (i.e., a shared last-level, NUCA-like TLB level). Similarly, in order to improve the prediction accuracy for purely-private TLB structures, the Cooperative Usage Predictor (CUP) has been proposed. CUP exploits TLB cooperation (i.e., TLBs willingly working together for a common purpose or benefit) in order to: (i) perform a system-wide page usage prediction instead of a per-core prediction, naturally discovering reclassification opportunities without relying on other TLBs requesting the page or on the presence of a shared last-level TLB; and (ii) avoid most aimless TLB entry invalidations, as invalidation is postponed until it may expressly serve a better data classification, avoiding performance degradation.

1.4 Thesis Outline

This dissertation starts with this introductory chapter (Chapter 1). Next, Chapter 2 describes the background of data classification, presenting the main classification alternatives constituting the current state of the art. The simulation environment is described in Chapter 3. Chapter 4 presents and evaluates snooping TLB-based classification approaches for different single and multilevel TLB structures. Then, Chapter 5 introduces the token-counting TLB-based classification approach. Usage prediction is first introduced in Chapter 6 and extended with our Cooperative Usage Predictor in Chapter 7. This dissertation ends with Chapter 8, summarizing the conclusions and presenting the main contributions to the research field.

Chapter 2

Concepts and Background

This chapter presents the basics and terminology of private-shared data classification and private-based optimizations. For the sake of brevity, we cover the main concepts, giving insight into the classification approaches assumed in this thesis.

2.1 Introduction

First, this chapter describes some private-based data optimizations to show the interest of performing a private-shared classification (Section 2.2). Next, we discuss the state-of-the-art classification alternatives (Section 2.3) and how they fit the desirable features of classification mechanisms (Section 2.3.1). Finally, some virtual address space concepts are introduced, along with some techniques to improve TLB performance, in Section 2.4.

2.2 Classification-based Optimizations

Data classification mechanisms are gaining interest as they enable many optimizations in the hardware management of CMP components, such as caches, the interconnect, or coherence protocols, based on the sharing status of data. For instance, while all coherence protocols assume that any accessed data may be shared at any time, data do not require coherence maintenance while they are private. There are many other recent examples in the literature showing the large variety of optimizations enabled by a classification scheme that discerns the private (or read-only) nature of data. This section describes some of the key classification-based data optimizations.


2.2.1 Interconnect Optimizations

In bus-snooping protocols there is no tracking information about block sharing, and thus every cache coherence request needs to be broadcast to all nodes. Snooping-based protocols are therefore commonly faster than directory solutions for cache coherence, since they avoid indirection. However, their main disadvantage is their limited scalability. On the one hand, frequent snooping causes message races, which increase cache access time and energy consumption. On the other hand, the network bandwidth required for broadcasts grows rapidly with the number of cores.

RegionScout

In snooping-based protocols, broadcasting for private memory is unnecessary, as it will result in a global miss. Moshovos [57] proposed RegionScout, a family of filters for snooping coherence protocols that avoid requesting, receiving, or forwarding non-coherent data through broadcast messages when accessing private blocks, ultimately reducing network bandwidth usage, latency, and energy consumption.

RegionScout filters comprise two structures local to each node: (i) a not-shared region table (NSRT), a very small, cache-like structure storing regions (contiguous, power-of-two-sized memory ranges) that are known to be not shared; and (ii) a cached region hash (CRH), a filter that records a superset of all regions locally cached.

Whenever a node issues a memory request for a block for the first time, the other nodes respond as the coherence protocol dictates, but include information regarding any other cached blocks from the same region as the requested block. If the responses resolve the region as not shared, the node that caused the miss records the region in its NSRT. Subsequent misses within the same region by the same node will result in private accesses, avoiding the coherence broadcast. Then, if a different node requests a block within the same region (which would miss in its own NSRT), the existing NSRT entry must be invalidated in order to ensure correctness.

The CRH structure represents an up-to-date, imprecise record of the regions that are locally cached. Basically, the CRH keeps a counter per hashed region, which is increased after every L2 cache miss within a given region and decreased after every L2 cache eviction. This way, after receiving a coherence request, a node consults its CRH to know whether the access results in a region miss or whether it may have blocks of that region stored.
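As an illustration, the sketch below models the two RegionScout filters of a node; the region size, the CRH hash, and all names (RegionScoutNode, may_cache_region, ...) are assumptions made for the example.

```cpp
#include <cstdint>
#include <unordered_set>
#include <vector>

// Sketch of the two per-node RegionScout filters described above.
constexpr uint64_t kRegionBits = 12;            // e.g., 4 KB regions (assumption)
constexpr size_t   kCrhEntries = 256;           // CRH size (assumption)

struct RegionScoutNode {
    // NSRT: regions this node knows nobody else caches.
    std::unordered_set<uint64_t> nsrt;
    // CRH: one counter per hashed region, a superset of locally cached regions.
    std::vector<uint32_t> crh = std::vector<uint32_t>(kCrhEntries, 0);

    static uint64_t region_of(uint64_t addr)  { return addr >> kRegionBits; }
    static size_t   crh_index(uint64_t region) { return region % kCrhEntries; }

    // Local L2 fill / eviction keep the CRH counters up to date.
    void on_l2_fill(uint64_t addr)  { ++crh[crh_index(region_of(addr))]; }
    void on_l2_evict(uint64_t addr) { --crh[crh_index(region_of(addr))]; }

    // Snoop side: "might this node cache any block of the region?"
    bool may_cache_region(uint64_t addr) const {
        return crh[crh_index(region_of(addr))] != 0;
    }

    // Requester side: a broadcast can be skipped for regions recorded in the NSRT.
    bool is_region_private(uint64_t addr) const {
        return nsrt.count(region_of(addr)) != 0;
    }
    void record_not_shared(uint64_t addr) { nsrt.insert(region_of(addr)); }
    void drop_not_shared(uint64_t addr)   { nsrt.erase(region_of(addr)); }
};
```

On a miss, the requester first checks its NSRT and skips the broadcast on a hit; otherwise, the snoop responses (each remote node answering with its may_cache_region result) determine whether the region can be recorded as not shared.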

Subspace Snooping

Kim et al. [43] propose Subspace Snooping, which extends RegionScout by tracking the sharers of a page, so that snoop requests are sent only to the nodes in the subspace through a multicast message. Subspace Snooping maintains the page status by combining updates of page table entries and TLB entries. Therefore, it allows fast access to the sharing information without requiring extra hardware support. Furthermore, the sharing information is extended with a bitmap (i.e., a sharing vector) tracking the sharers of each page. After every TLB miss, the fine-grain sharing information is updated both in the page table and in the TLBs.

If a page ceases being accessed from a given core (e.g., all its cached blocks are evicted), the information kept in the sharing vector may be outdated, as it still accounts for the obsolete sharer. This information is considered subspace pollution and needs to be dynamically addressed. To do so, a TLB may remove itself from the sharing vector after evicting the last cache line of a given page from the local cache. The corresponding TLB entry must also be invalidated. To check whether a page is cached or not, a counting Bloom filter-based technique [18] is employed.

2.2.2 Cache Optimizations

This section presents some cache optimizations whose aim is to improve cache access latency and avoid address translation latency by discerning the sharing nature of the accessed data.

Reactive NUCA

While a NUCA cache is a common design for last-level caches (LLC), increasing on-chip wire latency may as much as halve the potential performance on server workloads. Hardavellas et al. [31] and Kim et al. [46, 48] propose a mechanism where private blocks are kept in the local (or a neighboring) NUCA bank, and distant shared blocks are replicated in order to reduce access latency and balance capacity constraints. Consequently, block mapping in R-NUCA is guided by its classification:

• Logically divides the LLC into overlapping clusters of neighboring slices, replicating instructions at cluster granularity. The cluster size allows trading off access latency to instruction blocks for cache capacity. Small clusters provide low-latency access at the cost of higher data replication and thus lower capacity per cluster. Conversely, large clusters result in higher access latency but a lower replication degree. Instead of standard address interleaving, the instruction clustering technique of R-NUCA indexes blocks into a size-4, fixed-center cluster employing rotational interleaving, where each core is assigned a rotational ID (RID).

• Maps private data into the local LLC slice of the requesting core. Logically, this can be seen as a size-1 cluster mapping.

• Shared read-write data is placed at fixed address-interleaved locations in a size-N cluster (where N is the size of the system).

PS-TLB

Y. Li et al. [47] avoid pipeline stalls when obtaining the virtual-to-physical address translation by introducing a small buffer structure close to the TLB, namely the partial sharing buffer (PSB). Private translations are stored locally for low-latency access. Then, when a page becomes shared, the translation is distributed, as in NUCA memories, among all cores' PSBs. Therefore, a shared translation is obtained with lower latency and fewer storage resources than in a second-level TLB scheme.

The PS-TLB architecture works as follows. Upon a request for a virtual address translation, the L1 and L2 TLBs are checked sequentially in order to retrieve the physical address, which is obtained after hitting in either TLB level. Conversely, if a miss occurs in the L2 TLB, the translation may be present either in the PSB home tile or in the page table (or both), and thus both are searched in parallel. If the PSB home hits on a valid entry, it replies to the requester with the corresponding translation. In that case, the translation from the page table is simply discarded when received at the requesting core. If the request misses in the PSB home, the translation is resolved by the page table and is stored in the local private TLB hierarchy if the page is private, or in both the TLB structure and the PSB home if the page is shared. This way, cores without the specific translation are not stalled upon TLB shootdowns, avoiding some of the shootdown process overhead. Furthermore, PSBs are not flushed on shootdowns, and their entries may be employed when a page is reaccessed after a context switch or thread migration.
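The lookup order described above can be summarized by the following sketch, which assumes simple map-based TLB and PSB structures and sequential (untimed) probing; names such as PsTlbNode, PsbHome and translate are illustrative.

```cpp
#include <cstdint>
#include <optional>
#include <unordered_map>

// Sketch of the PS-TLB lookup flow: L1 TLB, L2 TLB, then PSB home and page
// table probed in parallel. Structures and names are illustrative.
struct Translation { uint64_t ppn; bool shared; };

struct PsTlbNode {
    std::unordered_map<uint64_t, Translation> l1_tlb, l2_tlb;
};

struct PsbHome {
    std::unordered_map<uint64_t, Translation> psb;   // shared translations only
};

static Translation page_table_walk(uint64_t vpn) { return {vpn, false}; }

Translation translate(PsTlbNode& node, PsbHome& home, uint64_t vpn) {
    if (auto it = node.l1_tlb.find(vpn); it != node.l1_tlb.end()) return it->second;
    if (auto it = node.l2_tlb.find(vpn); it != node.l2_tlb.end()) return it->second;

    // L2 TLB miss: PSB home and page table are probed in parallel; the walk
    // result is discarded if the PSB already holds a valid (shared) entry.
    std::optional<Translation> from_psb;
    if (auto it = home.psb.find(vpn); it != home.psb.end()) from_psb = it->second;
    Translation walked = page_table_walk(vpn);

    Translation result = from_psb ? *from_psb : walked;
    node.l2_tlb[vpn] = result;                        // fill the private hierarchy
    node.l1_tlb[vpn] = result;
    if (result.shared) home.psb[vpn] = result;        // shared pages also live in the PSB
    return result;
}
```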

2.2.3 Coherence Protocol Optimizations

The rapid growth of core counts in current CMPs, alongside the implementation of a shared-memory programming model, requires efficient coherence maintenance for the data in private caches. These circumstances bring new challenges to make coherence protocols scalable while providing increasing performance. This section reviews some proposals based on the characterization of the accessed data.

Coherence Deactivation

Directory-based coherence is one of the best suited protocols in terms of scalability and is present on most commodity processors nowadays. The directory is in charge of tracking data sharers in order to invalidate outdated copies on writes, and obtain the most recent copy on reads. This way, sequential consistency is granted for directory-based protocols. However, on larger CMPs, the directory cache may suffer from scalability problems. Directory area and latency overheads increase in order to avoid evictions, as the eviction of a directory entry usually entails the invalidation of blocks on the lower memory hierarchy levels. Due to the limited size or associativity of directory caches or the lack of a backup directory, a system may produce frequent invalidations of directory entries, which dramatically increases the number of Coverage misses [67] (i.e., cache misses caused by invalidations on the directory cache due to its limited capacity) and, therefore, results in performance degradation.

However, the coherence problem arises only for shared resources (present in multiple local caches). Thus, if a block is known to be present only in a single private cache (i.e., private), coherence maintenance may be bypassed. In this regard, Coherence Deactivation identifies private [22] and read-only [23] (non-coherent) blocks, and avoids storing those blocks in the directory cache. Therefore, directories exploit their limited storage capacity more efficiently as the classification mechanism becomes more accurate, while the availability of the directory is improved for blocks that really need coherence (i.e., shared blocks).

Since coherence deactivation overrides the coherence protocol for non-coherent accesses, a recovery operation is required when a page becomes coherent again, in order to avoid inconsistencies. Recovery is a costly system-wide operation in charge of atomically updating the TLBs' sharing status and returning to a coherent state. To this end, the blocks of a non-coherent page that becomes coherent must be either evicted from the cache (flushing-based recovery) or updated in the directory cache (update-based recovery). Once the recovery mechanism has finished, the directory cache is in a coherent state according to the new page classification.
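A possible way to picture the interaction between the classification and the directory is sketched below, assuming a four-state page classification and a flushing-based recovery; the interfaces shown are illustrative, not the actual Coherence Deactivation hardware.

```cpp
#include <cstdint>
#include <unordered_set>
#include <vector>

// Sketch: the directory consults the page-level classification to skip
// tracking non-coherent (private or read-only) blocks.
enum class PageClass { PrivateRO, PrivateRW, SharedRO, SharedRW };

inline bool needs_coherence(PageClass c) { return c == PageClass::SharedRW; }

struct Directory {
    std::unordered_set<uint64_t> tracked;   // blocks holding a directory entry

    // The classification travels with the request (it is known at the TLB,
    // before the miss reaches the directory), so non-coherent blocks simply
    // never allocate an entry.
    void on_miss(uint64_t block_addr, PageClass page_class) {
        if (needs_coherence(page_class)) tracked.insert(block_addr);
    }
};

// Flushing-based recovery: when a previously non-coherent page becomes
// coherent (e.g., a private page is accessed by a second core), every cached
// block of that page is evicted so the directory restarts from a clean state.
template <typename FlushFn>
void recover_page(const std::vector<uint64_t>& blocks_of_page, FlushFn flush_block) {
    for (uint64_t block : blocks_of_page) flush_block(block);
}
```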


VIPS

Ros and Kaxiras [71] propose an efficient and simple directory-less and broadcast-less cache coherence protocol which requires only three stable states, namely VIPS (i.e., Valid/Invalid Private/Shared). Although sequential consistency is not supported with VIPS for all applications, its definition is still satisfied for Data-Race-Free (DRF) applications.

VIPS implements a dynamic write policy: a simple write-through policy (i.e., store operations traverse the underlying storage before completion is confirmed) for shared blocks and an efficient write-back policy (i.e., store operations are directed to the cache and completion is immediately confirmed) for private blocks, which represent the great majority of write misses on commercial workloads. Then, in a second step, VIPS-M requires synchronization points with acquire semantics (e.g., lock, barrier, wait) in DRF applications to be exposed to the hardware in order to selectively flush (i.e., self-invalidate) shared data (employing a private-shared classification scheme). This way, the requirement for tracking block readers is avoided. When a node writes into shared data, the sharers will self-invalidate at the next synchronization point. Then, with the addition of self-downgrade (i.e., write-back) upon a release synchronization (e.g., unlock, barrier, signal), the need for a directory cache or snooping broadcast messages is completely averted.
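The dynamic write policy can be sketched as follows, assuming a map-based L1 and leaving out the actual LLC interaction and the DRF synchronization detection; all names are illustrative, and this is not the VIPS implementation itself.

```cpp
#include <cstdint>
#include <unordered_map>

// Sketch of a VIPS-style dynamic write policy driven by a private/shared
// classification: write-back for private lines, write-through for shared ones,
// and self-invalidation/self-downgrade of data at synchronization points.
enum class Sharing { Private, Shared };

struct L1Line { bool dirty = false; Sharing sharing = Sharing::Private; };

struct VipsL1 {
    std::unordered_map<uint64_t, L1Line> lines;   // block address -> line state

    // Store: private lines coalesce locally (write-back policy); shared lines
    // are forwarded to the LLC immediately (write-through policy).
    // Returns true when the write must reach the LLC now.
    bool store(uint64_t addr, Sharing s) {
        L1Line& line = lines[addr];
        line.sharing = s;
        line.dirty   = (s == Sharing::Private);
        return s == Sharing::Shared;
    }

    // Acquire synchronization (lock, barrier, wait): self-invalidate shared
    // lines so later reads re-fetch them, removing the need to track readers.
    void on_acquire() {
        for (auto it = lines.begin(); it != lines.end();)
            it = (it->second.sharing == Sharing::Shared) ? lines.erase(it) : ++it;
    }

    // Release synchronization (unlock, signal): self-downgrade, i.e., write
    // back any remaining dirty data to the LLC.
    template <typename WriteBackFn>
    void on_release(WriteBackFn write_back) {
        for (auto& kv : lines)
            if (kv.second.dirty) { write_back(kv.first); kv.second.dirty = false; }
    }
};
```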

This coherence scheme simplifies the coherence protocol, resulting in less static and dynamic energy consumption compared to conventional MESI directory protocols. Nonetheless, any synchronization that relies on races (e.g., spin-waiting races, which are employed for signaling, locking, and barrier primitives) compromises the effectiveness of the protocol due to the lack of explicit invalidations, as it repeatedly self-invalidates and re-fetches in order to obtain the spin flag value updated by writes. To solve this, a new, simple, and transparent callback mechanism has been proposed [72]; callbacks are set by reads involved in spin waiting and satisfied by writes. A callback blocks any read to a variable that is waiting for a write while that write has not occurred since the callback's last invocation. This way, the benefits of explicit invalidation are obtained while the cost of an invalidation protocol (e.g., employing a directory to track all data) is avoided.

Timestamp-based coherence

Dealing with the scalability issues of directory protocols, timestamp-based hardware coherence protocols present an appealing alternative.

TC-Strong. This is a time-based coherence protocol for GPUs, namely Temporal Coherence (TC) [80], based on globally synchronized counters. With TC-Strong, these synchronized counters are maintained in the GPU cores and L2 controllers, allowing cache blocks to self-invalidate while maintaining coherence, thus eliminating coherence traffic and reducing area overhead and protocol complexity. Small timestamp fields are added to the L1 and L2 cache entry format. The L1 timestamp indicates when a specific L1 entry is invalidated, while the L2 timestamp indicates when all L1 cache lines have been self-invalidated (L2 entries are never evicted while their timestamp has not expired). TC-Strong implements an optimization to eliminate write stalling for private data. An L2 line is considered private only while it is read by a single core. Writes to private L2 lines are only effective if they come from the core that originally performed the read. To this end, store requests from the L1 include their local timestamps, which are matched against the global timestamp in the L2 to check whether a core is performing a private write.
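The sketch below conveys the general timestamp idea (leases granted by the L2, self-invalidation by expiration, and unstalled writes for private lines) under assumptions of our own regarding lease handling; names and interfaces are illustrative and simplified with respect to TC-Strong.

```cpp
#include <cstdint>
#include <unordered_map>

// Sketch of timestamp-based self-invalidation: every L1 line carries a lease
// granted by the L2; reads past the lease are treated as misses, so no
// invalidation messages are needed. Lease length and names are assumptions.
using Time = uint64_t;

struct L1TimedLine { uint64_t data = 0; Time expires = 0; };

struct TimestampedL1 {
    std::unordered_map<uint64_t, L1TimedLine> lines;

    bool read_hit(uint64_t addr, Time now) const {
        auto it = lines.find(addr);
        return it != lines.end() && now < it->second.expires;  // expired => self-invalidated
    }

    void fill(uint64_t addr, uint64_t data, Time now, Time lease) {
        lines[addr] = {data, now + lease};
    }
};

struct TimestampedL2Line {
    uint64_t data = 0;
    Time     max_l1_expiry = 0;    // when all L1 copies have self-invalidated
    int      reader_core   = -1;   // single reader while the line is private
};

// Private-write optimization: a store need not stall until the L1 copies
// expire when it comes from the only core that has read the line.
bool write_must_stall(const TimestampedL2Line& line, int writer, Time now) {
    if (line.reader_core == writer) return false;      // private to the writer
    return now < line.max_l1_expiry;                   // wait for shared copies to expire
}
```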

TC-Release++. TC-Release [87] is a port of TC to maintain coherence in general-purpose CMPs. With TC-Release, private cache lines in write-back caches do not need to maintain timestamps and self-invalidate upon expiration, leading to a higher L1 cache hit rate. This observation is further extended to shared read-only lines, as they do not require coherence maintenance. Therefore, all private or read-only L1 cache lines are kept in the cache as long as possible.

2.2.4 Memory Consistency Models

The memory consistency model of a shared-memory multiprocessor provides a formal specifi- cation of how the memory system will appear to the programmer, eliminating the gap between the behavior expected by the programmer and the actual behavior supported by the system.

End-To-End sequential consistency

Sequential consistency (SC) is arguably the most intuitive behavior for a shared-memory multithreaded program. A multiprocessor is sequentially consistent if the result of any execution is the same as if the operations of all the processors were executed in some sequential order in which the operations of each individual processor appear in program order, i.e., the execution behaves as an interleaving of the memory accesses from its threads. Furthermore, language-level SC improves programmability on a CMP. Modern programming languages such as C++ or Java provide SC only for DRF programs, while for racy programs only weak consistency models are provided.

End-to-End SC [79] allows instruction reordering and out-of-order commits from the reorder buffer (ROB) of private accesses held in the store buffers, since they do not affect the consistency model enforced by the system. It also allows a load to a shared read-only location to commit from the ROB without waiting for the store buffer to drain.

Racer: TSO Consistency

TSO (Total Store Ordering) is the memory model currently employed in one of the most common families of processors (x86). With TSO there is always a single, global order of writes to shared memory from all cores.

Racer [73] is a coherence approach based on self-invalidation and self-downgrade that provides TSO consistency. This is guaranteed by detecting read-after-write races (RAWR) at run time and treating them as synchronization. Since private data do not affect consistency, Racer causes self-invalidations of only the shared data in the local cache of a racing reader. On the other hand, while private stores are free to coalesce in the local cache, Racer employs a store buffer to coalesce shared stores while keeping the order seen by other cores. Therefore, when a racy read occurs, all stores previous (in program order) to the conflicting one are seen by the reader. Finally, private accesses are not tracked for race detection, thus incurring small hardware requirements and avoiding false positives in the RAWR Bloom filters. Overall, Racer provides TSO consistency, outperforming even state-of-the-art SC-for-DRF models.


2.3 Data Classification Mechanisms

All the approaches seen in the previous section require a classification mechanism to perform their data management optimization. This section first discusses some desirable features of classification schemes which are key for their effectiveness. Then, we describe the main state-of-the-art classification mechanisms and how they cope with those features.

2.3.1 Desirable Features of Classification

Firstly, the classification should be performed with low overhead in terms of traffic, performance, and area. Although this is key for every classification approach, there are some other properties which are decisive in order to make classification attractive:

• Knowing the classification before the memory access reaches the cache structure is critical to the applicability of the classification mechanism. This a-priori knowledge of the data sharing status is a requirement for many data optimizations (e.g., Coherence Deactivation [22] or Reactive NUCA [31]).

• A block is commonly accessed concurrently from two or more cores (i.e., shared) only during a particular period, becoming private again after some time, since cache lines are evicted after inactivity periods. A classification approach should be able to detect data transitioning back to private, or else most data would end up classified as shared in long-running applications, misclassified due to temporal data access patterns. This property, namely classification adaptivity, permits invoking the same schedulable entity (e.g., an application's thread) from different system components (i.e., thread migration) without incurring misclassified shared accesses.

• Besides the private and shared characterization, detecting non-written regions of data (i.e., read-only classification) provides more insight, which ultimately allows better performance for applications of the classification that support read-only data optimizations. Therefore, when a store memory operation occurs, the sharing status evolves into a written state. The private-shared dichotomy is thus extended to four states: Private Read-only (PR), Private Read-Write (PW), Shared Read-only (SR), and Shared Read-Write (SW); a small sketch of these states and their transitions is shown after this list.

• Runtime classification information is decisive, as privacy cannot always be guaranteed at compile time. Complex compiler analyses (e.g., data reuse analysis or data disambiguation, among others) are commonly employed to guarantee the privacy of the accessed data. Conversely, runtime classification performs an accurate, efficient data classification.
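
The following sketch, referenced from the read-only bullet above, illustrates the four-state (PR, PW, SR, SW) characterization; the per-page record and the transition rule are simplifications assumed for this example, not any particular published proposal.

// Illustrative four-state sharing classification.
enum class Sharing { PrivateReadOnly, PrivateWritten, SharedReadOnly, SharedWritten };

struct PageInfo {
    Sharing state = Sharing::PrivateReadOnly;
    int     owner = -1;   // first accessing core
};

// Update the page state on every access; 'isStore' marks write accesses.
// Without an adaptivity mechanism, the state never moves back towards the
// private or read-only side once it has left it.
void classifyAccess(PageInfo &p, int core, bool isStore) {
    if (p.owner == -1) p.owner = core;   // first touch: the page starts as private
    bool shared  = (core != p.owner) ||
                   p.state == Sharing::SharedReadOnly ||
                   p.state == Sharing::SharedWritten;
    bool written = isStore ||
                   p.state == Sharing::PrivateWritten ||
                   p.state == Sharing::SharedWritten;
    p.state = shared ? (written ? Sharing::SharedWritten : Sharing::SharedReadOnly)
                     : (written ? Sharing::PrivateWritten : Sharing::PrivateReadOnly);
}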


Figure 2.1: State transition diagram for OS-based classification status.

2.3.2 OS-based Classification

Some mechanisms classify at page level aided by the operating system (OS) [22, 23, 31, 43, 47]. OS-based mechanisms do not require additional hardware support because they take advantage of existing OS structures (i.e., the page table and the TLBs). An OS-based classification considers a page as private on a page table fault. Then, the first time the virtual-to-physical address translation is requested after a TLB miss, the requesting core is annotated in the page table (the keeper [22] or FAC, first accessing core, [47] field). On subsequent accesses to the page table entry, the keeper field is compared with the current requester. If the keeper field matches, it means that the keeper's TLB suffered an eviction and the page may continue as private. Conversely, if the keeper field does not match on a private page table entry, the page is reclassified as shared. To this end, each page table entry adds a P bit that indicates the page state (private or shared). The P bit is also included in the TLB entries to allow fast access to the page state for those cores that have the page entry in the TLB. When a page changes from private to shared, the core having the page as private must be notified in order to update the sharing status of its TLB accordingly. Moreover, OS-based classification has also been extended to support shared read-only data [23].
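
A compact sketch of the decision taken by such an OS-based scheme on a TLB miss follows; the PageTableEntry fields and the notification flag are hypothetical names used only to mirror the keeper/P-bit logic described above.

// Hypothetical page-table entry fields used by an OS-based classification (sketch).
struct PageTableEntry {
    int  keeper    = -1;    // first accessing core (keeper/FAC field)
    bool isPrivate = true;  // P bit: page currently classified as private
};

// Returns whether the page remains private after this TLB miss. When the page
// turns shared, the previous keeper must be notified so that it updates the
// sharing status cached in its own TLB entry.
bool onTlbMiss(PageTableEntry &pte, int requestingCore, bool &mustNotifyKeeper) {
    mustNotifyKeeper = false;
    if (pte.keeper == -1) {              // first access after the page-table fault
        pte.keeper = requestingCore;
        return true;
    }
    if (!pte.isPrivate) return false;    // already shared: nothing changes
    if (pte.keeper == requestingCore)    // the keeper's TLB simply evicted the entry
        return true;
    pte.isPrivate = false;               // a different core touched a private page
    mustNotifyKeeper = true;
    return false;
}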

Figure 2.1 shows the state transitions of the sharing status for a page in an OS-based classification approach. The left side of the diagram corresponds to the basic private and shared states, and the right side introduces the written states when read-only data is accounted for. Note that once a page becomes either shared or written, it remains so for the rest of the execution time (until the page is evicted from memory), since OS-based classification is mostly non-adaptive.

2.3.3 Directory-based Classification

Directory-based mechanisms [62, 33, 11, 88, 24] store the sharing status either in the directory or in the last-level cache structure (LLC-based mechanisms), by adding some additional states to the corresponding memory. Figure 2.2 represents the state diagram for SWEL [62], a cache coherence protocol that discerns five combinations of the states Shared, Written, and Exclusivity Level. SWEL employs an LLC-based classification mechanism to discern the sharing status. Private data is stored in the L1 cache, while shared data resides in the shared L2 cache. The sharing status is determined at block level after the cache structure is accessed. The first time a block transitions to Shared, the L1 cache copies of the block are invalidated through a broadcast message. Once a block's classification transitions to either Shared or Written, it remains so until its eviction (i.e., until the block's generation time ends). Then, the classification becomes private again if the block is reaccessed.

Figure 2.2: State transition diagram for the SWEL protocol.

These proposals allow performing a runtime, software-transparent classification at page or cache-line granularity. Cache-line granularity avoids spatial aliasing problems, in which data may be considered shared under coarser-granularity schemes even though the blocks themselves are never concurrently accessed. However, directory-based approaches might add prohibitive storage requirements or have to deal with dual-granularity complexities.

Directory-based classification mechanisms commonly require explicit notification to the directory upon L1 cache misses and replacements in order to accurately determine the generation of every cache line and classify accordingly [24]. In this scheme, shared lines in the directory may be evicted without invalidating cache lines (i.e., a self-contained directory). This allows smaller, yet efficient, directories. However, this approach requires DRF semantics and self-invalidation of cache lines at the next synchronization point. Moreover, an additional logarithmic sharer-count field is required per directory entry in order to revert the classification from shared to private. Even though this storage increase may be excessive for directory-based approaches to support classification adaptivity, it is required in order to provide sequential consistency (SC).

Finally, LLC-based classification can employ Cache Decay [40] or a similar technique in order to shorten the cache line generation time according to its access pattern, which ultimately favors more accurate private data detection.

2.3.4 Compiler-assisted Classification

Compiler-assisted approaches [46, 48] exploit representative patterns existing in a wide variety of data-parallel applications to perform a classification into private or shared. In these benchmarks, each thread derives its own set of local variables with thread-dependent values to specify which regions of each array to access. These local variables are called Thread Identifying (TI) variables [46]. The compiler needs to identify these TI variables to determine how each thread accesses different portions of memory. One of the commonly used methods for specifying TI variables is to pass different values to parallel threads as function arguments. Another common way to specify TI variables in multi-threaded applications is through mutex lock directives that protect global variables. This type of code is much more difficult for a compiler to analyze.

15 Chapter 2. Concepts and Background

Table 2.1: Properties of classification schemes

                     A-priori    Read-Only    Adaptive    Accurate
Directory-based
OS-based
Compiler-assisted
TLB-based

Moreover, there are particular programming structures, such as loops and conditionals, that determine the memory access pattern of the threads (i.e., Thread Identifying Structures).

However, compiler-assisted data classification must remain conservative, as these approaches deal with the difficulty of knowing at compile time (i) whether a variable is going to be accessed or not, and (ii) on which cores the data will be scheduled and rescheduled [35]. Furthermore, compiler-assisted approaches are not transparent to software, as they require recoding and/or recompilation of legacy software.

2.3.5 Classification Based on Programming Languages

Finally, some approaches based on the properties of programming languages [69, 70] can precisely identify extended data-race-free (xDRF) regions (i.e., a set of DRF regions acting as one unique region) at compile time. DRF regions may be interleaved with non-DRF (nDRF) regions, but these do not break data-race-free semantics. The OpenMP programming model is employed to discern the sharing status, since it is controlled by dedicated synchronization constructs (e.g., atomic, critical), which are easily identified statically.

Then, the finer-granularity compile-time classification is complemented by an OS-based classification to increase the accuracy with runtime page-level information, alleviating the conservative actions of a static classification. The benefit of combining both classification approaches works both ways, since the static approach resets the classification to private at the boundaries of xDRF regions, allowing the OS-based classification to become adaptive. These static-dynamic classification approaches, despite being very accurate, are not applicable to most existing codes, since the compiler must unequivocally identify xDRF regions.

2.3.6 Summary

Table 2.1 summarizes the main properties of the previously introduced state-of-the-art classification approaches. In the first place, directory-based classification needs to access the LLC or the directory cache (i.e., after traversing the L1 cache) in order to obtain the sharing status of the block. These mechanisms are not applicable to some private-based optimizations, since those optimizations need the classification to be known before missing in the L1 cache.

OS-based mechanisms rely on accesses to the page table in order to detect concurrent accesses from multiple cores, which transition the page classification status to shared. However, when a page ceases to be used by a core, the page table is not updated. Consequently, OS-based classification is not temporality-aware, which implies that: i) shared-to-private transitions are not detected, so a page must be evicted from main memory in order to be reclassified as private; and ii) a page may be misclassified as shared even though it is not currently accessed from two or more cores (e.g., due to a migrating thread). Since data access patterns change throughout different phases of the application lifetime [39, 78], adaptivity is fundamental to achieve good classification accuracy.


Figure 2.3: Example of a translation process (page table walk) in x86-64.

Read-only classification (i.e., detecting non-written regions of data) is also a far-reaching property, as shared read-only blocks alone can represent in some cases up to 48.7% of all accessed blocks [23]. Most data classification mechanisms can detect read-only data at the cost of raising the hardware complexity, increasing the network consumption, or adding storage requirements.

Finally, compiler-assisted approaches need to be conservative, because the private condition of memory accesses cannot be proven at compile time in most cases, which compromises the accuracy of the mechanism. Mechanisms based on programming language properties (i.e., static-dynamic approaches) rely on OpenMP directives to discern block-level, adaptive sharing information at compile time, and on an OS-based classification mechanism to obtain runtime, page-level sharing information. Nonetheless, these mechanisms are only suited for applications in which xDRF regions can be identified, which limits their applicability.

This thesis introduces a new family of TLB-based protocols that aim to reach all desirable properties for a classification mechanism.

2.4 Architecting and Managing TLBs

This section first introduces terms and concepts in the context of TLB management. Then, we discuss some techniques and design choices for architecting TLBs and improving their performance.

Address translation is the process regulating the access to physical memory given a virtual memory (VM) address. Modern memory management units (MMUs) divide the address space into pages, and organize the mappings into a set of logical multi-level hierarchical structures called page tables. Each page table contains one page table entry (PTE) per page, mapping a virtual page number (VPN) to a physical page number (PPN), plus additional bits representing metadata associated with the mapping. Retrieving a page translation (page table walk) requires multiple memory accesses, as page tables comprise at least four hierarchical levels in common 64-bit systems (Figure 2.3). Moreover, the number of accesses required for address translation dramatically grows in systems supporting virtualization (e.g., up to twenty-four memory accesses in the x86-64 virtual address space [13], or fifteen memory accesses for the recent 32-bit ARMv7 virtual address space [2]). Consequently, multiple sequential memory accesses are required to retrieve the translation, which adds to the critical path. To avoid these high-latency memory accesses, low-latency instruction and data TLBs were introduced in order to store, and quickly access, the result of a page table walk (i.e., they cache mappings from the page table, a segment table, or both).
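
The serial nature of the walk can be seen in the following simplified sketch of a four-level x86-64 translation (non-virtualized, 4KB pages); readPhys() is an assumed abstraction representing one memory access per level, and permission and validity checks are omitted.

#include <cstdint>

// One memory access per page-table level (abstraction assumed for the sketch).
uint64_t readPhys(uint64_t physAddr);

// Walks PML4 -> PDPT -> PD -> PT; each level consumes 9 bits of the virtual
// address, and the final 12 bits are the page offset. The accesses are
// inherently sequential, which is why the walk sits on the critical path.
uint64_t translate(uint64_t pageTableRoot, uint64_t vaddr) {
    uint64_t table = pageTableRoot;
    for (int level = 3; level >= 0; --level) {
        uint64_t index = (vaddr >> (12 + 9 * level)) & 0x1FF;
        uint64_t pte   = readPhys(table + index * 8);
        table = pte & ~0xFFFULL;          // next-level table (or the final frame)
    }
    return table | (vaddr & 0xFFF);       // physical frame plus page offset
}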


Figure 2.4: Diagrams for different cache addressing options: (a) physically indexed, physically tagged; (b) virtually indexed, physically tagged; (c) virtually indexed, virtually tagged; (d) physically indexed, virtually tagged.

2.4.1 TLB Overview

This section discusses concepts that offer some background to the challenges involved in the design and optimization of TLBs.

Coverage: the coverage of a TLB is the total amount of memory mapped by all its entries. Naturally, the higher the coverage the better, especially for applications with large working sets.

Superpage: a superpage is a virtual memory page whose size is a power-of-two multiple of the base system page size (e.g., 2MB, 4MB, etc.) and which maps to contiguous physical frames. Superpages improve TLB coverage, and can be stored in the same multi-grain TLB or in a separate, fully associative TLB [58].

TLB shootdown: some events (e.g., page swaps, privilege adjustments, context switches, etc.) cause a change in the virtual-to-physical memory mapping. On such events, TLB entries caching outdated translations need to be invalidated by means of an operation referred to as a TLB shootdown. Current CMPs rely on the OS to discover the set of TLBs holding the translation and perform a synchronization operation through costly Inter-Processor Interrupts (IPIs). The impact of TLB shootdowns increases linearly with the number of cores, which in turn leads to scalability bottlenecks [63].

Cache addressing options: on an L1 cache lookup, the index and the tag can be obtained either from the virtual or from the physical address, resulting in four different combinations (Figure 2.4) of physically/virtually indexed (PI or VI) and physically/virtually tagged (PT or VT):

• PIPT: the cache uses the physical address for both the index and the tag. Although it is a simple design, it is slow, since both the index and the tag are obtained from the TLB, which is in the critical path (Figure 2.4a).

• VIPT: the cache index is taken from the virtual address and the tag is provided from the physical address by the TLB, after a TLB lookup. The VIPT design hides the TLB access latency by allowing the cache set to be consulted in parallel with the TLB lookup (Figure 2.4b). This is the most commonly used design choice (see the index/tag sketch after this list).

• VIVT: the index and the tag are both obtained from the virtual address (also referred to as virtual caches). This scheme allows much faster lookups and energy savings, since the TLB only needs to be consulted after L1 cache misses; therefore, the TLB is in the critical path only for L1 cache misses. Furthermore, a virtual-to-physical translation is required on L1 cache evictions to perform a write-back to the physically addressed L2 cache (Figure 2.4c). This option is not used in practice because of the synonym problem.

• PIVT: the cache index is taken from the physical address after a TLB lookup. Since the index is required before the tag, the TLB is in the critical path to access the L1 (Figure 2.4d). This scheme is often regarded in the literature as useless and is not found in practice [36].
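
The index/tag split referenced in the VIPT bullet can be sketched as follows; the block and set sizes are illustrative (64-byte blocks, 256 sets) and the helper names are assumptions, not part of any particular design.

#include <cstdint>

constexpr int kBlockOffsetBits = 6;   // 64-byte blocks (illustrative)
constexpr int kIndexBits       = 8;   // 256 sets (illustrative)

// The set index is derived from the virtual address, so set selection can
// start before (and overlap with) the TLB lookup.
uint32_t l1SetIndex(uint64_t vaddr) {
    return static_cast<uint32_t>((vaddr >> kBlockOffsetBits) & ((1u << kIndexBits) - 1));
}

// The tag is derived from the physical address, so the comparison waits for
// the PPN delivered by the TLB.
uint64_t l1Tag(uint64_t paddr) {
    return paddr >> (kBlockOffsetBits + kIndexBits);
}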

Synonyms and homonyms: the synonym problem arises when several different virtual addresses map to the same physical address; thus, several cache lines might end up storing data for the same physical address if the cache is virtually tagged. Writing to such locations leads to potential data inconsistency. Similarly, a virtually indexed cache might become inconsistent when the same virtual address (e.g., in different address spaces) maps to distinct physical addresses, which is referred to as the homonym problem.

2.4.2 Boosting TLB Performance

The number and size of TLBs keep growing to effectively address the increasing application memory footprints and constrain the potential performance loss. Additionally, several solutions have been proposed to minimize TLB miss rate and latency, and thus improve TLB performance: tuning TLB size or associativity [20], multiple TLB levels, different page sizes (superpages) [82], or prefetching [75]. However, most of these proposals targeted uniprocessors; TLB design needs to be reevaluated for CMPs.

TLB coherence

TLB shootdowns originate from the TLB coherence problem. The performance impact of TLB shootdowns in CMPs is addressed in some recent works, which propose different hardware-supported TLB coherence schemes:

UNified Instruction/Translation/Data (UNITD) Coherence. Romanescu et al. [63] propose a mechanism by which the system TLBs participate in the cache coherence protocol, alongside data and instruction caches. The observation behind UNITD is that, while cache coherence is performed in hardware, TLB coherence is performed in software. A software solution for TLB coherence is an efficient, low-cost approach well suited to a small number of processors. However, for medium- to large-scale CMPs, moving TLB coherence to hardware reduces the performance impact of TLB shootdowns.

DiDi: A shared TLB directory. Villavieja et al. [84] propose DiDi, a directory-like solution for TLB coherence, which introduces a shared second-level TLB acting as a dictionary directory (DiDi), tracking every address translation stored in the first-level TLBs together with a presence bitmap. DiDi ensures up-to-date information by intercepting all insertions and evictions in the L1 TLBs. DiDi is designed to reduce the impact of TLB shootdowns in large-scale CMP systems by enabling lightweight TLB invalidation. On a TLB shootdown, the core sends an invalidation to DiDi, which forwards it only to those cores storing the translation entry; consequently, only those cores need to interrupt their execution due to the TLB shootdown. Finally, all the translations in DiDi are tagged with a special Address Space IDentifier (ASID) field, since the shared TLB stores translations from different cores, potentially executing in different address spaces. The ASID field in DiDi also helps to avoid problems with virtual address homonyms.
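
The filtering effect of such a TLB directory can be sketched as follows; the map-based structure, the 64-core bound, and the method names are assumptions for illustration and omit the ASID handling.

#include <bitset>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Illustrative DiDi-like directory: one presence bitmap per translation
// cached in any first-level TLB.
struct TlbDirectory {
    std::unordered_map<uint64_t, std::bitset<64>> presence;   // key: VPN

    void onL1TlbInsert(uint64_t vpn, int core) { presence[vpn].set(core); }
    void onL1TlbEvict(uint64_t vpn, int core)  { presence[vpn].reset(core); }

    // On a shootdown, only the cores that actually hold the translation
    // need to be interrupted.
    std::vector<int> shootdownTargets(uint64_t vpn) const {
        std::vector<int> targets;
        auto it = presence.find(vpn);
        if (it == presence.end()) return targets;
        for (int c = 0; c < 64; ++c)
            if (it->second.test(c)) targets.push_back(c);
        return targets;
    }
};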


Figure 2.5: An illustration of (a) conventional TLB (b) CoLT (c) CS-TLB (d) PS-TLB and (e) cTLB. For each approach, the structure of a single entry and a page table with the PTEs that can be exploited is shown.

Exploiting spatial locality

There is a second regime of page allocation where the OS allocates contiguous physical page frames to contiguous virtual pages. Since superpages usually aggregate hundreds of pages (e.g., a 2MB superpage requires 512 contiguous 4KB pages), these OS-assigned contiguous regions usually fall short of superpage sizes. The following hardware techniques are orthogonal to superpaging, and exploit the spatial locality of contiguous PTEs:

Sub-blocking. Talluri and Hill [82] propose two sub-block TLB designs: the complete-subblock TLB (CS-TLB) and the partial-subblock TLB (PS-TLB). CS-TLB (Figure 2.5c) associates a tag with an aggregated superpage-sized region (i.e., contiguous VPNs pointing to different, non-contiguous PPNs), thus increasing TLB coverage. With PS-TLB (Figure 2.5d), all PTEs whose VPNs and PPNs have the same offset from the start of an aligned region are coalesced into a single entry, indicating the matches with a bitmap. A PS-TLB entry is much smaller than a CS-TLB entry; however, it has stronger restrictions for constructing a superpage and requires minor OS changes.

Coalesced Large-reach TLBs (CoLT). Pham et al. [61] propose CoLT (Figure 2.5b) to exploit PTE spatial locality in order to aggregate translations and increase the TLB hit ratio, but without the overheads of superpaging, such as complex algorithms and inflated I/O traffic. Therefore, while a conventional TLB entry corresponds to a single PTE, a CoLT entry maps a region composed of dozens of contiguous, spatially local PTEs. They assume a two-level TLB with set-associative TLBs for the 4KB page size and an additional small, fully associative TLB for superpages, which is accessed in parallel with the L1 TLB. The L2 TLB is inclusive of the L1 TLB, but not of the superpage TLB. Three different variants of the technique are presented, merging many translations into the superpage TLB.
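
The contiguity test that this family of coalesced TLBs depends on can be sketched as follows; the entry layout and function names are illustrative assumptions rather than the CoLT hardware format.

#include <cstdint>

// Illustrative coalesced entry covering 'span' contiguous pages.
struct CoalescedEntry {
    uint64_t baseVpn;
    uint64_t basePpn;
    uint32_t span;
};

// A new PTE can be folded into the entry only if both its virtual and its
// physical page numbers extend the existing contiguous run.
bool canCoalesce(const CoalescedEntry &e, uint64_t vpn, uint64_t ppn) {
    return vpn == e.baseVpn + e.span && ppn == e.basePpn + e.span;
}

// A lookup recovers the translation of any covered page by its offset
// within the run.
bool lookup(const CoalescedEntry &e, uint64_t vpn, uint64_t &ppn) {
    if (vpn < e.baseVpn || vpn >= e.baseVpn + e.span) return false;
    ppn = e.basePpn + (vpn - e.baseVpn);
    return true;
}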

Clustered TLBs. Pham et al. [60] propose Clustered TLBs (Figure 2.5e), a multi-granular TLB architecture that exploits clustered spatial locality, in which a cluster of nearby virtual pages maps to a clustered set of physical pages. Therefore, these clusters may be coalesced into the same TLB entry, providing robustness against memory fragmentation. The set of physical pages can be stored in a single clustered TLB (cTLB) entry. Conversely, entries without any spatial locality can be stored in a traditional TLB. Clustered TLBs do not require OS support and work even under fragmentation.


Inter-core sharing patterns

A TLB miss for a page translation with the same page size, virtual page, physical page, context ID, and protection information as a translation previously accessed from a different core in a CMP is defined as an inter-core shared TLB miss. Parallel applications in CMPs exhibit inter-core TLB miss patterns among threads. Exploiting these patterns might reduce TLB misses, mitigate their impact, and optimize the capacity of the system's TLBs:

Synergistic TLBs. Srikantaiah et al. [81] observed that some applications show a significant miss reduction when increasing the number of TLB entries (i.e., borrowers), whereas for others the miss rate reduction is negligible (i.e., donors). Moreover, duplicate translations in different TLBs waste TLB capacity. To overcome this, Synergistic TLBs were introduced in order to exploit inter-core sharing patterns by providing capacity sharing. Capacity sharing allows storing a victim translation from one TLB in another, which emulates a distributed shared TLB. Synergistic TLBs also support translation migration (from borrower to donor TLBs), maximizing the TLB capacity and boosting both multiprogrammed and multi-threaded workloads, as well as translation replication, reducing the access latency to remote TLBs for multi-threaded workloads on large-scale CMPs. For both, Synergistic TLBs introduce precise and heuristic approaches: i) the precise approach maintains a large amount of access history, and ii) the heuristic approach tracks local and remote hits.

Inter-Core Cooperative TLB prefetchers. Based on the observation that inter-core shared TLB misses are predictable, Bhattacharjee et al. [16] introduce Inter-Core Cooperative TLB prefetchers. These prefetchers assume a leader-follower relationship between TLBs, in order to push page translations into a follower's structure, namely the Prefetch Buffer (PB). The PB is placed in parallel with the TLBs, and accessing it avoids further inter-core shared TLB misses.

Shared Last-level TLB. Bhattacharjee et al. [15] present a CMP with private, per-core L1 TLBs backed by a shared, centralized L2 TLB. Since further levels may be added to the TLB structure, this proposal is referred to as the shared last-level (SLL) TLB. The SLL TLB is accessed after L1 TLB misses and, since it is shared among cores, page translations in the SLL TLB are accessible by all the system's TLBs, which avoids subsequent L1 TLB misses. This way, the SLL TLB naturally exploits inter-core sharing patterns in parallel applications, at the cost of increasing the communication delay compared to a private L2 TLB scheme. The centralized design does not scale well to a large number of cores, since centralized organizations increase end-to-end latencies as the core count grows. Furthermore, centralized TLBs require a high-bandwidth interconnect to be able to communicate with all the cores of a CMP.

To avoid the overhead of strict inclusion, SLL TLB is designed to be mostly inclusive of L1 TLBs. Therefore, on a miss, the translation is stored in both the L1 and the SLL TLB, whereas evictions occur independently for each entry. This approach provides near-perfect (e.g., 97%) inclusion. However, upon a TLB shootdown, both levels need to be checked.
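
A toy sketch of this mostly-inclusive fill policy is shown below; the map-based TLBs, without capacity limits or replacement, are an assumption used only to make the fill/eviction asymmetry explicit.

#include <cstdint>
#include <unordered_map>

using Tlb = std::unordered_map<uint64_t, uint64_t>;   // VPN -> PPN (illustrative)

// On an L1 miss that hits the SLL TLB, only the L1 is refilled; on a miss in
// both levels, the walked translation is installed in both. Each level evicts
// independently, so inclusion is "mostly", not strictly, maintained.
bool lookupTranslation(Tlb &l1, Tlb &sll, uint64_t vpn, uint64_t &ppn,
                       uint64_t (*walkPageTable)(uint64_t)) {
    if (auto it = l1.find(vpn); it != l1.end()) { ppn = it->second; return true; }
    if (auto it = sll.find(vpn); it != sll.end()) {   // hit thanks to another core's earlier miss
        ppn = it->second;
        l1[vpn] = ppn;
        return true;
    }
    ppn = walkPageTable(vpn);
    l1[vpn]  = ppn;
    sll[vpn] = ppn;
    return false;
}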


Chapter 3

Simulation Environment

This chapter describes the simulation infrastructure and workloads used for all the evaluations carried out in this thesis. Experiments have entailed running several representative workloads on a simulation platform.

3.1 Introduction

We evaluate our proposals with full-system simulation using Simics [50] along with the Wisconsin GEMS toolset [54], which enables detailed simulation of multiprocessor systems. CACTI [83] has been employed to obtain the latencies and dynamic consumption of the different memory system components and to configure them appropriately in the simulator. Full-system simulation lets us evaluate the proposed systems running realistic scientific applications on top of actual operating systems. The interconnection network has been modeled using the GARNET simulator [8]. ORION 2.0 [37] is employed to obtain power results for the simulated network, including links, interconnection routers, and other network-on-chip (NoC) components. The relationship between the different simulation tools employed is depicted in Figure 3.1.

Our proposals are evaluated through a wide variety of parallel workloads from several benchmark suites, covering different sharing patterns and sharing degrees. The applications employed to perform the evaluation come mainly from the SPLASH-2 benchmark suite [85], the PARSEC benchmark suite [17], and the ALPBench suite [45]. Moreover, some commercial workloads have also been used [10].

The rest of the chapter is structured as follows: Section 3.2 introduces the simulation tools employed to evaluate the proposals in this thesis. Next, Section 3.3 details the base system parameters modeled by our simulator. Section 3.4 discusses the metrics employed to evaluate our proposals. Finally, Section 3.5 describes the benchmarks and applications used for carrying out the evaluation, which run on top of the simulator.


Figure 3.1: Relationship between the simulation tools employed.

3.2 Simulation Tools

This section describes the simulation tools employed in this thesis, which consist mainly of full-system simulations on top of the Simics-GEMS simulator. Area requirements, as well as cache and TLB latencies, have been calculated with the CACTI tool.

3.2.1 Simics-GEMS

Simics [50] is a full-system multiprocessor simulator that balances simulation accuracy and performance while modeling the complete final application. Simics allows us to evaluate our proposals through realistic workloads on top of real, unmodified operating systems. Full-system simulation captures the timing effects of the applications (e.g., the dynamic change of executed instructions based on different input data), whereas trace-driven simulations remain agnostic to dynamic behaviors.

GEMS (General Execution-driven Multiprocessor Simulator) [54] is a modular simulation infrastructure which leverages the Simics full-system simulator and extends it with timing features, both for the memory system and for the microprocessors. Its modular design provides flexibility through different available modules implemented in C++, although only Ruby has been used to test the proposals in this thesis.

Ruby is a queue-driven module that models the memory hierarchy, including caches, cache controllers, the system interconnect, memory controllers, and banks of main memory. Different coherence protocols may be modeled in Ruby by specifying the behavior of the individual controller state machines through SLICC (Specification Language for Implementing Cache Coherence), a domain-specific language. Each controller is specified as a per-memory-block state machine with states, events, transitions, and actions. The SLICC compiler generates C++ code for the different controllers and, optionally, an HTML protocol specification. The timing is modeled by setting the latencies and modeling the wire delay of network communications. The interconnection network is modeled in detail at the microarchitecture level with the GARNET simulator [8].


Table 3.1: System parameters for the baseline system.

Memory Parameters (GEMS)
  Processor frequency: 2.8GHz
  TLB hierarchy: exclusive
  Split instr & data L1 TLBs: 8 sets, 4-way (32 entries)
  L1 TLB hit time: 1 cycle
  Unified L2 TLB: 128 sets, 4-way (512 entries)
  L2 TLB hit time: 2 cycles
  Page size: 4KB (64 blocks)
  Cache hierarchy: non-inclusive
  Cache block size: 64 bytes
  Split instr & data L1 caches: 64KB, 4-way (256 sets)
  L1 cache hit time: 1 (tag) and 2 (tag+data) cycles
  Shared unified L2 cache: 1MB/tile, 8-way (2048 sets)
  L2 cache hit time: 2 (tag) and 6 (tag+data) cycles
  Directory cache: 256 sets, 4-way (same as L1)
  Directory cache hit time: 1 cycle
  Memory access time: 160 cycles
Network Parameters (GARNET)
  Topology: 2-dimensional mesh (4x4)
  Routing technique: deterministic X-Y
  Flit size: 16 bytes
  Data and control message size: 5 flits and 1 flit
  Routing, switch, and link time: 2, 2, and 2 cycles

3.2.2 GARNET Network Simulator

GARNET is a detailed, cycle-accurate interconnection network model. GARNET models five-stage pipelined routers with virtual-channel flow control. Other microarchitectural details are also modeled, such as flit-level input buffers and the routing logic, among others.

GEMS L1 and L2 cache controllers and memory controllers communicate with each other, and GARNET captures network accesses through a built-in network interface (see Figure 3.1). A message is then broken into network-level units (flits) and routed to the appropriate destination. Response messages behave in a similar manner.

Furthermore, GARNET incorporates the Orion power models [37]. GARNET feeds Orion with the amount of bit switching, the reads and writes on routers, and the arbiter activity, among other statistics. Then, Orion returns the total network consumption, which is the sum of the energy consumed by all components, routers and links.

3.2.3 CACTI

CACTI (Cache Access and Cycle Time Information) [83] is an integrated cache and memory access time, cycle time, area, leakage, and dynamic power model. By integrating all these models together, users can have confidence that trade-offs between time, power, and area are all based on the same assumptions and, hence, are mutually consistent.

CACTI is continually upgraded as new technology is released. In this thesis we employ version 6.5, which is a significantly enhanced version that primarily focuses on interconnect design. All TLB, cache, and directory latencies, power consumption figures, and area requirements for our implementations are obtained through CACTI calculations, assuming a 32nm technology.


Figure 3.2: Baseline tiled CMP architecture with a two-level TLB structure.

3.3 Simulated System

To evaluate our proposals we simulate 2x2, 4x4, and 4x8 arrangements of replicated tiles organized as a tiled CMP (a 4x4 tiled CMP example is shown in Figure 3.2), with two levels of TLB and two levels of cache on chip. The private first-level (L1) TLB and cache are split into instruction and data. A shared, unified second-level (L2) cache is physically distributed among the processing nodes, and processors are in-order and single-issue. Although the mechanisms proposed in this thesis can be applied to caches with any indexing and tagging technique, for the sake of clarity we assume the common virtually indexed, physically tagged (VIPT) L1 caches.

Two different L2 TLB configurations have been evaluated: per-core private L2 TLBs (Figure 3.3a), and a distributed, shared L2 TLB organization (Figure 3.3b). The support for the various TLB organizations has been implemented in the SLICC language. System parameters are shown in Table 3.1.

(a) Private two-level TLB organization. (b) Shared TLB organization. Second level TLB is distributed and shared among cores.

Figure 3.3: Private and shared L2 TLB organizations.

The L2 TLB miss latency considers four memory references to walk the page table, as in the 48-bit x86-64 virtual address space. As previously noted, the cache and TLB latencies and energy consumption have been calculated using the CACTI tool [83], assuming a 32nm process technology. Configurations specific to each proposal are described in the corresponding chapter.


3.4 Metrics

The evaluation in this thesis comprises the study of the system performance, overall traffic consumption, and hardware overhead of our approaches. Performance is measured in terms of the application's runtime (i.e., cycles) during its parallel phase. We employ cycles rather than instructions per cycle (IPC), since IPC is sensitive to the timing effects of multiprocessor workloads in full-system simulations, which makes it unsuitable for measuring the performance of coherence protocols. The instruction path of multi-threaded workloads can vary to a large extent due to the synchronization mechanisms employed by the operating system; for instance, when an application is stalled on a lock, the count of completed instructions increases dramatically without the execution making any progress.

In addition, the network traffic has also been reported. Specifically, the number of flits issued to the network has been accounted for and classified into TLB or cache traffic, discerning:

• Cache Request: number of flits issued as a consequence of coherence request messages.

• Cache Response Control: number of flits issued as a consequence of coherence control responses.

• Cache Response Data: number of flits issued as a consequence of coherence responses that include the data block.

• TLB Request: number of flits issued as a consequence of TLB requests due to the classification mechanism.

• TLB Response Control: number of flits issued as a consequence of TLB control responses due to the classification mechanism.

• TLB Response Translation: number of flits issued as a consequence of TLB responses including the virtual-to-physical translation in order to accelerate the page translation resolution process.

Cache pressure is measured in order to determine the overheads and benefits of our proposed classification scheme. To this end, cache misses are classified into the 5C miss categories: Cold, Capacity, Conflict, Coherence, and Coverage. Coverage misses [67] are those caused by invalidations issued by the directory cache due to its limited capacity, which force the eviction of the block from the private caches. Moreover, we discern Flushing misses [68], which are those caused by invalidated or evicted TLB entries, which in turn induce cache invalidations when employing a TLB-based classification approach.
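
For reference, the miss taxonomy used in the evaluation can be summarized with the following enumeration; the comments paraphrase the definitions above and the type itself is only illustrative.

// Illustrative summary of the miss categories reported in this thesis.
enum class MissType {
    Cold,        // first access to the block
    Capacity,    // evicted because the private cache ran out of capacity
    Conflict,    // evicted because the block's set ran out of ways
    Coherence,   // invalidated by a remote write
    Coverage,    // invalidated by a directory-capacity eviction [67]
    Flushing     // invalidated after a TLB entry eviction/invalidation [68]
};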

We also evaluate the hardware overhead of our proposals. The overhead is reported in terms of area, calculated both as a number of bits and as an area percentage obtained with the CACTI 6.5 tool. A sensitivity study of the area overhead with respect to the scale of the system has also been performed by varying the number of nodes in the CMP.

Furthermore, we measure the amount of data classified as private and shared at page level. The amount of private data is a good general metric for most classification approaches targeting classification-based data optimizations. As classification approaches in the evaluation operate at page granularity, we register every access to every page, and then the percentage of pages falling into each classification category is shown:


• Private: the page spent all of the execution time as private (i.e., accessed from a single thread).

• Reclassified: the page has been shared (i.e., accessed concurrently from two or more threads), and became private again afterwards.

• Shared: the page has been shared and remained in that state for the rest of the execution time.

– Shared-ReadOnly: the page became shared and has never been written. Only for write-detection mechanisms.

– Shared-Written: the page became shared and has been written. Only for write- detection mechanisms.

However, this metric is unfair to adaptive classification mechanisms, where shared pages are frequently reclassified as private. In order to capture the changing nature of data classification, the average percentage of cycles per page spent as private or shared has also been measured.

Finally, the best way to test the benefits of a classification scheme is through a classification-based data optimization. In this thesis we generally employ coherence deactivation (see Section 2.2.3) as our case study. Coherence deactivation greatly benefits from the accuracy of a classification approach. It can equally benefit from a private/shared classification dichotomy and from a private/read-only/shared-written classification scheme. Furthermore, the cost of data reclassification is low, and is largely offset by its potential improvements. However, other data optimizations have also been employed to test the proposals in this thesis. Depending on the data optimization used to test the classification mechanism, different specific metrics are employed to show its particular effects.

All the experimental results reported in this thesis correspond to the parallel phase of each program. Benchmark checkpoints for applications running for some time have been employed to ensure that memory is warmed up, thus avoiding the effects of page faults. Then, each application runs again up to the parallel phase, where each thread is scheduled in a particular core. The application is then run with full detail during the initialization of each thread before starting the actual measurements. In this way, we warm up both TLBs and caches to avoid cold misses.

3.5 Benchmarks

Our proposals have been evaluated through a wide variety of parallel workloads from several benchmark suites, covering different sharing patterns and sharing degrees: Barnes, Cholesky, FFT, Ocean, Radiosity, Raytrace-opt, Volrend, and Water-NSQ from the SPLASH-2 benchmark suite [85]; Tomcatv and Unstructured, two scientific benchmarks; FaceRec, MPGdec, MPGenc, and SpeechRec from the ALPBench suite [45]; Blackscholes, Fluidanimate, Swaptions, and x264 from PARSEC [17]; and, finally, Apache and SPEC-JBB, two commercial workloads [10].

Table 3.2 shows the input size of each application. Due to the slowness of full-system simulations, the input sizes have been appropriately scaled down. For scalability studies (32 cores or more) only a subset of the applications has been considered.


Table 3.2: Benchmarks and input sizes

SPLASH-2 (8)
  Barnes: 8192 bodies, 4 time steps
  Cholesky: input file tk15.O
  FFT: 64K complex doubles
  Ocean: 258 x 258 ocean
  Radiosity: room, -ae 5000.0 -en 0.050 -bf 0.10
  Raytrace-opt: teapot
  Volrend: head
  Water-NSQ: 512 molecules, 4 time steps
Scientific benchmarks (2)
  Tomcatv: 256 points, 5 time steps
  Unstructured: Mesh.2K, 5 time steps
ALPBench (4)
  FaceRec: script
  MPGdec: 525 tens 040.m2v
  MPGenc: output of MPGdec
  SpeechRec: script
PARSEC (4)
  Blackscholes: simmedium
  Fluidanimate: simsmall
  Swaptions: simmedium
  x264: simsmall
Commercial workloads (2)
  Apache: 1000 HTTP transactions
  SPEC-JBB: 1600 transactions

All the reported experimental results correspond to the parallel phase of these benchmarks. The following sections give a brief description of each application.

3.5.1 SPLASH-2

Barnes

The Barnes application simulates the interaction of a system of bodies (galaxies or particles, for example) in three dimensions over a number of time steps, using the Barnes-Hut hierarchical N-body method. Barnes allows multiple particles per leaf cell. The computational domain is represented as an octree with leaves containing information on each body, and internal nodes representing space cells. Most of the time is spent in partial traversals of the octree (one traversal per body) to compute the forces on individual bodies. The communication patterns are dependent on the particle distribution and are quite unstructured. No attempt is made at intelligent distribution of body data in main memory, since this is difficult at page granularity and not very important to performance.


Cholesky

The blocked sparse Cholesky factorization kernel factors a sparse matrix into the product of a lower triangular matrix and its transpose. That is, given a positive definite matrix A, the program finds a lower triangular matrix L such that A = LL^T. Cholesky factorization proceeds in three steps: ordering, symbolic factorization, and numerical factorization. The numerical factorization is very efficient, due to the use of supernodal elimination techniques. Supernodes are sets of columns with nearly identical non-zero structures, and a factor matrix will typically contain a number of often very large supernodes. The primary data structure in this program is the representation of the sparse matrix itself. The data sharing patterns for each supernode are as follows. A supernode may be modified by several processors before it is placed on the task queue. Once this happens, it is read by a single processor and used to modify other supernodes. After it has completed all its modifications to other supernodes, it is no longer referenced by any processor.

The only interactions between processors occur when they attempt to dequeue tasks from the global task queue and when they attempt to perform a number of simultaneous supernodal modifications to the same destination column. Both of these cases are handled with locks.

FFT

The FFT kernel is a complex one-dimensional version of the radix-√n six-step FFT algorithm, which is optimized to minimize interprocessor communication. The data set consists of the n complex data points to be transformed, and another n complex data points referred to as the roots of unity. Both sets of data are organized as √n × √n matrices, partitioned so that every processor is assigned a contiguous set of rows which are allocated in its local memory. Communication occurs in three matrix transpose steps, which require all-to-all interprocessor communication. Every processor transposes a contiguous submatrix of size (√n/p) × (√n/p) from every other processor, and transposes one submatrix locally. The transposes are blocked to exploit cache line reuse. To avoid memory hotspotting, submatrices are communicated in a staggered fashion, with processor i first transposing a submatrix from processor i + 1, then one from processor i + 2, etc.

Ocean

The Ocean application studies large-scale ocean movements based on eddy and boundary currents. The algorithm simulates a cuboidal basin using a discretized circulation model that takes into account wind stress from atmospheric effects and the friction with the ocean floor and walls. The algorithm performs the simulation for many time steps until the eddies and the mean ocean flow attain a mutual balance. The work performed every time step essentially involves setting up and solving a set of spatial partial differential equations. For this purpose, the algorithm discretizes the continuous functions by second-order finite differencing, sets up the resulting difference equations on two-dimensional fixed-size grids representing horizontal cross-sections of the ocean basin, and solves these equations using a red-black Gauss-Seidel multigrid equation solver. Each task performs the computational steps on the section of the grids that it owns, regularly communicating with other processes. Synchronization is performed using both locks and barriers.


Radiosity

This application computes the equilibrium distribution of light in a scene using the iterative hierarchical diffuse radiosity method. A scene is initially modeled as a number of large input polygons. Light transport interactions are computed among these polygons, and polygons are hierarchically subdivided into patches as necessary to improve accuracy. In each step, the algorithm iterates over the current interaction lists of patches, subdivides patches recursively, and modifies interaction lists as necessary. At the end of each step, the patch radiosities are combined via an upward pass through the quadtrees of patches to determine if the overall radiosity has converged. The main data structures represent patches, interactions, interaction lists, the quadtree structures, and a BSP tree which facilitates efficient visibility computation between pairs of polygons. The structure of the computation and the access patterns to data structures are highly irregular. Parallelism is managed by distributed task queues, one per processor, with task stealing for load balancing. No attempt is made at intelligent data distribution.

Raytrace-opt

This application renders a three-dimensional scene using ray tracing. A hierarchical uniform grid (similar to an octree) is used to represent the scene, and early ray termination and antialiasing are implemented, although antialiasing is not used in this study. A ray is traced through each pixel in the image plane, and reflects in unpredictable ways off the objects it strikes. Each contact generates multiple rays, and the recursion results in a ray tree per pixel. The image plane is partitioned among processors in contiguous blocks of pixel groups, and distributed task queues are used with task stealing. The major data structures represent rays, ray trees, the hierarchical uniform grid, task queues, and the primitives that describe the scene. The data access patterns are highly unpredictable in this application. The application has been optimized by removing a lock acquisition for a ray ID which is not used.

Volrend

This application renders a three-dimensional volume using a ray casting technique. The volume is represented as a cube of voxels (volume elements), and an octree data structure is used to traverse the volume quickly. The program renders several frames from changing viewpoints, and early ray termination and adaptive pixel sampling are implemented, although adaptive pixel sampling is not used in this study. A ray is shot through each pixel in every frame, but rays do not reflect. Instead, rays are sampled along their linear paths using interpolation to compute a color for the corresponding pixel. The partitioning and task queues are similar to those in Raytrace. The main data structures are the voxels, octree and pixels. Data accesses are input-dependent and irregular, and no attempt is made at intelligent data distribution.


Water-Nsquared

This application evaluates forces and potentials that occur over time in a system of water molecules. The forces and potentials are computed using an O(n^2) algorithm (hence the name), and a predictor-corrector method is used to integrate the motion of the water molecules over time. At each time step, the processors calculate the interaction of the atoms within each molecule and the interaction of the molecules with one another. For each molecule, the owning processor calculates the interactions with only half of the molecules ahead of it in the array. Since the forces between the molecules are symmetric, each pair-wise interaction between molecules is thus considered only once. A process updates a local copy of the particle accelerations as it computes them, and accumulates into the shared copy once at the end. Most synchronization is done using barriers, although there are also several variables holding global properties that are updated continuously and are protected using locks.

3.5.2 ALPBench

FaceRec

FaceRec uses the CSU face recognizer. Face recognizers recognize images of faces by matching a given input image against images in a given database. This application uses a large database (called the subspace) that consists of multiple images. The objective of the recognition phase is to find the image in the subspace that best matches a given input image (i.e., a single column vector with thousands of rows). A match is determined by taking the distance or difference between two images. In this application, differently from the base CSU face recognizer, a separate input image is compared with each image in the subspace to emulate a typical face recognition scenario (e.g., the face of a subject is searched for in a database).

FaceRec is first trained with a collection of images in order to distinguish faces of different persons. Moreover, there are multiple images that belong to the same person so that the recognizer is able to match face images against different expressions and lighting conditions. Then, the training data is written to a file so that it can be used in the recognition phase. At the start of the recognition phase, the training data and the image database are loaded.

MPEG-2 encoder

MPGenc converts video frames into a compressed bit-stream. A video encoder is an essential component in VCD/DVD/HDTV recording, video editing, and video conferencing applications. Many recent video encoders like MPEG-4/H.264 use similar algorithms. A video sequence consists of a sequence of input images, which are in the YUV format; i.e., one luminance (Y) and two chrominance (U,V) components. Each encoded frame is characterized as an I, P, or B frame. Each frame consists of 16x16 pixel macroblocks. Each macroblock consists of four 8x8 luminance blocks and two 8x8 chrominance blocks, one for U and one for V. I frames are temporal references for P and B frames and are only spatially compressed. On the other hand, P frames are predicted based on I frames, and B frames are predicted based on neighboring I and P frames.


MPEG-2 decoder

MPGdec decompresses a compressed MPEG-2 bit-stream. Video decoders are used in VCD/DVD/HDTV playback, video editing, and video conferencing. Many recent video decoders, like MPEG-4/H.264, use similar algorithms. The major phases of MPGdec include variable length decoding (VLD), inverse quantization, IDCT, and motion compensation. The decoder applies the inverse operations performed by the encoder. First, it performs variable-length Huffman decoding. Second, it inverse quantizes the resulting data. Third, the frequency-domain data is transformed with the IDCT to obtain spatial-domain data. Finally, the resulting blocks are motion-compensated to produce the original pictures.

SpeechRec

SpeechRec converts speech into text. Speech recognizers are used with communication, authentication, and word processing software, and are expected to become a primary component of the human-computer interface. The application has three major phases: feature extraction, Gaussian scoring, and searching the language model / dictionary. First, the feature extraction phase creates 39-element feature vectors from the speech sample. The Gaussian scoring phase then matches these feature vectors against the phonemes in a database. It evaluates each feature vector based on the Gaussian distribution in the acoustic model (Gaussian model) given by the user. In a regular workload, there are usually 6000+ Gaussian models. The goal of the evaluation is to find the best score among all the Gaussian models and to normalize the other scores with the best one found.

3.5.3 PARSEC

Blackscholes

The blackscholes application is an Intel RMS benchmark. It calculates the prices for a portfolio of European options analytically with the Black-Scholes partial differential equation (PDE)

\frac{\partial V}{\partial t} + \frac{1}{2}\sigma^2 S^2 \frac{\partial^2 V}{\partial S^2} + r S \frac{\partial V}{\partial S} - r V = 0

where V is the price of an option on the underlying S with volatility σ at time t, and r is the constant interest rate. There is no closed-form expression for the Black-Scholes equation and as such it must be computed numerically. The program divides the portfolio into a number of work units equal to the number of threads and processes them concurrently. Each thread iterates through all derivatives in its contingent. The input set of choice has 16,384 options.


Fluidanimate

This Intel RMS application uses an extension of the Smoothed Particle Hydrodynamics (SPH) method to simulate an incompressible fluid for interactive animation purposes. Its output can be visualized by detecting and rendering the surface of the fluid. The force density fields are derived directly from the Navier-Stokes equation. Fluidanimate uses special-purpose kernels to increase stability and speed. The scene geometry employed by Fluidanimate is a box in which the fluid resides. All collisions are handled by adding forces in order to change the direction of movement of the involved particles instead of modifying the velocity directly. Every time step, Fluidanimate executes five kernels. The input set includes 35,000 particles and 5 frames.

Swaptions

The swaptions application is an Intel RMS workload which uses the Heath-Jarrow-Morton (HJM) framework to price a portfolio of swaptions. The HJM framework describes how interest rates evolve for risk management and asset liability management for a class of models. Its central insight is that there is an explicit relationship between the drift and volatility parameters of the forward-rate dynamics in a no-arbitrage market. Because HJM models are non-Markovian, the analytic approach of solving the PDE to price a derivative cannot be used. Swaptions therefore employs Monte Carlo (MC) simulation to compute the prices. The program stores the portfolio in the swaptions array. Each entry corresponds to one derivative. Swaptions partitions the array into a number of blocks equal to the number of threads and assigns one block to every thread. The input set comprises 32 swaptions and 10,000 simulations.

x264

The x264 application is an H.264/AVC (Advanced Video Coding) video encoder. It is based on the ITU-T H.264 standard which is now part of ISO/IEC MPEG-4. In that context the standard is also known as MPEG-4 Part 10. H.264 describes the lossy compression of a video stream. H.264 encoders and decoders operate on macroblocks of pixels which have the fixed size of 16 x 16 pixels. Various techniques are used to detect and eliminate data redundancy. The most important one is motion compensation. It is employed to exploit temporal redundancy between successive frames. Motion compensation is usually the most expensive operation that has to be executed to encode a frame. It has a very high impact on the final compression ratio. The videos used for the input sets have been derived from the uncompressed version of the short film Elephants Dream. The number of frames determines the amount of parallelism. In this work we employ 640 x 360 pixel images (1/3 HDTV resolution) and 8 frames.

3.5.4 Other Scientific and Commercial Workloads

Tomcatv

Tomcatv is a 200-line mesh generation program. Tomcatv contains several loop nests that have dependences across the rows of the arrays and other loop nests that have no dependences. Since the base version always parallelizes the outermost parallel loop, each processor accesses a block of array columns in the loop nests with no dependences. However, in the loop nests with row dependences, each processor accesses a block of array rows. As a result, there is little opportunity for data re-use across loop nests. Also, there is poor cache performance in the

row-dependent loop nests because the data accessed by each processor is not contiguous in the shared address space.

Unstructured

Unstructured is a computational fluid dynamics application that uses an unstructured mesh to model a physical structure, such as an airplane wing or body. The mesh is represented by nodes, edges that connect two nodes, and faces that connect three or four nodes. The mesh is static, so its connectivity does not change. The mesh is partitioned spatially among different processors using a recursive coordinate bisection partitioner. The computation contains a series of loops that iterate over nodes, edges and faces. Most communication occurs along the edges and faces of the mesh. Synchronization in this application is accomplished by using barriers and an array of locks.

Apache

Apache is a popular open-source web server used in many internet/intranet environments. In this application, the focus is on static web content serving. Apache uses the Scalable URL Request Generator (SURGE) as the client. SURGE generates a sequence of static URL requests which exhibit representative distributions for document popularity, document sizes, request sizes, temporal and spatial locality, and embedded document count. The application runs 10 SURGE client threads per processor and sets the client think time to zero. The experiments used a repository of 2000 files, with a total size of approximately 50 MB, generated by SURGE using its default parameters. The system was warmed up for 80,000 transactions, and the results were based on runs of 1,600 transactions.

SPEC-JBB

The SPEC-JBB benchmark evaluates the performance of server-side Java by emulating a three-tier client/server system. SPEC-JBB has been developed from the ground up to measure performance based on the latest Java application features, including: both a pure throughput metric and a metric that measures critical throughput; support for multiple run configurations, enabling users to analyze and overcome bottlenecks; and support for virtualization and cloud environments.


Chapter 4

TLB-based Classification Mechanisms

This chapter introduces a mechanism that is capable of classifying data into a private-shared scheme based on the presence of page translations stored in the system TLBs. TLB-based classification avoids some of the major problems encountered in state-of-the-art classification techniques.

4.1 Introduction

The basic mechanism introduced in this chapter represents the starting point for this work.1 The proposed classification scrutinizes the network upon TLB misses in order to discover other TLBs storing the page translation entry (i.e., other processors that are possibly accessing the same page) and discern the sharing status. In addition, this network exploration allows retrieving the page translation beforehand from a sharer, thus accelerating the page translation process.

Checking (and possibly updating) the sharing status upon every TLB miss also allows accounting for temporarily private pages (e.g., as a consequence of thread migration). Since data access patterns change throughout different phases of the application lifetime [39, 78], we claim that a temporal-aware detection mechanism is fundamental to achieve a good accuracy when detecting private pages. Note that the more private data detected, the more benefits can be obtained when the classification scheme is applied to a classification-based data optimization.

Lastly, the sharing information collected after the miss is stored in the TLB, thus avoiding the storage overhead of a directory-like structure [62, 33, 11] or of the page table [43, 22, 31, 71, 47].

1 SnoopingTLB was first proposed by Ros et al. [68], and then extended and re-evaluated in the scope of this thesis.


[TLB entry layout: VA tag (32 bits) plus data fields V (1 bit), PA (32 bits), L (1 bit), and P (1 bit).]

Figure 4.1: L1/L2 TLB entry with the extra fields in gray.

4.2 TLB Miss Resolution through TLB-to-TLB Transfers

First, we describe a simple mechanism that allows cores to retrieve information about the privacy of page and address translations from another core’s TLB instead of the page table. “Walking” the page table, often broken down into several levels, may imply several memory references. This mechanism, although simpler, has some similarities with Synergistic TLBs [81]. This way, upon a TLB miss, getting the page information from a remote TLB is faster compared to performing a complete page table walk.

TLB-to-TLB transfer works as follows. On a TLB miss, a page table walk process is started in order to get the address translation from the page table. Simultaneously, the core snoops the other TLBs in the system by issuing a page info request. When a core receives the page info request, it checks its TLB and, in case of finding the page entry, the translation is sent to the requester by means of a short response message. Since most TLBs employ a Process Identifier (PID) field in their entry formats in order to avoid flushing the TLB on every context switch, we can employ the PID information to adequately distinguish different virtual spaces from different applications when sharing a page translation through the TLB-to-TLB miss resolution mechanism. When the first positive response is received, the page address translation is stored in the TLB and the memory request can proceed, therefore canceling the page table walk. Although upon every TLB miss the described mechanism snoops other TLBs, TLB misses are not frequent, thus keeping traffic overhead and energy consumption low and not jeopardizing its scalability for small- or medium-scale systems. However, broadcast messages are not reliable for larger systems, as they do not scale well with the number of cores.
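As a rough functional illustration of this flow, the following sketch models the broadcast lookup over the other cores' TLBs, falling back to the page table when no sharer holds the translation (the data structures and the identity page-table stand-in are assumptions for exposition only; in hardware the broadcast is overlapped with the page table walk and the first positive response cancels it):

```cpp
#include <cstdint>
#include <unordered_map>
#include <vector>

// Toy functional model of TLB-to-TLB miss resolution (illustrative only).
struct Translation { uint64_t phys_page; int pid; };

using Tlb = std::unordered_map<uint64_t, Translation>;  // virt_page -> entry

// Page table stand-in for the example: identity mapping.
Translation page_table_walk(uint64_t virt_page, int pid) {
    return {virt_page, pid};
}

// On a miss in 'requester', snoop every other TLB; the first entry matching the
// virtual page and the PID supplies the translation (a TLB-to-TLB transfer).
Translation resolve_miss(const std::vector<Tlb>& tlbs, int requester,
                         uint64_t virt_page, int pid) {
    for (int core = 0; core < static_cast<int>(tlbs.size()); ++core) {
        if (core == requester) continue;
        auto it = tlbs[core].find(virt_page);
        if (it != tlbs[core].end() && it->second.pid == pid)
            return it->second;                 // translation provided by a remote TLB
    }
    return page_table_walk(virt_page, pid);    // no sharer found: walk the page table
}
```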

4.3 Snooping TLB-based Private-Shared Classification

Snooping TLB-based classification deals with the fact that data can be requested by multiple cores and stored in their private caches during the application runtime although being actually private (due to thread migration) or not shared at the same time (due to different phases of applications). This data access pattern, where a private data block is replicated in more than one cache as a consequence of the owner migration process, is also known as passive sharing [30, 28]. The proposed detection mechanism (i) is aware of the temporality in memory references and discerns passive sharing patterns, (ii) employs techniques to solve TLB misses from other CMP cores (i.e., TLB-to-TLB transfers), and (iii) does not require extra hardware structures.

As noted in Section 3.3 we assume VIPT caches in the next sections. Nonetheless, our proposal is directly applicable to any other cache where the access to the TLB is required.


Table 4.1: Response messages and state transitions for TLB requests

State (in TLB or MSHR)   Address translation   Response type   Next state
Not Present              NO                    Not Present     Not Present
Requested (- or S)       NO                    Requested       Requested (S)
Present (P or S)         YES                   Present         Present (S)

4.3.1 Classification based on TLB-to-TLB Communication

TLB-based classification stores the sharing information along with the address translation in the TLB entries. Figure 4.1 shows how TLB entries are extended. The Lock (L) bit allows blocking memory accesses to the corresponding page while the sharing status is uncertain (e.g., while a TLB miss request is ongoing). Each TLB entry has a Private (P) bit that indicates whether the page is private or shared. The P bit for a page is set when there is just one core’s TLB caching the translation (i.e., it does not cause a TLB miss when requesting page blocks). When several TLBs hold simultaneously an entry for that page, their P bits are clear.
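For reference, the extended entry can be written as a simple bit-field struct (an illustrative software model, not a hardware description; field widths follow Figure 4.1):

```cpp
#include <cstdint>

// Extended L1/L2 TLB entry from Figure 4.1 (illustrative bit-field model).
struct TlbEntry {
    uint64_t vpn   : 32;  // virtual page number (tag)
    uint64_t valid : 1;   // V: the entry holds a valid translation
    uint64_t ppn   : 32;  // physical page number
    uint64_t lock  : 1;   // L: blocks page accesses while the sharing status is uncertain
    uint64_t priv  : 1;   // P: set while this is the only TLB caching the page
};
```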

On memory references, before accessing the L1 cache, a TLB lookup is performed to get the physical address of the requested block, as depicted in Figure 4.2. Table 4.1 summarizes the possible TLB entry states, the type of the response messages provided by the core’s TLBs to page info requests, and the information included in the responses (e.g., the address translation).

If a TLB miss takes place, a page info request is sent to the other TLBs (Figure 4.2a). The goal of this request is to get both the address translation and the information about the use of the page by the other cores. If any of the receiving cores is presumably going to access the page2, then the TLB entry is marked as shared. Figure 4.2b shows how the TLB in core 2 has requested the page translation when it receives the translation request from core 0. Thus, the page is shared, although the translation is retrieved from the page table. Conversely, Figure 4.2c shows how the TLB in core 3 is holding a valid translation entry, which is provided to the requestor in core 0, resolving the page classification to shared. Otherwise, if no other TLB in the system is holding a valid page translation, the page will be classified as private and the miss will be resolved from the page table (Figure 4.2d).

4.3.2 TLB Coherency for Data Classification

Page-sharing information must be kept coherent in all the cores’ TLBs. To keep this information coherent, we use the transition diagram shown in Figure 4.3. Pages can be in three situations: (i) the page translation is not cached in any TLB; (ii) only one TLB holds the entry, as private; (iii) one or several TLBs hold the entry, all of them as shared.

Figure 4.3 shows the TLB state transitions depending on the incoming requests and responses to guarantee TLB coherence. When a TLB entry is Not Present in a given core and a local TLB request is issued by that core, the entry transitions to the Requested state. On the reception of remote page info requests in this state, the page transitions to the Requested S state. This way, when two or more TLBs send page info requests at the same time, they will answer each other as accessing the page, and TLB coherence is guaranteed since every TLB will see the page as shared. The Requested S state transitions to Present S once all responses have been received, regardless of their content. On the other hand, the Requested state transitions to

2 Techniques for predicting accesses will be discussed in the following chapters.


(a) Broadcast TLB miss request. (b) Other TLB requested the page translation; miss is resolved from the page table.

(c) TLB in core 3 holds the translation, and provides it to the requestor. (d) No other TLB holds a valid translation; miss is resolved from the page table.

Figure 4.2: Different outcomes for TLB-to-TLB requests from C0.

Present S or Present P depending on the responses received from the other TLBs. The TLB state transitions to Present S if any other TLB is currently using the page (e.g., the TLB has the page translation stored), and to Present P otherwise. TLB evictions cause transitions to the Not Present state, but remaining TLBs are not notified about the evictions.
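A compact way to summarize Table 4.1 and Figure 4.3 is the following illustrative state handler (a software model for exposition; names and encoding are assumptions, not the simulator's code):

```cpp
#include <cassert>

// Toy encoding of the TLB-entry states from Figure 4.3 (illustrative only).
enum class TlbState { NotPresent, Requested, RequestedS, PresentP, PresentS };

// Local TLB miss: start requesting the translation.
TlbState on_local_miss(TlbState s) {
    assert(s == TlbState::NotPresent);
    return TlbState::Requested;
}

// Remote page info request observed: both concurrent requesters see each other,
// and a page that was private in one TLB becomes shared (Table 4.1).
TlbState on_remote_request(TlbState s) {
    if (s == TlbState::Requested) return TlbState::RequestedS;  // Table 4.1, row 2
    if (s == TlbState::PresentP)  return TlbState::PresentS;    // Table 4.1, row 3
    return s;
}

// All responses received: settle the final classification.
TlbState on_all_responses(TlbState s, bool some_tlb_uses_page) {
    if (s == TlbState::RequestedS) return TlbState::PresentS;
    if (s == TlbState::Requested)
        return some_tlb_uses_page ? TlbState::PresentS : TlbState::PresentP;
    return s;
}

// TLB eviction: remaining TLBs are not notified.
TlbState on_eviction(TlbState) { return TlbState::NotPresent; }
```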

4.3.3 TLB-cache Inclusion Policy

Since TLBs maintain per-page sharing information that finally will affect the coherence management of memory blocks3, our proposal prevents data blocks from being stored in the core’s L1 cache if the translation of their page is not stored in the TLB, as private pages are not aware of other cache copies. Hence, no copies of the blocks belonging to a private page can exist in other L1 caches. Specifically, when a TLB entry transitions to Not Present, the cached blocks belonging to the corresponding page are evicted from the L1 cache (and written back to the LLC when dirty). This policy would hardly affect system performance, provided that (i) a TLB entry transitions to Not Present when the page has not been accessed for some time, so most page blocks may have been already evicted from the cache, and (ii) it is likely that no block within this page will be accessed in the near future.4

3 This assumption applies to Coherence Deactivation, among others. However, the next section will show specific inclusion policies and actions for different classification-based optimizations. 4 However, as we will see in the next chapters, aggressive usage predictors can increase the cache (and TLB) miss rate.
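The flush triggered by a TLB eviction amounts to walking the page at block granularity, as in this minimal sketch (a toy software model of the L1, assuming 4 KB pages and 64 B blocks; not the actual cache controller logic):

```cpp
#include <cstdint>
#include <unordered_map>

// Toy cache model: physical block address -> dirty bit (illustrative only).
using BlockMap = std::unordered_map<uint64_t, bool>;

constexpr uint64_t kPageSize  = 4096;  // 4 KB pages, as assumed in the text
constexpr uint64_t kBlockSize = 64;    // assumed cache block size

// TLB-cache inclusion: when a TLB entry transitions to Not Present, every block
// of the page is flushed from the L1; dirty blocks are written back to the LLC.
void flush_page_from_l1(BlockMap& l1, BlockMap& llc, uint64_t page_base) {
    for (uint64_t addr = page_base; addr < page_base + kPageSize; addr += kBlockSize) {
        auto it = l1.find(addr);
        if (it == l1.end()) continue;      // block not present in the L1
        if (it->second) llc[addr] = true;  // write back dirty data to the LLC
        l1.erase(it);                      // invalidate the L1 copy
    }
}
```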


Figure 4.3: TLB state transition diagram

When evicting the blocks from the cache, the access to the corresponding TLB entry is blocked. To do that, each TLB entry includes a Lock (L) bit (Figure 4.1).

4.3.4 Actions Required upon Classification Changes

Our temporal-aware page classification mechanism can be applied to perform different classification-based optimizations (see Section 2.2 for more details). Depending on the optimization, when a page classified as private becomes shared, it may be required to trigger a recovery procedure to restore the coherence state of the blocks belonging to the page. This recovery procedure depends on the optimization for which the private-shared classification is being applied. Also, the TLB-cache inclusion policy may vary depending on the purpose of the private-shared classification. Basically, it is necessary to perform the L1 flushing and the action required to restore the “normal” status of the lines, which usually matches the recovery action. These actions are summarized in Table 4.2. In this case, Reactive NUCA requires LLC flushing on every transition to shared, as the address mapping varies with the classification. Conversely, Subspace Snooping requires no action when promoting the classification to shared.

Table 4.2: Actions due to TLB-cache inclusion and recovery

Application scenario       TLB-cache inclusion      Recovery (P → S)
Coherence Deactivation     L1 flushing              L1 flushing
Reactive NUCA              L1 and LLC flushing      LLC flushing
Subspace Snooping          L1 flushing              No action
VIPS                       L1 flushing              L1 flushing


4.3.5 Multilevel Private TLBs

The TLB hierarchy of contemporary architectures includes split data and instruction private L1 TLBs and unified private L2 TLBs. Among the most common architectures adopting this TLB hierarchy we find AMD’s K7, K8, and K10, Intel’s i7, ARMv7 and ARMv8, or the HAL SPARC64-III [1, 3, 5, 6, 4]. So far, our proposal has assumed single-level TLBs. This section explains the implications of implementing a Snooping TLB-based Classification approach in systems with a private, multilevel TLB organization.

Main considerations

Differently from a single-level TLB classification scheme, in a multilevel TLB hierarchy an L1 TLB miss does not trigger a TLB-to-TLB request. Instead, the private L2 TLB is consulted. On an L2 TLB hit, the page translation is obtained and cached in the L1 TLB, thus resolving the L1 miss. On an L2 TLB miss, the TLB-to-TLB request is initiated along with the access to the page table.

TLB-to-TLB requests look up both the L1 and the L2 TLB in search of in-use TLB entries. Both lookups can be performed in parallel. If there is a hit in any of the TLBs, they respond with the page translation to the requester.

Implementation details

This section discusses the key implementation choices made for the proposed TLB-based classification scheme considering private two-level TLB hierarchies.

TLB inclusion policy. We employ an exclusive policy between the L1 and the L2 TLBs, since it maximizes the TLB capacity. Note that an increase in the number of TLB misses can dramatically degrade system performance. The downside of keeping exclusive TLBs is that, upon the reception of a TLB-to-TLB request, both TLB levels have to be accessed. We opt for performing this operation in parallel, thus incurring only the access latency of the L2 TLB, but at the cost of increasing the energy consumed when hitting in the L1 TLB. Since TLB-to-TLB requests are not frequent, we believe that this is the most appropriate design choice.

TLB consistency. The TLB hierarchy is shootdown-aware. As both TLB levels implement an exclusive policy, they can be checked in parallel when looking for the translation entry to be invalidated. Furthermore, this approach avoids incurring wrong (outdated) page classification, by flushing both the TLB and the cache.

4.4 TLB-based Classification with Distributed Shared TLBs

Current application memory footprints demand deeper TLB hierarchies. Using a centralized shared last-level TLB, as an alternative to a purely private TLB organization, provides a high performance improvement, especially on parallel applications [15]. However, prior work proposes a centralized TLB structure, which is not a scalable design.

In this section, we exploit the use of a distributed shared L2 TLB similar to the NUCA cache organization [42], with the aim of supporting data classification while avoiding some of the overheads associated with the snooping TLB-based approach. Such a distributed organization

has been previously suggested [47]; however, it has not been extensively explored. The baseline CMP considered in this work includes per-core L1/L2 TLBs, a memory management unit (MMU), L1/L2 caches, and a directory cache (Figure 3.2).

(a) L1 TLB request missing in the shared L2 TLB. (b) L1 TLB request hitting in the shared L2 TLB.

Figure 4.4: Shared L2 TLB basic working scheme.

Figure 3.3 depicted how different TLB hierarchies are logically connected for both private (Figure 3.3a) and distributed shared (Figure 3.3b) last-level (L2) TLBs. The interconnection network assumed is a mesh topology. More details on the system architecture are described in Chapter 3. Figure 4.4 shows how a translation request from an L1 TLB misses in the L2 TLB (Figure 4.4a), accessing the page table in step 1. Then, when the translation is resolved in step 2, the shared L2 TLB stores a copy and responds to the requester TLB. Finally, a translation request for the same page from another TLB (or the same TLB after its eviction) is resolved by the shared L2 TLB (Figure 4.4b), thus exploiting inter-sharing access patterns. This way, by employing a shared L2 TLB, the classification is naturally obtained through unicast messages to the corresponding L2 TLB bank, issued upon every L1 TLB miss. The sharing status is stored in the TLB structure and accessible to all requesting cores through the network.

4.4.1 DirectoryTLB: Shared TLB for Data Classification

In this section we leverage a shared L2 TLB in order to track the sharing information, thus naturally classifying memory accesses into private or shared at page granularity. To this end, a counter (namely the sharers counter) is associated with each second-level entry, recording up-to-date information about potential page sharers. We refer to this proposal as DirectoryTLB, as the shared TLB level acts as a directory for classification information, although it only tracks the number of page sharers, not the core identifiers. Therefore, DirectoryTLB scales better than a cache directory used for coherence purposes.

Figure 4.5 outlines the actions required by the proposed classification scheme in order to resolve memory operations through the TLB hierarchy, including recovery operation explained in Section 4.4.2.

Upon a memory access on a given core, prior to accessing the cache hierarchy, a TLB lookup is performed, looking for the virtual-to-physical address translation. On an L1 TLB hit, the sharing status information for that page is retrieved. Otherwise, on an L1 TLB miss, the L2 TLB is consequently accessed. Requesting the translation from the L2 TLB increases the sharers counter. Therefore, whether there is a miss in the L2 TLB (and thus the page table in main memory has to be accessed), or there is a hit and no other L1 TLB is currently holding the page (i.e., there is no other potential sharer), the page ends up with a single sharer, and is thus marked as private. Otherwise, if the page had one or more sharers upon the reception of an L1 TLB miss request, it is marked as shared. In either case, the sharing status is specified


Figure 4.5: Block diagram of the general working scheme under a memory operation with DirectoryTLB.

alongside the response message containing the virtual-to-physical page translation to the upper TLB hierarchy level. The translation is finally stored in both TLB levels.

When the L2 TLB suffers an eviction, the sharing status is lost. In this case, in order to avoid classification inconsistencies, all L1 TLB entries holding the related page must be evicted (invalidating the corresponding L1 cache blocks due to the TLB-cache inclusion policy). Therefore, L1 and L2 TLBs implement an inclusive policy between them in order to ensure classification consistency.

Shared TLB entries require some extensions to support classification, as illustrated in Figure 4.6. A Private (P) bit specifies whether the page is private (bit set) or shared (bit clear). A sharers counter field tracks the number of current sharers of a page, allowing shared-to-private page reclassification. The sharers field is updated after every miss or eviction on L1 TLBs. This implies that L1 TLB evictions must be notified to the L2 TLB. This is essential to accurately unveil reclassification opportunities when the page ceases to have sharers. Since TLB misses and evictions are not frequent, these notifications do not noticeably increase network traffic. The page sharing status is updated in the L2 TLB according to the sharers counter. Finally, a keeper field contains the identity of the holder of a private TLB entry. The keeper field helps to avoid broadcasts when updating the sharing state in the private TLBs when a transition to shared occurs. The keeper is updated every time the L2 TLB receives a

[L1 TLB entry: VA tag (32 bits), V (1 bit), PA (32 bits), L (1 bit), P (1 bit). L2 TLB entry: VA tag (32 bits), V (1 bit), PA (32 bits), P (1 bit), keeper (log2(n) bits), sharers (log2(n)+1 bits).]

Figure 4.6: DirectoryTLB entries, with the extra field required for classification in gray.

request for a page and the current number of sharers is zero. In this case, the requester core’s TLB becomes the new keeper.
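The handling at the home L2 TLB bank can be summarized with the following sketch (a toy software model under the assumptions above; structure and function names are illustrative, not the simulator's code):

```cpp
#include <cstdint>
#include <unordered_map>

// Toy model of a DirectoryTLB (shared L2 TLB) bank entry; illustrative only.
struct L2TlbEntry {
    uint64_t ppn     = 0;     // physical page number
    int      sharers = 0;     // number of L1 TLBs currently holding the page
    int      keeper  = -1;    // core id of the single holder while the page is private
    bool     priv    = true;  // P bit
};

using L2TlbBank = std::unordered_map<uint64_t, L2TlbEntry>;  // vpn -> entry

// Handle an L1 TLB miss request arriving at the home L2 bank. Returns the
// classification the requester must install (true = private, false = shared).
bool on_l1_miss(L2TlbBank& bank, uint64_t vpn, uint64_t ppn_from_walk, int requester) {
    auto [it, inserted] = bank.try_emplace(vpn);
    L2TlbEntry& e = it->second;
    if (inserted) e.ppn = ppn_from_walk;  // L2 miss: translation obtained from the page table

    if (e.sharers == 0) {                 // first (or only) potential sharer: private
        e.keeper = requester;
        e.priv   = true;
    } else if (e.priv) {                  // a second sharer arrives: private -> shared
        e.priv = false;                   // a recovery request would be sent to e.keeper here
    }
    e.sharers++;
    return e.priv;
}

// L1 TLB evictions are notified to the home bank so the sharers count stays
// up to date, exposing shared-to-private reclassification opportunities.
void on_l1_eviction(L2TlbBank& bank, uint64_t vpn) {
    auto it = bank.find(vpn);
    if (it != bank.end() && it->second.sharers > 0) it->second.sharers--;
}
```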

The total storage required by this information is 1 + log2(N) bits for the sharers field (counting from 0 to N sharers, both inclusive), log2(N) bits for the keeper field, and one bit for P, where N is the number of cores in the system. This adds up to 2 + 2·log2(N) bits in total. Since the TLB entry data field often contains some unused bits [12] that are reserved, hardware overhead may be avoided by taking advantage of them. Anyhow, the extra information in the shared second-level TLB entry format requires only 10 bits assuming a 16-core CMP, or 22 bits for a 1024-core CMP. Therefore, for a TLB to support our proposed classification approach, the area overhead increases logarithmically with the system size, representing ∼14% or ∼23% of the L2 TLB area for a 16-core or a 1024-core CMP, respectively, according to CACTI.
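Expanding this count for the two system sizes quoted above:

\[
\text{bits}(N) = \underbrace{1}_{P} + \underbrace{\log_2 N}_{\text{keeper}} + \underbrace{\log_2 N + 1}_{\text{sharers}} = 2 + 2\log_2 N,
\qquad \text{bits}(16) = 2 + 2\cdot 4 = 10, \quad \text{bits}(1024) = 2 + 2\cdot 10 = 22.
\]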

4.4.2 Classification Transitions with DirectoryTLB

This section reviews the actions required upon classification changes with our DirectoryTLB classification scheme.

When a given private page transitions to a shared state, P bits in the upper TLB level have to be updated, and some actions might be required depending on the data optimization, thus initiating a recovery operation (see Figure 4.5). To this end, a special recovery request is issued to the current page keeper. If an L1 TLB receives a recovery request, it performs the required actions. Then, the sharing status in the L1 TLB is securely set to shared and a recovery response is sent to the corresponding L2 TLB bank. Upon the reception of the recovery response, the page sharing status is updated to shared in the L2 TLB.

Figure 4.7: Coherence recovery mechanism resolved to Private. Page A in the keeper (C0) is evicted prior to receiving the Recovery message and thus the Recovery is resolved to Private and the keeper is updated.

During a recovery process, a race may occur if the keeper evicts its TLB entry due to a conflict. Therefore, if the recovery request misses on the keeper’s L1 TLB, a special recovery response is sent back with no further actions required (Figure 4.7). Then, the requester becomes the new keeper and the page remains as private.
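The recovery handshake, including the keeper-eviction race, can be sketched as follows (a minimal software model under the same assumptions as the previous sketch; only the fields needed here are kept):

```cpp
#include <cstdint>
#include <unordered_map>

// Stripped-down L2 entry with only the fields used by the recovery handshake.
struct RecoveryEntry { int keeper = -1; bool priv = true; };

// The keeper's private TLB, modelled as vpn -> P bit of its local entry.
using PrivateTlb = std::unordered_map<uint64_t, bool>;

// Returns true if the page ends up shared, false if it stays private.
bool recover(RecoveryEntry& e, PrivateTlb& keeper_tlb, uint64_t vpn, int requester) {
    auto it = keeper_tlb.find(vpn);
    if (it == keeper_tlb.end()) {
        // Race: the keeper evicted its entry before the recovery request arrived.
        // A special response is returned, the requester becomes the new keeper,
        // and the page remains private.
        e.keeper = requester;
        return false;
    }
    // Normal case: the keeper performs the optimization-specific actions,
    // clears its P bit, and answers the home L2 TLB bank.
    it->second = false;   // keeper's copy is now marked shared
    e.priv     = false;
    return true;
}
```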



Figure 4.8: Distribution of TLB misses resolved by either other TLBs or the page table

4.5 Experimental Results

This section analyzes the proposed TLB-based classification schemes for different TLB structures. First, we show how the proposed TLB-to-TLB miss resolution mechanism behaves, and compare it with a shared last-level TLB structure. In order to test the virtues of the TLB-based classification mechanism, we compare it with an OS-based classification approach [22]. Finally, we employ Coherence Deactivation as our study case to show the benefits of the classification scheme. The system configuration and benchmarks are detailed in Chapter 3; any specific configurations are detailed where used.

4.5.1 Inter-core Sharing Patterns: Fast TLB Miss Resolution

This section compares the behavior of different TLB configurations, focusing on how they exploit inter-core sharing patterns, prior to data classification.

TLB-to-TLB transfers

This section analyzes the benefits and overheads of the TLB-to-TLB miss resolution mechanism by means of a sensitivity study of single level TLB structures, with sizes ranging from 256 sets to 64 sets (all of them are 4-way associative).

TLB-to-TLB miss resolution avoids accessing the page table. Figure 4.8 shows the number of TLB misses normalized to those caused when using a 256-set TLB. Each bar shows the distribution of misses resolved by another TLB or the page table. We can see that the number of TLB misses increases for smaller TLBs. The number of TLB-to-TLB resolutions also increases, thus making its percentage almost independent of the TLB size. Particularly, 61.7%, 64.8%, and 64.8% of misses are resolved from a neighbor TLB for 256-, 128-, and 64-set TLBs, respectively.

The TLB-to-TLB miss resolution saves accesses to the page table and, therefore, reduces the TLB miss latency. This reduction translates into the improvements in execution time shown in Figure 4.9. Each bar shows the reduction in the number of cycles when compared to a configuration with the same TLB size but that does not implement TLB-to-TLB transfers. As we can observe, smaller TLBs imply more misses and, consequently, larger improvements in execution time when using the TLB-to-TLB transfer mechanism. On average, execution time is reduced by 6.5%, 11.9%, and 26.4% for 256-, 128-, and 64-set TLBs, respectively.



Figure 4.9: Improvements in execution time when using TLB-to-TLB transfers


Figure 4.10: Variations in network traffic when using TLB-to-TLB transfers

On the other hand, the network traffic can increase due to the extra page info requests issued. Figure 4.10 shows the normalized network traffic. Since TLB misses are not frequent, the increase in traffic is low. Only SPEC-JBB and Apache (and only for small TLB sizes) reach an overhead of 20%. This is because commercial applications have a high TLB miss rate relative to their cache miss rate. Overall, the overhead in traffic of the TLB resolution mechanism is just 2.9%, 3.4%, and 5.2% for 256-, 128-, and 64-set TLBs, respectively.

TLB-to-TLB Transfers against Shared TLB

Now we compare: (i) a system with a single-level TLB with TLB-to-TLB transfers used to accelerate TLB misses (TLB); (ii) a system with per-core private L2 TLBs and TLB-to-TLB transfers to accelerate L2 TLB misses (P2TLB); and (iii) a system with private L1 TLBs and distributed shared L2 TLBs (S2TLB). The size and associativity of the TLBs in this section are those of the baseline system (Chapter 3).

Figure 4.11 shows the number of accesses to the page table per kilo instructions, i.e., TLB misses that are resolved in the page table (TLB-MPKI). Notice that the y axis is plotted on a logarithmic scale in order to discern the different magnitudes of each application.



Figure 4.11: TLB misses ending up as page table accesses per 1000 instructions. No classification.


Figure 4.12: Execution time normalized to baseline without classification support.

When the fast TLB-to-TLB transfer miss resolution mechanism is implemented, TLB misses are only resolved by accessing the page table if no other TLB is currently holding the translation. As can be expected, including a second-level TLB (P2TLB or S2TLB) reduces the average number of accesses to the page table compared to a single-level TLB structure (TLB), even with fast TLB miss resolution through broadcast TLB transfers. However, in some cases, S2TLB can be observed effectively exploiting the size of the L2 TLB over the private TLBs approach. For instance, SpeechRec or Apache, among others, reduce the number of accesses to the page table to a greater extent (up to 91% fewer TLB misses with Apache), showing how a shared TLB configuration extraordinarily helps to prevent redundant page translation copies. Finally, the number of TLB-MPKI is remarkably low, less than 0.07 misses on average using a two-level TLB structure. Differently, using a single-level TLB keeps TLB-MPKI slightly over 0.15 misses on average, despite implementing TLB-to-TLB transfers.

Accessing the page table in main memory after a TLB miss implies performing an expensive page walk operation. Therefore, when the page table access is avoided, execution time is improved. Figure 4.12 shows how S2TLB slightly improves execution time, by only 1% on average compared to P2TLB with the same total size, and by up to 4.4% compared to the single-level TLB scheme. Despite the fact that the shared L2 TLB reduces the number of accesses to the page table, the improvement is comparatively low as a result of the small absolute number of TLB misses reported in Figure 4.11. In the case of SPEC-JBB or Apache, where the MPKI reported is greater, the improvement over the TLB scheme is more noticeable. Differently, in some benchmarks, such as Barnes or SPEC-JBB, a private L2 TLB slightly outperforms a distributed shared L2 TLB scheme.



Figure 4.13: Normalized traffic attributable to the TLB communication.

This performance shrinkage occurs on account of the additional latency of accessing a shared L2 TLB, which needs to traverse the network in order to reach the home TLB slice. Therefore, on benchmarks with a high L2 TLB hit ratio or that access a large amount of private data, the greater access latency may hurt system performance, outweighing the potential benefits of exploiting inter-core sharing patterns. On the contrary, as FFT has more accesses to shared pages, S2TLB improves its execution time.

Finally, TLB-to-TLB transfers significantly increase TLB traffic, since every TLB miss induces a broadcast message and many responses, potentially including many replicated translations. Figure 4.13 shows all network flits (the flow control units into which network packets are divided — see Table 3.1) transmitted across the network, normalized to TLB. Essentially, P2TLB reduces the TLB traffic by 48.0% compared to TLB, as it first relies on the L2 TLB to resolve L1 TLB misses. Therefore, the broadcast TLB miss resolution mechanism is only invoked after an L2 TLB miss. In contrast, L1 TLB misses with a shared L2 TLB are resolved through unicast messages. As a consequence, while a shared TLB structure offers similar benefits to those of a system with TLB transfers, TLB network traffic with S2TLB is greatly reduced, down to barely 6.5% of that of TLB, reducing network consumption and benefiting system scalability.

4.5.2 TLB-based Classification

So far, different TLB hierarchies have been analyzed without any classification scheme whatsoever. This section analyzes how temporal-aware TLB-based classification techniques work assuming systems with two-level TLBs, and evaluates their potential when applied to coherence deactivation for private data. First, we use Snooping TLB-based classification for private TLB structures; then, the alternative TLB-based scheme proposed for distributed shared L2 TLBs.



Figure 4.14: Private/Shared page classification with private TLBs.

Snooping TLB-based Approach

This section analyzes how the TLB-based classification scheme for multilevel TLB structures explained in Section 4.3.5 behaves (i.e., P2TLB in prior studies). The benefits of the classification scheme are tested by deactivating coherence maintenance.

Classification accuracy. A good first, general metric to determine the effectiveness of TLB-based classification is the amount of private data detected, provided it does not allow false private classifications (i.e., when a page is classified as private it cannot be simultaneously stored in more than one private cache). Figure 4.14 shows the amount of pages classified as private or shared by different classification mechanisms. OS is a non-adaptive OS-based classification mechanism. SnoopingTLB is our proposed TLB-based classification approach for multilevel private TLB structures. The characterization is extended to discern the amount of shared pages that are reclassified to private at least once (Reclassified; see Section 3.4). By differentiating reclassified data we offer more insight into the potential benefits of a temporal-aware classification. Note that for a page to be considered private in the figure, it must remain so for the entire execution time.

Particularly, OS only classifies 44.3% of pages as private on average, and once they become shared they remain in that classification status. Differently, SnoopingTLB classifies 61.8% of all accessed pages as Private (17.5% more than OS), and 13.4% of all shared pages are Reclassified to private at least once, proving the benefits obtained from an adaptive classification scheme. It is important to notice that thread migration is rare in the evaluated applications due to their short execution times. Furthermore, since the classification depends on the presence of a page translation in the system TLBs, the accuracy of SnoopingTLB depends on the size and associativity of the TLB structure.

Case Study: Coherence Deactivation. This section evaluates the impact of different classification approaches when applied to a classification-based optimization such as the coherence deactivation scheme. The baseline system for this study neither deactivates coherence nor uses TLB-to-TLB transfers.

The main interest of coherence deactivation is to alleviate the directory storage requirements: the more private data detected, the fewer directory entries are required. Figure 4.15 shows the average number of directory entries used per cycle normalized to the baseline system (Base). The OS-based classification (OS) avoids the storage of 27.7% of the entries in the directory cache. Additionally, our TLB-based classification approach for private TLB structures (SnoopingTLB) avoids 42.2% of the directory entries, approximately halving the directory storage requirements.



Figure 4.15: Average directory entries required per cycle normalized to baseline.


Figure 4.16: Normalized data L1 misses classified by their cause.

Reducing directory storage requirements reduces the misses induced in the private cache due to directory replacements. Figure 4.16 shows data misses classified by the cause of the miss. 3C misses correspond to the classical compulsory, capacity and conflict misses; Coherence corresponds to misses produced by the coherence protocol; Coverage misses correspond to those misses induced by the inclusion policy between the directory cache and the private cache due to directory replacements; finally, Flushing misses are those originated by the block invalidations triggered by the classification mechanism, either due to recoveries on sharing status transitions or due to the TLB-cache inclusion policy on TLB-based approaches. As can be seen, nearly 1 out of 3 data cache misses with Base are misses produced due to the directory size constraints. Conversely, OS is very effective at reducing Coverage misses when applied to the coherence deactivation mechanism. Specifically, OS avoids on average 72.2% of all Coverage misses. In other words, Coverage misses represent only 11.8% of all cache misses with OS and coherence deactivation. Finally, SnoopingTLB further reduces Coverage misses to just 7.4% of all data cache misses. Furthermore, as expected, Flushing misses with SnoopingTLB represent less than 1% of all misses on average, since for a TLB entry to be evicted it must not have been accessed for a while. Therefore, most blocks have probably been evicted already and are not going to be accessed again soon.



Figure 4.17: Network flits injected, classified into cache or TLB-traffic.


Figure 4.18: Execution time normalized to baseline when applied to coherence deactivation.

Furthermore, reducing cache misses diminishes network usage. Figure 4.17 depicts the total flits injected into the network, differentiating flits originated by the cache (i.e., coherence traffic) and the TLB (i.e., classification traffic) structures. OS classification reduces the cache traffic by 29.3%, on average, when deactivating coherence for private data. Similarly, SnoopingTLB reduces the total network flits injected by 29.2% on average. SnoopingTLB introduces traffic overhead due to TLB-to-TLB messages and TLB responses including the translation entry. However, TLB traffic represents less than 8.6% of the total network traffic, since the TLB miss rate is generally low. As a result, despite the fact that cache traffic is reduced slightly more compared to OS, due to a better private data detection, total traffic is similar, on average, for both classification approaches.

Finally, reducing pressure on the directory and the cache has an impact on system performance, as shown in Figure 4.18, which represents the execution time normalized to Base. It can be seen that OS improves the applications’ execution time by 7.8% on average. Nonetheless, SnoopingTLB improves the execution time by up to 16.8% on average compared to the baseline, nearly 9% better performance than OS. This improvement comes from two main causes: (i) the TLB-to-TLB fast miss resolution mechanism, which avoids consulting the page table in main memory, and (ii) better classification accuracy, especially due to its classification adaptivity. As an illustration, applications with a higher amount of shared data (e.g., FFT) perform better with SnoopingTLB, as more TLB misses are resolved through on-chip network communication. However, the impact of adaptivity is limited due to the relatively large size of the directory and the absence of thread migration in the simulations. SnoopingTLB removes more than 40% of directory entries, and thus the directory ceases to be a performance bottleneck.



Figure 4.19: Average normalized execution time depending on the directory size.


(a) Normalized runtime. (b) Total flits injected.

Figure 4.20: Scalability analysis of TLB-based classification approaches.

As a consequence, we may be interested in reducing the directory cache size to make it more scalable and fast. Reducing the directory size hurts execution time for all configurations. However, coherence deactivation might alleviate the performance shrinkage. Figure 4.19 shows the average execution time normalized to the baseline configuration and how it evolves with different directory cache sizes. The smaller the directory is, the faster the system performance drops. Using a 16-set directory, Base configuration nearly doubles its execution time compared to a 256-set directory. Conversely, both OS and SnoopingTLB lessen the performance penalty on smaller directory sizes. Specifically, SnoopingTLB reduces the average execution time with a 16-set directory by 30% compared to Base.

Scalability study. This section shows how the TLB-based classification mechanisms for private (SnoopingTLB) and shared (DirectoryTLB) TLB structures scale when deactivating coherence maintenance. Due to the slow pace of the simulation tools, this study is only performed using the SPLASH-2 benchmarks and scientific applications. All the results in Figure 4.20 are normalized to a system of the same core count with a purely private TLB structure that does not perform data classification; therefore, coherence maintenance is not deactivated.


Figure 4.20a shows how our adaptive TLB-based classification approach improves execution time for both private and shared TLB structures when applied to coherence deactivation. Specifically, SnoopingTLB reduces the execution time by 23.7% on a 32-core CMP. The alternate classification approach for shared last-level TLBs (i.e., DirectoryTLB) behaves similarly, reducing the average application’s execution time by 25.5% for a 32-core system.

The snooping TLB-based page classification mechanism proposed in this thesis aims for small- or medium-scale systems. As can be seen in Figure 4.20b, broadcast messages issued after every TLB miss do not scale well for larger systems. Particularly, SnoopingTLB generates nearly 9% more traffic than DirectoryTLB for 32 cores, which would ultimately offset coherence deactivation gains for larger systems. Conversely, leveraging shared TLB structures completely avoids broadcast requests, issuing unicast messages after missing on the first TLB level. As a consequence, DirectoryTLB mitigates to a large extent the network traffic increase for a 32-core system.

4.6 Discussion

This section discusses some architectural alternatives and different design options, and how they may affect our TLB-based classification approach.

4.6.1 Large or Multilevel Private Caches

Due to the TLB-L1 inclusion policy, after every TLB eviction, the blocks pertaining to the page are flushed from the private cache. The number of lines visited within the cache is not related to the cache size but to the page size (64 cache lookups for 4KB pages). However, the number of blocks effectively invalidated upon a flush operation may increase with the cache size since more blocks belonging to an evicted page may be stored in a larger private cache, ultimately increasing cache misses. In this scenario, the TLB should scale according to the cache size or use a multilevel TLB approach in order to dramatically reduce the total number of flush operations performed.
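For instance, under the usual assumption of 64-byte cache blocks (consistent with the 64 lookups quoted above), the number of lookups per flush is fixed by the page geometry rather than the cache size:

\[
\text{lookups per flush} = \frac{\text{page size}}{\text{block size}} = \frac{4096\,\mathrm{B}}{64\,\mathrm{B}} = 64.
\]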

Additionally, it is also applicable to configurations with two or more levels of private caches. In this case, performing the page flushing requires the invalidation of the corresponding page blocks at every private cache in the hierarchy. This action can be performed in parallel. Some techniques, like the Direct-to-Data (D2D) cache [76] or a Split Cache Hierarchy [77], can help reduce the number of cache searches to be performed.

4.6.2 Virtual Caches

Virtually indexed, virtually tagged (VIVT) caches (also known as virtual caches) do not require TLB accesses for cache hits, which can result in faster lookups and less power consumption than the physical caches. Fortunately, the address translation is performed anyway on every cache miss, since coherence is kept for physical addresses. Also, L1 hits do not issue coherence actions. These two characteristics allow our proposal to be completely applicable to virtual caches.


4.6.3 Large Page Sizes

Our proposal can also work in systems implementing large pages (with 2MB being the most common alternative on x86 architectures). However, the eviction of the entries for large pages from the TLB will require a more expensive cache flushing. In order to overcome the latency overhead, diverse approaches could be employed. For example, a simple counter can be added to the TLB entry indicating the number of live (cached) blocks for the evicted TLB entry. Alternatively, a presence vector tracking large memory regions could be used [88], which would set the region limits to be flushed. Other proposals may also help reduce the latency overhead [76, 77]. These approaches aim at reducing the number of required lookups when flushing.

Alternatively, based on the observation that the majority of pages accessed per core on systems with superpage support are the smallest pages [59], we could avoid classifying superpages and consider them as coherent without significantly damaging system performance, while completely avoiding the large-page overhead. Current systems supporting multiple page sizes implement multiple TLB structures to do so; thus, the TLB actually storing large pages could classify them as shared without extra hardware support.

4.6.4 Design of the Classification Directory

In this chapter we introduce a mechanism that leverages an inclusive, shared, NUCA-like TLB level in order to track classification information, acting similarly to an in-cache directory [64, 29]. For instance, some recent Intel micro-architectures use this cache directory scheme, with an inclusive LLC, where its tags contain core valid bits (CVB) to indicate which core private caches contain copies of the block [56].

While storing classification information is arguably far more area-scalable than any cache directory (where each directory entry has to track which cores are storing each cache line), leveraging a SLL TLB still enforces an inclusive policy between the private TLBs and the shared TLB level. As the classification scheme also entails a TLB-cache inclusion, evicting a shared TLB entry might end up adding significant pressure to the cache hierarchy.

To avoid this, the classification information might be maintained in a dedicated directory structure, thus alleviating the pressure and requirements on the TLB hierarchy. However, a classification directory adds an indirection on TLB misses and evictions, both for private and shared TLB structures, which would increase the complexity of the classification scheme. Moreover, translation information can be replicated in the classification directory to completely avoid broadcast requests on TLB misses for private TLB hierarchies, ultimately resembling a shared last-level TLB in the hierarchy.

4.7 Conclusions

This chapter introduces snooping TLB-based classification, an effective temporal-aware mechanism for private-page detection aimed at medium-scale CMPs. This mechanism snoops the other cores in the system on a TLB miss, looking for stored page translation entries. This way, SnoopingTLB classifies pages accessed by several cores at different points in time (e.g., due to thread migration or program phase changes) as private. This leads to a bidirectional page reclassification, from private to shared and vice versa, that results in a significantly more accurate classification scheme when compared to an alternative runtime non-adaptive OS-based

classification approach. Furthermore, TLB-based classification allows low-latency TLB miss resolution through TLB-to-TLB transfers.

Finally, we extend the TLB-based classification approach by leveraging a distributed shared last-level TLB, namely DirectoryTLB. DirectoryTLB naturally discovers the sharing access pattern through the L1 TLB misses received from different cores at the home L2 TLB tile. This approach leads to a classification accuracy similar to that of its snooping counterpart, but completely avoids harmful broadcasts on TLB misses. As a result, it represents a far more scalable approach.

Chapter 5

Token-counting TLB-based Classification Mechanism

This chapter introduces TokenTLB, a TLB-based mechanism that classifies pages based on token counting. The chapter thoroughly analyzes TokenTLB and compares it against other runtime classification techniques, both TLB- and OS-based.

5.1 Introduction

In the previous chapter we introduced a snooping TLB-based classification approach that performs an adaptive classification accounting for temporarily private pages and thread migration. Snooping TLB-based classification relies on TLB-to-TLB transfers to inquire the other cores’ TLBs in the system and naturally discover whether blocks belonging to a page may be currently stored in a remote cache, and therefore the page is shared, or, on the contrary, the page is currently private. However, TLB-to-TLB transfers generate a large number of responses, possibly including the translation, from every core in the system after every TLB miss. Furthermore, frequent broadcast requests and responses do not scale to large systems, as the number of messages increases proportionally with the system size.

Additionally, a distributed shared TLB structure has been explored in order to alleviate network pressure. However, a shared TLB structure is not a common architectural design in commodity processors. In addition, for a page reclassification to private to occur, the translation must be completely removed from all TLBs in the system, both for private and shared TLB structure classification schemes, thus limiting the accuracy of the classification mechanism. Ideally, reclassification should be performed when only one TLB is currently holding the translation. Finally, write detection (i.e., discerning between read-only and read-write pages) has not been explored, since it requires an expensive synchronization operation among the TLBs in the system and would entail increasing network consumption, especially in a snooping TLB-based classification scheme. Therefore, so far the TLB-based classification scheme is limited to the private-shared dichotomy.


In what follows, we introduce TokenTLB, a token-counting TLB-based classification mecha- nism. The main contributions of TokenTLB are:

• TokenTLB reduces network consumption compared to snooping TLB-based classification, while allowing shareability among TLBs. Only a subset of TLBs provide tokens along with the page translation, which leads to about one response per TLB miss on average.

• TokenTLB extends classification characterization by detecting read-only pages, while performing an adaptive classification based on count and exchange of tokens.

• Token-based classification makes it possible to naturally and immediately identify a shared TLB page entry transitioning to private.

• TokenTLB introduces a predictor capable of resolving TLB misses through unicast messages, thus increasing scalability. The predictor relies on a small buffer called Token Predictor Buffer (TPB), which is in charge of short-time storage of potential token-holding TLBs.

5.2 TokenTLB

This section describes TokenTLB, a token-counting classification scheme that tries to achieve all the desirable properties for a classification mechanism as seen in Chapter 2: performing the classification prior to accessing the cache hierarchy; implementing a fully-adaptive classification able to carry out an accurate reclassification; improving classification characterization by discerning write accesses and recognizing read-only pages; and performing an accurate run-time classification valid for any code.

(a) Adaptive TLB-based classification.

(b) Fully-adaptive token-counting classification.

Figure 5.1: Adaptivity versus Full-adaptivity in TLB-based classification.

Thus, TokenTLB introduces the concept of full-adaptivity, as seen in Figure 5.1. Snooping TLB-based classification performs an adaptive classification, as it allows a shared page to transition to private again during temporarily private application phases (Figure 5.1a).

However, for a reclassification to occur, a page must lose its sharing information (i.e., the page translation must be evicted from all system TLBs, point 1 in Figure 5.1a), and then it will be classified as private if brought back again. Full-adaptivity (Figure 5.1b) allows TokenTLB to dynamically transition a page classification to private while its translation remains stored in a TLB (e.g., points 2 and 3 in Figure 5.1b), since tokens may be recovered after other page translation entries are evicted from their TLBs.

5.2.1 Introduction to Token-counting in TLBs

TokenTLB associates a fixed number of tokens with each translation entry. In a system with N cores, there must be N tokens per entry. New tokens cannot be generated, and tokens cannot be destroyed. Tokens are exchanged through TLB-to-TLB messages alongside the page address translation. A TLB page entry is classified according to its token count: private if it holds all tokens (N), shared while holding a subset of all page’s tokens (from 1 to N − 1), and invalid when holding no tokens. Finally, only TLB entries holding at least two tokens may reply to a translation request from the network.

Additionally, in order to track whether the page is read-only or not, there is a written flag (W) associated with each translation entry, which is sent alongside the tokens on TLB transactions. The written flag increases the classification scope by adding extra classification categories:

• Private Read-only (PR) page: Only one processor is currently accessing the page blocks, and all accesses have been loads during the page lifetime. That is, N tokens and ¬W.

• Private read-Write (PW) page: Only one processor is currently accessing the page blocks, and the page has been written at least once during the current page lifetime. That is, N tokens and W.

• Shared Read-only (SR) page: At least two processors are currently accessing the page blocks, and all accesses have been loads during the page lifetime. That is, 0 < tokens < N and ¬W.

• Shared read-Write (SW) page: At least two processors are currently accessing the page blocks, and the page has been written at least once during the current page lifetime. That is, 0 < tokens < N and W.
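Building on the previous sketch, the mapping from token count and written flag to these four categories can be expressed as follows (illustrative only; the abbreviated state names follow the list above).

```python
def classify_page(tokens: int, written: bool, n_cores: int) -> str:
    """Map (token count, W flag) to PR/PW/SR/SW, as described in the text."""
    if tokens == 0:
        return "invalid"
    if tokens == n_cores:
        return "PW" if written else "PR"   # private read-write / private read-only
    return "SW" if written else "SR"       # shared read-write / shared read-only
```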

5.2.2 Token Request upon TLB Miss

TokenTLB initiates the page table walk after a TLB miss in parallel with a broadcast request that snoops the other cores' TLBs, as seen in Figure 5.2. Initially, the page table holds all N tokens for each page translation. Consequently, after the first TLB miss for a memory page, the page table delivers all the tokens to the requestor TLB (Figure 5.2a). From then on, tokens are held by TLBs and sent in messages in response to TLB miss requests, spreading across the cores' TLBs. When a TLB receives a translation request, it checks whether it owns the translation entry (i.e., holds the page translation with two or more tokens) and, if so, it answers the request with a short response message, keeping one token and sending the rest (Figure 5.2b). When the first translation response with tokens is received by the requesting TLB, the page table walk is canceled, tokens are annotated in the corresponding page TLB entry, and the memory access proceeds. By doing this, response traffic is constrained, as only one TLB (usually the most recent to acquire the translation) is allowed to answer in the common case. Furthermore, page access is unlocked sooner than in snooping TLB-based approaches, where a requestor TLB has to wait for all other cores' TLBs to answer before accessing the page.

Figure 5.2: Token Request example in a 4-core CMP. (a) Token request retrieving tokens from the page table. (b) Token request exchanging tokens with the token holder.

Tokens are stored in an additional TLB field, namely Tokens, as seen in Figure 5.3. This field adds log2(N) bits to each TLB entry (e.g., only 4 extra bits for a 16-core CMP). When all tokens are given away, we rely on the valid/invalid bit (V) of the TLB entry to track it. The page table does not require dedicated hardware, but just one extra bit per entry indicating whether or not it holds all the tokens, namely T, which can be one of the reserved bits in the page table entry. Compared to an OS-based approach with write detection [23], which requires 3 + log2(N) bits, our solution represents a far smaller overhead and a more scalable approach.
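For concreteness, the per-entry storage overhead discussed above can be checked with a back-of-the-envelope computation (not code from the thesis):

```python
import math

def tokentlb_tlb_bits(n_cores: int) -> int:
    # Tokens field: log2(N) bits per TLB entry (the V bit already exists).
    return math.ceil(math.log2(n_cores))

def os_based_bits(n_cores: int) -> int:
    # OS-based classification with write detection: 3 + log2(N) bits per entry [23].
    return 3 + math.ceil(math.log2(n_cores))

if __name__ == "__main__":
    print(tokentlb_tlb_bits(16))   # 4 bits per TLB entry for a 16-core CMP
    print(os_based_bits(16))       # 7 bits for the OS-based alternative
```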

TLB misses allocate an entry in the Miss Status Holding Register (MSHR), which is deallocated only after acquiring both the page translation and at least one token for that page. Therefore, if the page walk ends without delivering tokens alongside the page table translation (because tokens are held by other TLBs), the requestor has to wait for the first token response. Page access cannot be unlocked without acquiring sharing information (i.e., tokens). Note that, in some cases, more than one TLB may respond to the TLB miss request, since token exchange distributes tokens among the page translations stored in the system TLBs (e.g., when sending tokens to the network after a TLB eviction). Consequently, a page access can be temporarily classified as shared although it may be effectively private, as some tokens may still be in-flight. However, once a TLB obtains all N tokens, the page naturally becomes private.

Figure 5.3: TLB and page table entry format. Shaded fields represent the additional fields required by TokenTLB.

Figure 5.4: Path that token messages follow after TLB evictions for a 16-core (4x4) mesh interconnect.

Additionally, when a page is written for the first time, i.e., the W bit is set, this bit needs to be set in all copies of the translation stored in other TLBs. The bit remains set until the global page generation time for that page ends (the time elapsed from when the page is first cached in a TLB to the moment it is evicted from the last TLB in the system). When the write happens on a private page, no actions are required. Otherwise, a message is broadcast to update the W bit in all TLBs holding tokens for that page, which produces a transition to shared-written (SW). Written information is sent alongside the tokens as part of the page sharing information on TLB miss responses.
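A minimal sketch of this write-detection action, assuming a hypothetical broadcast_w_update primitive that stands in for the network operation:

```python
def on_first_write(entry, is_private: bool, broadcast_w_update):
    """First write to a page: set the W flag locally; if the page is not private,
    the W bit must also be set in every other TLB holding tokens for the page.
    `broadcast_w_update` is only a placeholder for the broadcast message."""
    entry.written = True
    if not is_private:
        broadcast_w_update()   # all holders transition the page to shared-written (SW)
```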

5.2.3 Token Release for Correctness

Tokens are neither created nor destroyed, but transferred. This means that the system must always guarantee the existence of N tokens for any given page translation. The page table either holds all N tokens or none. Otherwise, accessing the page table in search of tokens could become a costly and frequent operation, which should be avoided.

When a TLB evicts a page entry that is holding a subset of the tokens, a message is sent looking for a new holder (i.e., another TLB) for those tokens. If, on the contrary, the TLB entry is holding all N tokens, they are sent back to the page table. Consequently, TLB evictions are required to be non-silent, adding some extra traffic. However, non-silent evictions do not hurt system performance, as they are off the critical path of memory accesses. In exchange, they grant a more dynamic classification and are fundamental for full-adaptivity in TokenTLB.

As previously noted, a subset of tokens on an evicting TLB entry must be transferred to a new holder. To do so, a message (token evict) that only carries tokens, but not the translation, is sent to another TLB. Ideally, a TLB receiving a token evict request should only accept the tokens if it is already holding a valid TLB entry for that page or if it has an MSHR entry allocated as a consequence of a TLB miss for that page (Section 5.2.2). Otherwise, the message is forwarded to the next designated TLB. When the token evict request finds a new holder, the tokens are annotated in the new TLB, which responds with an acknowledgment to the original sender.

In order to avoid potential livelocks and to minimize traffic, the search for the new holder requires an orderly exploration of the network. To this end, we add the concept of a logical network ring, which is a path that covers all nodes in the system. The path followed depends on the network topology. Figure 5.4 illustrates an example path for a 4x4 mesh interconnect. This path represents one of the minimum circular routes traversing all nodes in the interconnect.

Figure 5.5: Logical TPB placement with a private two-level TLB organization.

However, livelocks can still occur when all the TLBs holding a page (and all of its tokens) try to evict the associated entry simultaneously. To solve this problem, the tokens carried by a token evict message can be stored in the corresponding MSHR of a core whose TLB is itself involved in an eviction process for that page, provided that its ID is greater than the requestor core's ID. Token reception is acknowledged afterwards.

Note that the key condition for starvation avoidance in a scenario where multiple evicting TLBs for the same page endlessly pass on their tokens is the core ID comparison. As a result, if all TLBs simultaneously evict the same page, all N tokens will eventually end up in the TLB of the core with the highest ID, which will send all the tokens back to the page table.

In short, a TLB can now evict tokens while temporarily storing other tokens for the same page. Thus, when that TLB receives an acknowledgment indicating that the evicted tokens were acquired by another TLB, it has to check its own MSHR again. If it is holding a subset of tokens for that page, it sends a new request message looking for another holder. If the MSHR is not holding any tokens, the eviction process simply ends.
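The token evict handling, including the ring traversal and the core-ID tie-break used for livelock avoidance, could be sketched as follows; the predicates passed as arguments are placeholders for the corresponding TLB and MSHR lookups.

```python
def next_ring_node(node: int, ring: list) -> int:
    """Next hop on the logical ring covering all nodes (ring order is topology-dependent)."""
    return ring[(ring.index(node) + 1) % len(ring)]

def handle_token_evict(my_id: int, req_id: int, has_valid_entry: bool,
                       has_mshr: bool, am_evicting_same_page: bool) -> str:
    """Decide whether this TLB accepts the evicted tokens. Acceptance follows the
    text: a valid entry or a pending miss for the page, or (to break eviction
    livelocks) this core is also evicting the page and has a higher ID."""
    if has_valid_entry or has_mshr:
        return "accept"                 # annotate the tokens and ACK the sender
    if am_evicting_same_page and my_id > req_id:
        return "accept_into_mshr"       # temporary storage; ACK the sender
    return "forward"                    # pass the token evict to the next ring node
```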

Finally, observe that a TLB-to-TLB request may fail to acquire tokens after a miss if all the TLBs owning the translation entry have evicted their tokens shortly before receiving the request, so that the tokens are currently in-flight (entries with a single token are not allowed to answer). This is prevented by setting a timeout after a TLB miss, which resends the broadcast request if the page has failed to acquire any token by the time the timeout expires. This is not a frequent event and it hardly increases network traffic. Also, as token evict messages only carry tokens and not the translation, the TLB miss is not resolved upon receiving a token evict. The missing TLB annotates the tokens in the corresponding MSHR entry and acknowledges their reception, but must keep waiting for either a TLB response with the translation or the conclusion of the page table walk.

5.2.4 Token Predictor Buffer

In the search for a more scalable classification approach, TokenTLB reduces TLB response traffic and translation replication. However, a broadcast request is still sent after every translation miss. TLBs have no prior information about possible token owners (TLBs holding two or more tokens for a page translation) at the time of the miss, so TLB misses still need to be resolved by flooding the network. Nevertheless, some TLB misses occur shortly after the entry's invalidation, and thus a potential token holder could be anticipated. Consequently, TLB traffic would be further reduced, contributing to a more scalable classification approach. To this end, we introduce a predictor in charge of identifying other TLBs as potential token owners. Hitting in the predictor after a TLB miss issues a unicast request, thereby reducing TLB request traffic. This prediction based on recent history is similar to the one proposed by Martin et al. [52]. Other works also benefit from this observation with different aims [7, 65, 66].

The predictor consists of a small data buffer placed alongside the L2 TLB (Figure 5.5), called Token Predictor Buffer (TPB), which stores the process ID, the virtual address, and a core ID. After tokens are given away through a non-silent eviction, the receiver becomes a known potential token owner and is therefore recorded in the TPB. When an L1 TLB miss occurs, both the L2 TLB and the TPB are checked in parallel. If the L2 TLB misses and the TPB holds information about a potential token holder for that page, a unicast request is sent and the TPB entry is deallocated. If the TLB receiving the request is still holding tokens, it responds positively with the translation and classification information. Otherwise, if the consulted TLB is no longer a token holder, it answers the TLB request negatively and the conventional broadcast TLB miss resolution mechanism is invoked in parallel with the page walk.
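A possible sketch of the miss-resolution flow with the TPB is shown below; the l2_tlb, tpb, and send_* handles are hypothetical stand-ins for the hardware structures and network actions.

```python
def resolve_l1_tlb_miss(pid, vaddr, l2_tlb, tpb, send_unicast, send_broadcast) -> str:
    """Miss handling with the TPB: the L2 TLB and the TPB are probed in parallel.
    All callables and lookup methods here are placeholders for hardware actions."""
    if l2_tlb.lookup(pid, vaddr) is not None:
        return "l2_hit"
    predicted_core = tpb.lookup_and_invalidate(pid, vaddr)   # TPB entry consumed on use
    if predicted_core is not None:
        send_unicast(predicted_core)   # a negative answer later falls back to broadcast
        return "predicted"
    send_broadcast()                   # conventional snoop, in parallel with the walk
    return "broadcast"
```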

5.3 Read-only Data Optimizations and Full-adaptivity

When applying TokenTLB to a classification-based data optimization that supports read-only data, only blocks belonging to shared written (SW) pages might require special actions.

Any data optimization applied on top of a fully-adaptive classification scheme must pay special attention to the implications and penalty of page reclassification. Note that even from a coherent state (SW), the classification could transition again to private (PW) during the same local page generation time, provided that a TLB recovers all the tokens.

Specifically, when a reclassification from private or shared-read-only to shared-written occurs in a core's TLB, all the blocks of that page must be evicted from all cores' private caches (see Section 4.3.4 for more details on the specific actions), which we refer to as flushing-based recovery. To this end, whenever a transition to shared-written occurs, a message is broadcast and a page flush is performed in the private caches of all cores whose TLBs hold a valid entry for the page. This process must be atomic to avoid inconsistencies (accesses to that page are stalled until the recovery process has finished in all TLBs).
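The condition that triggers flushing-based recovery can be summarized in a small helper (a sketch under the four-state classification of Section 5.2.1):

```python
def needs_flush_recovery(old_state: str, new_state: str) -> bool:
    """Flushing-based recovery is only needed when a page becomes shared-written:
    blocks of the page must then be evicted from every private cache (Section 4.3.4)."""
    return new_state == "SW" and old_state in ("PR", "PW", "SR")
```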

As with snooping TLB-based classification, page flushing also occurs whenever a TLB entry is evicted from the private last-level TLB. The presence or absence of a TLB entry must imply the presence or absence of cache blocks of that page in the upper-level cache structure in order to accurately classify data and avoid false privates based on the presence of a translation entry in the TLB structure. This hardly affects performance since, for a TLB entry to be evicted, it must not have been accessed for a while, and it is likely that its blocks are neither present in the cache structure nor going to be accessed in the near future (again, provided that TLB entries are not prematurely evicted; see Chapters 6 and 7).

Finally, when a classification is demoted (i.e., from SW to PW), no actions are required. However, different data optimizations may require special actions. For instance, when employing Coherence Deactivation (see Section 2.2.3), blocks of a page may have been accessed as coherent, allocating the corresponding block entries in the directory cache. When evicting such a directory entry due to a conflict, if the eviction produces an invalidation of a currently private cache entry, an unnecessary cache miss may occur afterwards. Notice that the status of a non-coherent block would naturally be restored by the recovery mechanism when transitioning to coherent again (after being accessed from another TLB). To prevent this from happening, if a block is found to be non-coherent when evicting a directory entry (which sends an invalidation request to the cache it points to), an acknowledgment is sent to the directory but the block is allowed to remain in the cache as non-coherent.

Figure 5.6: Private, Shared, and Written page proportion.

5.4 Experimental Results

In order to check how TokenTLB classification behaves compared to previous classification approaches, we perform an in-depth study that includes an analysis of the Token Predictor Buffer. The baseline system has a private two-level TLB structure with the configuration described in Chapter 3. Coherence deactivation is again the case study used to test the benefits of TokenTLB against previous classification mechanisms. Finally, a scalability study shows how the different classification schemes behave when increasing the core count.

5.4.1 Full-Adaptive Page Classification

This section demonstrates the classification accuracy and efficiency of TokenTLB compared to previous proposals.

Private and read-only data. The percentage of private and shared (read-only/written) pages is a good general metric for measuring the goodness of a non-adaptive classification approach. Figure 5.6 shows how pages are classified as Private, Shared-ReadOnly or Shared-Written by the different classification mechanisms. OS-RO is a non-adaptive OS-based classification mechanism with read-only detection. SnoopingTLB is the adaptive broadcast TLB-based classification approach introduced in Chapter 4, and TokenTLB is the fully-adaptive token-based TLB classification approach, both with private two-level TLB structures. As SnoopingTLB is not able to distinguish shared-read-only from shared-written pages, all its shared pages fall under the same classification category; for the sake of clarity, they appear as Shared-Written in the graph. We observe that, on average, the sum of private and shared-read-only pages for OS-RO does not suffice to outmatch the private pages for SnoopingTLB, which represent 63.4% of all accessed pages, proving the relevance of an adaptive approach. However, in some cases, such as Raytrace-opt or SpeechRec, a lot of potential classification precision is lost when write detection is not performed, as OS-RO surpasses SnoopingTLB.

Figure 5.7: Data L1 miss proportion classified as Private, Shared-Read-Only and Shared-Written.

However, this metric is unfair to adaptive classification mechanisms, where shared pages are frequently reclassified as private, since reclassification is not reflected in the figure. This situation arises to a greater extent with TokenTLB, since it unlocks page access after the first TLB response, which accelerates TLB miss resolution but, with ongoing evictions, may result in a shared access while tokens are still in-flight. Therefore, a page may appear as shared in the figure although it is naturally reclassified as private shortly after its first access. Nevertheless, counting both private and shared-read-only pages, TokenTLB improves page classification to 74.9%.

The aim of all classification mechanisms is to precisely classify memory accesses. While an adaptive mechanism may be able to freely reclassify pages, blocks are never individually reclassified during a local block generation time. Therefore, the longer a page classification is kept as private or read-only, the more L1 cache misses will end up being treated as such. Figure 5.7 shows the classification of L1 data cache misses, which determines how accesses to data blocks will be treated. Even though the classification of private pages in Figure 5.6 was close between SnoopingTLB and TokenTLB, the fraction of L1 data misses considered private is greatly increased using TokenTLB, since, unlike SnoopingTLB, page reclassification occurs in a natural way during a page generation time. Specifically, TokenTLB is able to classify 61.1% of L1 data cache misses as Private on average, 40.8% more than SnoopingTLB. Also, note that, contrary to SnoopingTLB, TokenTLB and OS-RO are capable of recognizing read-only pages, which represent 24.4% of L1 cache misses for TokenTLB, thus greatly enhancing the classification accuracy. Read-write cache misses represent less than 20% of all misses both for OS-RO and TokenTLB, with less than a 3.4% difference between them. When accounting for read-only accesses, the classification is similar and it might seem that both perform alike. However, TokenTLB amply surpasses OS-RO in the amount of cache misses classified as private. Some data optimizations are not suited for treating read-only data, so detecting private data to a greater extent is a desirable characteristic. Furthermore, while under the other approaches pages, and thus cache blocks, remain in the same classification status once they are classified as Shared-Written, TokenTLB is fully-adaptive and can reclassify a coherent block back to non-coherent during the page lifetime.

Token TLB-to-TLB exchange. One key benefit of TokenTLB classification over previous proposals is how it handles TLB-to-TLB transfers, obtaining their benefits (page classification, support for usage prediction, and translation acceleration) while limiting the required responses.


Figure 5.8: Proportion of TLB Responses issued after an L2 TLB miss.


Figure 5.9: Success rate for TPB predictions.

As a result, TLB traffic is reduced, and blocking the system while collecting answers is avoided. Figure 5.8 shows how many responses are sent on average after a TLB miss using TokenTLB. Note that broadcast TLB transfers in SnoopingTLB invariably require N − 1 responses after every TLB miss (N being the number of cores in the CMP). On the contrary, TokenTLB requires just 0.93 responses per L2 TLB miss on average. The average falls below one because, with TokenTLB, no TLB responses are sent or expected when the tokens are held in the page table. In some cases, such as Apache or SpeechRec, it goes beyond one response per L2 TLB miss. Note that TokenTLB allows more than one translation owner in the TLB structure, as evicted tokens may be accepted by the first valid TLB in the eviction ring; in those cases, two or more responses may be issued after the next TLB request.

Token prediction effectiveness. Here we briefly analyze the Token Predictor Buffer (TPB), which is conceived to avoid resorting to broadcast on TLB misses when there is a known potential token owner. Therefore, it has an impact on traffic and network consumption, but not on execution time. The TPB itself is just a small 4-way associative short-term buffer. Up to four different sizes have been evaluated in the study (32, 64, 128, and 256 entries).

Figure 5.9 shows the proportion of successful and failed predictions (i.e., whether or not the remote TLB requested through a unicast prediction was still holding tokens) with respect to total L2 TLB misses. It demonstrates that increasing the TPB size positively affects the accuracy of its predictions. Specifically, a 256-entry TPB avoids the broadcast (prediction hit) in 24.7% of L2 TLB misses out of a total of nearly 33% of prediction attempts on average. In some cases, such as Barnes or Unstructured, around 45% of broadcast TLB misses are prevented by using the TPB.

Figure 5.10: Relative TLB network traffic issued.

Figure 5.10 details the TLB traffic relative to the base SnoopingTLB. It can be observed that TokenTLB slightly increases TLB request traffic compared to SnoopingTLB due to the non-silent evictions performed. However, this is greatly offset by the reduction in TLB response and translation traffic, as can be deduced from Figure 5.8, reducing overall TLB traffic by 44%. Additionally, TPB usage further decreases TLB traffic. Specifically, the largest TPB considered (256 entries) reduces TLB request traffic to the greatest extent, as it entails making more predictions, cutting TLB request traffic alone by nearly 20% with respect to TokenTLB, and total TLB traffic by up to 56.3%.

In conclusion, TokenTLB detects the written condition at page granularity while achieving a private page detection similar to that of previous adaptive mechanisms, seemingly losing some accuracy. However, the strength of TokenTLB is the benefit obtained from immediate shared-to-private reclassification (full adaptivity), dramatically increasing the number of cache misses classified as private over any previous classification approach. Finally, TLB response and translation messages are greatly bounded using our approach, and the inclusion of the TPB reduces total TLB traffic by more than half, thus increasing the scalability of the system.

5.4.2 Case Study: Coherence Deactivation

The coherence deactivation mechanism greatly benefits from the accuracy of a classification approach. It can easily exploit both a private/shared classification dichotomy and a private/read-only/shared-written classification scheme. This section shows the benefits and overheads of applying TokenTLB to coherence deactivation compared to previous classification approaches. The TPB is not included in the results shown in this section.

Coherence maintenance is deactivated when a block access is considered non-coherent by the classification mechanism. For SnoopingTLB, which only characterizes memory accesses into a private/shared scheme, all shared pages are coherent. In contrast, both OS-RO and TokenTLB also distinguish written pages, so only shared-written pages are considered coherent. Moreover, SnoopingTLB and TokenTLB are adaptive mechanisms, transitioning back and forth from private to shared. Figure 5.11 shows the average number of directory entries required per cycle for the different approaches studied, normalized to Base, which is a system with the same configuration but without coherence deactivation. Constraining the directory usage is the ultimate goal of coherence deactivation and is strongly dependent on the accuracy of the classification mechanism. We observe that OS-RO improves directory usage by 52.8%, even reducing it below SnoopingTLB thanks to its read-only detection capability. Finally, TokenTLB greatly improves directory usage, requiring only 34.1% of the directory entries per cycle compared to Base.

Figure 5.11: Average directory entries allocated per cycle.

Figure 5.12: Data L1 misses classified by their cause.

Classifying more pages as non-coherent and improving directory usage also reduces L1 cache misses, especially Coverage misses (see Section 3.4). Figure 5.12 classifies data misses based on their cause, and thus shows how all proposals lessen Coverage misses in the L1 thanks to coherence deactivation. However, unlike previous approaches, TokenTLB eliminates nearly all Coverage misses. In contrast, Coverage misses with OS-RO still represent, on average, 13.4% of L1 cache misses. Finally, SnoopingTLB lies somewhere in between, as it still suffers 8.7% of Coverage misses.

Reducing L1 data cache misses leads to a reduction in network usage. Figure 5.13 shows the total flits injected into the network, classified into cache and TLB traffic. It can be observed that TokenTLB avoids nearly half of the total network usage on average, due to both the fully-adaptive classification and the write detection performed, whereas SnoopingTLB only reduces it by 31%. Comparatively, as OS-RO does not require additional TLB traffic and has write detection capability, it reduces overall traffic more than SnoopingTLB even though it performs a non-adaptive classification.


Figure 5.13: Network flits injected, classified into cache- or TLB-traffic.


Figure 5.14: Execution time normalized to baseline.

Reducing cache misses and accelerating page translation through TLB-to-TLB transfers has a direct positive impact on execution time, which improves by 20% using TokenTLB compared to Base, as shown in Figure 5.14. Execution time is also reduced by 3% compared to SnoopingTLB, as TokenTLB unblocks page access earlier after TLB misses and reduces L1 cache misses to a greater extent. Since OS-RO does not benefit from fast TLB miss resolution through TLB-to-TLB transfers, its gains come solely from coherence deactivation, reducing execution time by just 8.8% over Base.

Our proposal also entails a reduction in the dynamic energy consumption of the cache hierarchy, as shown in Figure 5.15. TokenTLB reduces overall consumption by 21.9% compared to the baseline, particularly L1 cache energy consumption, since cache pressure is reduced through coherence deactivation. Network consumption also decreases significantly, although its impact is diluted relative to the total consumption. Comparatively, TokenTLB reduces dynamic consumption by 5.7% with respect to SnoopingTLB, and by 9% with respect to OS-RO.

All in all, applying TokenTLB to coherence deactivation improves system scalability. It greatly reduces directory entries thanks to both its fully-adaptive classification and its write detection capability, which includes accesses to read-only pages among non-coherent data. As a consequence, TokenTLB allows smaller and more efficient directory structures, while it effectively reduces L1 cache misses, network traffic, and dynamic consumption, and improves system execution time over any previous classification mechanism.


Figure 5.15: Dynamic energy consumption normalized to the base system.

(a) Normalized runtime. (b) Total flits injected.

Figure 5.16: Scalability analysis of classification approaches when applied to coherence deactivation.

5.4.3 System Scalability

This section shows how the different classification approaches scale when applied to deactivate coherence. Due to the slowness of the simulation tools, this study is only performed using SPLASH-2 benchmarks and scientific applications.

Figure 5.16a shows how TokenTLB-TPB scales better than SnoopingTLB and OS-RO. All results are normalized to the baseline, which does not deactivate coherence maintenance. Specifically, TokenTLB-TPB reduces the execution time by 30.5% on a 32-core CMP, while SnoopingTLB reduces it by 25.15% over the baseline. This difference evidences how performing a more accurate classification entails better directory usage and, ultimately, better system performance.

Note that one of the aims of TokenTLB is to achieve a more scalable system in terms of network usage compared to previous classification approaches. In this sense, Figure 5.16b shows how TokenTLB-TPB reduces traffic over the baseline to a greater extent, whereas SnoopingTLB jeopardizes system scalability with its TLB snoop overhead. Specifically, TokenTLB-TPB maintains a traffic reduction of up to 38.6% for a 32-core system. Conversely, OS-RO is only able to reduce the traffic by 32.5%, and SnoopingTLB merely by 28.7%, both for a 32-core system.


5.5 Conclusions

This chapter introduces TokenTLB, a novel TLB classification mechanism based on counting and exchanging tokens through TLB-to-TLB requests, where only TLBs owning the translation are allowed to answer. When tokens are used for classification instead of for coherence, the need for persistent requests is avoided, since an owner token is not required. Moreover, token counting is highly efficient for performing an adaptive classification into a private-shared scheme, naturally detecting the private phases of pages during their page generation time. Furthermore, the classification dichotomy is extended by allowing write detection for the first time in an adaptive classification mechanism. By predicting token holders and sending unicast messages, the scalability of the system is improved. The proposed TokenTLB presents all the desirable characteristics of a classification scheme: the classification is known before the cache access (a priori), it detects read-only accesses, it is adaptive, and it is accurate.


Chapter 6

Prediction-based Classification Mechanisms

This chapter introduces a page usage prediction mechanism for TLBs, namely the Usage Predictor (UP). A TLB usage predictor anticipates whether a page is going to keep being accessed from a given core. This allows invalidating translation entries when (i) the predictor indicates that the corresponding page is not going to be accessed in the near future, and (ii) another TLB requests the page translation. This way, the opportunity for private data detection with a TLB-based mechanism is significantly improved.

6.1 Introduction

(a) TLB miss rate evolution depending on TLB size. (b) TLB classification accuracy depending on TLB size.

Figure 6.1: TLB size analysis overview.

So far, TLB-based approaches are seemingly the best data classification systems in terms of cost and area, enhancing system performance through TLB-to-TLB transfers. Nonetheless, since the sharing condition of a page in a TLB-based classification mechanism is determined by the existence of the page translation in the system TLBs, its precision is strongly dependent on the TLB size. The disadvantage of this dependence is shown in Figure 6.1. For smaller TLBs, the TLB miss ratio is considerably greater (Figure 6.1a). As expected, increasing the number of TLB entries from 256 to 4096 progressively improves the TLB hit ratio. However, the percentage of private pages detected is reduced as the TLB size increases (Figure 6.1b). Page translations remain longer in larger TLBs, even though the pages may not be accessed anymore. An unlimited-size, fully-associative TLB structure would never miss twice on the same page translation lookup, but the private data detected would decrease to figures similar to a non-adaptive approach (e.g., OS-based classification [22]). Ideally, the sharing condition of a page should be settled by the concurrence of accesses to that page. In this way, TLB-based classification would accomplish a precise classification of private pages, be independent of the TLB size, and not hurt the TLB hit ratio. Usage Prediction for TLBs is proposed seeking this ideal classification.

6.2 Usage Predictor (UP)

In order to determine page-access simultaneity, each processor has to predict whether it is going to access a page in the near future. This way, the use of data on pages within different phases of applications is accurately determined. TLB-based approaches make predictions based on the presence of the page translation in the system TLBs. However, in order to make predictions independent of the TLB size, this chapter introduces a TLB Usage Predictor (UP), similar to the one proposed by Kaxiras et al. to save power in data caches [40]. UP is intended to be used alongside a TLB-based classification mechanism for purely-private TLB structures (e.g., SnoopingTLB or TokenTLB).

Note that, in previous chapters, classification can be seen as a naive "prediction" based on the translations stored in the system TLBs. This way, if the page entry is not present in the TLB, the core assumes that it is not going to be accessed in the near future. This situation can happen either because the page has never been referenced by the core or because the entry has been evicted from the TLB after not being referenced for some time (TLBs employ a least recently used –LRU– policy). This leads to a strong TLB size dependence.

Usage prediction, in contrast, works as follows. If the page entry is present in the TLB, a 2-bit saturating counter is kept for it. The counter is increased periodically, according to a certain disuse period, and is reset whenever any block within the page is accessed by the core (i.e., on a TLB hit, since every access to a VIPT cache also involves a TLB lookup). If the counter for a given entry saturates, the entry falls into disuse. Cores consider disused TLB entries as not going to be accessed in the near future. Then, a disused TLB entry is invalidated when it is requested from the network due to a TLB miss from another core. Consequently, cores with invalidated TLB entries cease being accounted as potential sharers, increasing the opportunities for private data detection. Note that a bad predictor (i.e., one incurring many wrong predictions) can increase TLB and cache misses due to the TLB-cache inclusion policy. Finally, if the page entry is present and in use, the core predicts that the page will be accessed again.
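A behavioral model of the per-entry disuse counter could look as follows; the class is an illustration of the mechanism, not the simulator implementation.

```python
class UsageCounter:
    """2-bit saturating disuse counter per TLB entry (illustrative model).
    The disuse period that drives tick() is a design parameter (see Section 6.4)."""
    MAX = 3   # saturation value of a 2-bit counter

    def __init__(self):
        self.value = 0

    def on_access(self):
        self.value = 0            # any access to a block of the page resets the counter

    def tick(self):
        if self.value < self.MAX:
            self.value += 1       # periodic increase, once per disuse period

    def disused(self) -> bool:
        return self.value == self.MAX

def on_remote_request(counter: UsageCounter) -> str:
    """A disused entry is invalidated when another core's TLB miss snoops it,
    so the local core stops counting as a potential sharer of the page."""
    return "invalidate" if counter.disused() else "reply_as_sharer"
```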


Figure 6.2: Page idealization: generation and live times.

6.2.1 Exploring the Limits of a Temporal Classification

In order to determine the precision limits of a TLB-based classification, and to explore and smartly design new prediction-based classification schemes based on current page usage, we first analyze how page sharing patterns evolve over time. To do this, we define the concept of local page generation time in a core as the time spanned from when a page is first accessed (and missed) in a core's TLB until the page is finally evicted from that TLB. A page may have several local generation times on the same core or on different cores. Notice also that the local page generation times of different cores may or may not overlap in time, which decides whether the page is private at a certain point in time (i.e., they do not overlap) or shared. Thus, a page could be classified initially as private, transition to shared, then back to private, and so on, depending on the overlapping of its local generation times in different cores. We then define the global generation time as the time elapsed from when a page is first cached in a TLB to the moment it is evicted from the last TLB in the system. This means that the global generation time spans the overlapping local (or individual) generation times in the cores. For private page generations, local and global generation times are equivalent.

However, this definition used to classify pages is just an approximation to the ideal. The sharing condition of a page in the system should be settled by the simultaneity of accesses to that page. If a page is not going to be accessed again from a given core, as long as the translation entry remains in that TLB it still counts as present when determining the page's sharing status, resulting in an inaccurate classification. As we are interested in analyzing the potential of an ideal page-based classification mechanism, the page dead time (the time from the last access to a page until its eviction from a core's TLB) is excluded, therefore accounting only for the page live time (the time elapsed from when the page is fetched until its last access before being replaced in a core's TLB) [86], i.e., the black lines in Figure 6.2. As with the page generation, we also define both local and global live times. This will allow us to determine to what extent this classification imprecision could affect the effectiveness of the proposed page classification mechanism.
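Under these definitions, the idealized classification reduces to checking whether local live times ever overlap, as in the following sketch (the cycle values in the example are made up):

```python
def lives_overlap(lives):
    """`lives` is a list of (start, end) local live times, possibly from different cores.
    Under the idealized classification, a page is shared only if two live times overlap."""
    lives = sorted(lives)
    for (s1, e1), (s2, e2) in zip(lives, lives[1:]):
        if s2 <= e1:           # the next live starts before the previous one ends
            return True
    return False

# Example: non-overlapping lives keep the page private under the ideal scheme.
print(lives_overlap([(0, 100), (150, 300)]))   # False -> always private
print(lives_overlap([(0, 100), (80, 300)]))    # True  -> shared at some point
```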


Figure 6.3: Idealized page classification comparison.

Idealization of page classification

We compare the page classification provided by an OS-based mechanism against the classification obtained from the ideal TLB-based mechanism defined above¹. As the status of a page varies over time, to simplify the analysis we display a page as shared if at some point in time it has been classified as shared. In turn, a page displayed as private has always been classified as private over time. This means that its live times on different cores never overlap. In other words, a page could live in different cores and remain private, as long as there is strict exclusion among their live times. Notice that a particular case would be one in which the different lives of the page always elapse on the same core. On the contrary, the page will be classified as shared when its live times overlap at some point in time, even if its status is private for most of the execution time.

Figure 6.3 shows the aforementioned comparison for both classification mechanisms: the Base bar represents the static classification performed by the OS, and the Ideal bar represents the page classification obtained when idealizing the generation time of a page, thus excluding the dead time. As can be seen, the Ideal mechanism is able to classify as private, on average, 35% more pages than the Base classification, for a total of nearly 80% of pages.

In order to analyze the potential of an adaptive classification approach in which pages can be reclassified as private after they have been shared, we next quantify, for shared pages, the fraction of page lives in which the page remains private with respect to its total number of lives. Figure 6.4 shows this percentage of private lives of shared pages (SP) and the percentage of shared lives of shared pages (SS), evidencing that, on average, one out of two shared-page lives could be reclassified to private.

¹Results are obtained with the system configuration defined in Chapter 3, working on top of a SnoopingTLB classification scheme.


Figure 6.4: Page lives classification of shared pages.

6.2.2 Forced-Sharing Strategy

The TLB usage prediction technique is quite effective at carrying out a greedy classification of pages as private, as a result of its capability to precisely predict the live time of a page (a thorough study is presented in Section 6.2.1). However, due to the high variability in the number of cycles of a page live time or of the access interval itself, UP could, especially with low disuse periods, aggressively invalidate an entry that might be accessed again soon. Furthermore, the inclusion policy between the TLB and the L1 cache causes these premature invalidations to be especially harmful to overall system performance, as they evict L1 cache lines still being exploited, therefore dramatically increasing both TLB and L1 cache miss rates.

To mitigate this drawback of employing UP, it may be preferable in some cases to ignore the prediction, allowing the translation entry to remain valid and the page to become shared. To this end, some kind of evidence of the possible occurrence of premature invalidations is required, so that the prediction can be overridden. In this context, when a TLB miss occurs (the TLB entry is invalid or Not Present) but the requested TLB entry is still allocated in the local TLB, we might suspect that a premature invalidation occurred, and therefore it may be advisable to cancel the predictor-induced invalidation in other cores' TLBs for the corresponding TLB snoop.

To carry out the prediction override, we propose the use of a special request, which will be referred to as a forced request, as it indeed forces the classification as shared of a page that otherwise could have been classified as private. A forced request is sent on a TLB miss whenever the TLB entry, despite being in an invalid state (i.e., disused, or Decayed in the figure), is still allocated in the TLB. On the arrival of a forced request at the remaining cores' TLBs, the invalidation is ignored for disused TLB entries, in anticipation of their being accessed in the near future. Thus, the page is subsequently classified as shared. Figure 6.5 shows how a TLB behaves when receiving a remote forced request under the Forced Sharing mechanism. Basically, when an entry is in either the Decayed P or Decayed S state (i.e., the entry has already fallen into disuse) and it receives a remote forced request, it transitions to Present S, forcing the page to become shared and preventing the invalidation of other disused TLB entries, and consequently of all the L1 cache entries that pertain to the same page.


Figure 6.5: TLB state transition diagram with forced-sharing UP.
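The forced-sharing decision on both the requesting and the receiving side can be sketched as follows, using the state names of Figure 6.5; this is an illustrative outline rather than the full state machine.

```python
def request_type_on_miss(entry_present: bool, entry_valid: bool) -> str:
    """A miss on an entry that is still allocated but invalid suggests a premature,
    predictor-induced invalidation, so a forced request is issued instead."""
    if entry_present and not entry_valid:
        return "forced_request"
    return "normal_request"

def on_remote_forced_request(state: str) -> str:
    """Receiver side: decayed entries are not invalidated; they transition to shared."""
    if state in ("Decayed_P", "Decayed_S"):
        return "Present_S"     # keep the entry; the page becomes shared
    return state
```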

6.3 Shared Usage Predictor (SUP)

In the context of a directory TLB-based classification mechanism (i.e., one employing a shared last-level TLB), classification depends solely on the size and associativity of the private TLB structure (i.e., the L1 TLB in our simulated environment). Again, usage prediction might be valuable to overcome this dependence, especially for deeper TLB structures. Unlike usage prediction for purely-private TLB hierarchies, which relies on broadcast TLB-to-TLB requests to discover the usage status of a TLB entry, DirectoryTLB sends a single request to the home L2 TLB bank. As a consequence, the TLB usage predictor needs to be reworked and tuned for this particular environment.

We propose a shared TLB usage predictor (SUP), which naturally avoids some of the overheads of snooping-based usage predictors. As seen in Chapter 4, the sharing status of a page is managed by the classification directory in the shared L2 TLB through the P bit and the sharers count. Moreover, SUP employs the 2-bit saturating disuse counter only for L1 TLB entries (see Figure 6.6), and adds a new Disused (D) bit, which is set when the disuse counter saturates for the first time. Altogether, 4 additional fields are required per L1 TLB entry, for a total of 5 bits, which is insensitive to the system size. Thus, the silicon area overhead, according to CACTI, is only ∼4% of the L1 TLB area for the baseline configuration (Chapter 3).

Figure 6.6: L1 and L2 TLB entries, with the SUP fields in gray.

Figure 6.7: Block diagram of the general working scheme with SUP under a memory operation.

Every time the D bit is set, a disuse announcement is sent to the corresponding L2 TLB bank, which decreases its sharers count. We thus operate under the assumption that the page is not going to be reaccessed soon from a core whose entry has already fallen into disuse. Later, if an access occurs, the counter is reset, but the D bit remains set. No more messages (e.g., disuse announcements) are sent to the L2 TLB bank while it remains so, even if the disused TLB entry is evicted (i.e., disused translations evict silently); while the D bit is set, the sharers field has already been decreased appropriately, so the count remains up to date. Therefore, the sharers field tracks only the number of sharers that are currently using the page. This way, reclassification opportunities are unveiled early. However, reclassification requires probing the L1 TLBs, as the pages might have been reaccessed since they fell into disuse.
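A behavioral sketch of the SUP fields in an L1 TLB entry is given below; announce_disuse is a placeholder for the message sent to the home L2 TLB bank.

```python
class SUPL1Entry:
    """Sketch of the SUP-related L1 TLB state: 2-bit saturating counter plus the D bit."""
    MAX = 3

    def __init__(self):
        self.counter = 0
        self.disused_bit = False

    def tick(self, announce_disuse):
        """Periodic increase; only the first saturation triggers a single disuse
        announcement to the home L2 TLB bank (the `announce_disuse` callable)."""
        if self.counter < self.MAX:
            self.counter += 1
            if self.counter == self.MAX and not self.disused_bit:
                self.disused_bit = True
                announce_disuse()     # the home bank decreases its sharers count once

    def on_access(self):
        self.counter = 0              # the D bit remains set; no further messages

    def evicts_silently(self) -> bool:
        return self.disused_bit       # the sharers count was already decreased
```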

Reclassification process. In particular, every time an L1 TLB miss hits in the home L2 TLB bank and the sharers count is 0, a reclassification process is triggered (see Figure 6.7). Two different reclassification strategies are supported. On the one hand, if a reclassification opportunity arises while the P bit is unset (i.e., the page has been shared), the home L2 TLB bank starts a reclassification attempt by sending a broadcast request (excluding the requester that initiated the reclassification). L1 TLBs reply according to the information in their predictor counters: i) if the counter is saturated (i.e., the page is currently not in use), the L1 TLB invalidates the page entry (flushing the blocks in the L1 cache) and responds with a NACK; ii) a reaccessed L1 TLB unsets its D bit and responds with an ACK; iii) if the entry is not present, the L1 TLB still has to respond with a NACK. When all responses are collected, the sharers count is updated according to the number of positive acknowledgments received. Finally, the miss that originated the reclassification is resolved (which increases the sharers count once more), setting P according to the resulting sharers count (i.e., private if there is a single sharer and shared otherwise).
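The bookkeeping performed by the home L2 TLB bank at the end of a broadcast reclassification can be summarized as follows (a sketch with hypothetical ACK/NACK inputs):

```python
def broadcast_reclassification(probe_responses):
    """`probe_responses` lists the ACK (True) / NACK (False) answers collected from
    the probed L1 TLBs (the requester is excluded). Returns the new (sharers, P)
    pair once the miss that triggered the reclassification is also resolved."""
    sharers = sum(1 for ack in probe_responses if ack)   # reaccessed L1 TLBs
    sharers += 1                                         # the requester becomes a sharer
    private = (sharers == 1)                             # single sharer -> P bit set
    return sharers, private

# Example with hypothetical responses from three remote L1 TLBs:
print(broadcast_reclassification([False, False, False]))  # (1, True)  -> private
print(broadcast_reclassification([True, False, False]))   # (2, False) -> shared
```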


On the other hand, when a reclassification process starts for a private page (the sharers count is 0 and the P bit is set), only the keeper needs to be probed, with a unicast request. Therefore, a reclassification process might be used to keep the classification as private for longer. As in a broadcast reclassification, the keeper responds with a positive or negative acknowledgement depending on its current usage prediction. Finally, the sharers count in the L2 TLB is either kept at 0 (and the page is brought as private for the requester, which becomes the new keeper and the only sharer that is accounted for) or restored to 1 (and the page subsequently transitions to shared after resolving the reclassification).

In addition, the forced-sharing optimization can also be adapted to the context of a shared L2 TLB, although premature invalidations are probably not going to hurt system performance under this configuration. The reason is that disused entries are only invalidated when the L2 TLB considers that a reclassification probe should take place, which naturally acts as a filter for premature invalidations. Nonetheless, L1 TLBs send a forced-sharing request to the L2 TLB when the page is accessed and the corresponding entry is found stored in the TLB in an invalid state. If the miss triggers a reclassification, the probes issued to the L1 TLBs just unset their respective D bits rather than invalidating the translations (regardless of the status of the saturating counter). Thus, the page is kept as shared and can be securely reaccessed without incurring extra L1 TLB misses due to predictor-induced invalidations.

6.3.1 SUP Classification Transitions

Classification status is updated as a consequence of L1 TLB miss requests, evictions, or disuse announcements, and L2 TLB evictions. Figure 6.8 depicts the state-transition diagram.

Pages in the L2 TLB can be in four states: (i) not present; (ii) present but not in use in any L1 TLB (sharers = 0); (iii) present and private with only one L1 TLB holding the entry (P = 1 and sharers = 1); and (iv) present and shared with one or several L1 TLBs currently holding the entry (P = 0 and sharers > 0).

Figure 6.8: L2 TLB classification state diagram with SUP.

If the page translation is Not Present in the L2 TLB and a miss request is received, the page transitions to Private after "walking" the page table. Then, if a different L1 TLB misses while the page is in the Private state, a recovery mechanism is initiated and the page transitions to Shared. Note, though, that the classification of a TLB-based mechanism with a shared TLB structure might transition back to Private if a race condition occurs, as explained in Section 4.4.2. On the other hand, receiving an eviction or a disuse announcement for a Private page causes a transition to Present No Sharers. In this state, the L2 TLB acts as a victim TLB for the next missing L1 TLB. When that miss occurs, disused entries might have silently reaccessed the page translation, so classification coherence must be assured by means of a Reclassification request, as described in Section 6.3. If the reclassification ends up with no sharers, the miss is resolved to Private. Otherwise, the miss is resolved to Shared and the sharers count is updated. Further L1 TLB miss requests for a Shared page keep the page in the same state, just increasing the sharers count. Receiving evictions or disuse announcements for a Shared page decreases the sharers count. Finally, the page transitions to Present No Sharers only if the count reaches 0.

Figure 6.9: Average time (cycles) from the last access to a page in a core to its eviction from the TLB.

6.4 Approaching the Ideal Scheme

Section 6.2.1 analyzed the potential of a TLB-based temporal-aware mechanism by idealizing the live time of every page in every TLB. Now, we are interested in measuring the behavior of pages in the TLB in order to approach the classification to the idealized page generation time. To do so, rather than waiting for the TLB entry eviction, a predictor is employed, as explained in this chapter, to estimate when the last page access occurs and invalidate the translation. This early invalidation mechanism is based on a timer (i.e., the predictor period). The timer needs to be short enough to tightly approximate the ideal, but sufficiently long to avoid false evictions of pages that are going to be accessed again soon, which could severely damage system performance.

In this sense, Figure 6.9 represents the average page dead time (i.e., the time in cycles elapsed from the last access to the page TLB entry to its eviction). As this metric is extremely variable across the samples of the different applications considered, the data is displayed as a box-and-whiskers plot. Here the focus is put on the median, which on average is located close to 1 million cycles. This means that an early eviction performed 1 million cycles after the last access to a page will successfully evict all the larger samples, which on average represent half of the page lives. Yet this represents, at most, an upper limit for the timer used for early eviction. In some applications (e.g., Barnes or Swaptions) this value will act on barely 25% of the page lives, evidencing that lower values could perform better. In particular, the lower limit of a timer employed to early-evict TLB entries should never be below the inter-access time of those entries, to ensure that system performance is not harmed. Figure 6.10 shows the average number of cycles between accesses to the same page translation entry in a single TLB. The number of cycles before evicting the page should at least approximate the limit of the third quartile to avoid affecting the system negatively (with at least 75% of the samples below), meaning approximately 10,000 cycles on average, although in some applications the limit is close to 25,000 cycles, which turns out to be a safer lower value.

Figure 6.10: Average time (cycles) between TLB accesses.

Figure 6.11: Cycles spent as private during a global page live.

Lastly, as previously seen in Figure 6.2, there are intervals within a global page generation time during which the page is private, either to finally be evicted or to become shared again after some cycles. In these cases, a notification mechanism able to track and update the sharing status of the page in the TLB may provide some benefit if those time periods are sufficiently long. Figure 6.11 shows how the samples representing these intervals are distributed. Note that the global generation time of a shared page always starts as private, but this period is negligible and is not represented in the graph. The longer these intervals, the more efficient and justifiable a notification mechanism would be for discovering these private periods and updating the page status accordingly. The median, on average, lies around 200,000 cycles, although it shows high variability among the different applications.

We can therefore conclude that a reasonable predictor timeout to approximate the start of a page's dead time should never be lower than 10,000 cycles, up to a threshold of 1 million cycles. Since, as seen in Section 6.2, four predictor periods are required for a page to fall into disuse, we chose 2,000 cycles as the lowest period in the study (i.e., an 8,000-cycle timeout), followed by a 10,000-cycle period (i.e., a 40,000-cycle timeout), a 50,000-cycle period (i.e., a 200,000-cycle timeout), and finally 250,000 cycles as the highest period (i.e., a 1-million-cycle timeout). On the other hand, the high variability of the private intervals within page generation times could discourage the use of a notification mechanism that relies on broadcast announcements, as it could be detrimental to both system performance and network consumption.
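Written compactly, using T_period and T_disuse as illustrative notation for the predictor period and the derived disuse timeout under the four-period criterion of Section 6.2:

\[
T_{\mathrm{disuse}} = 4 \times T_{\mathrm{period}}, \qquad
T_{\mathrm{period}} \in \{2000,\ 10000,\ 50000,\ 250000\}\ \text{cycles}
\;\Rightarrow\;
T_{\mathrm{disuse}} \in \{8000,\ 40000,\ 200000,\ 1000000\}\ \text{cycles}.
\]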


Figure 6.12: Comparative analysis: UP on top of SnoopingTLB or TokenTLB. (a) Private page proportion; (b) total flits injected.

6.5 Experimental Results

Usage prediction for TLBs is to be implemented on top of a TLB-based classification scheme employing TLB-to-TLB transfers, and both SnoopingTLB and TokenTLB are possible choices. Four different predictor periods are employed, following the results in Section 6.4: 250,000, 50,000, 10,000, and 2,000 cycles. The Token Predictor Buffer (TPB, Section 5.2.4) is not employed in either case. System and network configuration are as described in Chapter 3. Firstly, we show the main implications of each of the two TLB-based classification alternatives and discuss their differences. Then, we study the influence of prediction techniques for TLBs, considering both the Base predictor and the Forced-Sharing predictor, on performance and classification accuracy. Next, a predictor for shared last-level TLBs (i.e., SUP) is analyzed and compared against UP in terms of prediction overheads and behavior when varying the core count.

6.5.1 UP: SnoopingTLB versus TokenTLB

Usage prediction is based upon the assumption that the classification mechanism consults the system TLBs for retrieving the sharing status (i.e., after a TLB miss). Therefore, both SnoopingTLB and TokenTLB are good candidates for working in conjunction with UP. Figure 6.12 shows the main differences between both options.

On the one hand, Figure 6.12a shows the proportion of private pages detected for both SnoopingTLB and TokenTLB. The figure also includes the Ideal shown in Figure 6.3. As discussed in Chapter 5, TokenTLB unblocks page access as soon as the miss is resolved (i.e., the translation is received from another TLB, alongside some tokens), even though there may be some tokens still in flight. This slightly lessens the amount of private pages detected compared to SnoopingTLB. Specifically, nearly 4% more pages are classified as private with SnoopingTLB for a predictor period of 2K cycles. In both cases, the amount of private pages detected increases as the predictor period decreases. Furthermore, SnoopingTLB slightly surpasses the Ideal (2% more private pages), which could be detrimental for system performance, as the mechanism is eagerly invalidating TLB entries and artificially increasing the private detection. Ideal is obtained by tightly eliminating the page dead time from every page's local generation time in the TLB and determining the private status of the pages over time. Thus, exceeding the private pages detected by Ideal evidences that some pages have been eagerly invalidated as a consequence of mispredictions.

On the other hand, Figure 6.12b shows the total network flits issued with UP working on top of both SnoopingTLB and TokenTLB, both normalized to SnoopingTLB without usage prediction. Since TokenTLB constrains the TLB response traffic and avoids translation replication on broadcast responses, traffic is restrained compared to SnoopingTLB, by up to 26.8% with a 250,000-cycle predictor period. However, since usage prediction entails the translation invalidation of disused entries, TokenTLB rapidly increases network traffic due to non-silent TLB evictions (see Section 5.2.3), to the point of surpassing the traffic issued by SnoopingTLB with a predictor period of 2,000 cycles (14.3% more traffic with TokenTLB, doubling the traffic compared to not using the usage prediction technique).


Figure 6.13: Normalized execution time when increasing core count.

Figure 6.14: Network flits issued when increasing core count.

Scalability analysis. Next, we show how the prediction approaches scale when applied to coherence deactivation. Due to the slow simulation speed, this study is performed only with SPLASH-2 benchmarks and scientific applications. Results are normalized to a baseline without coherence deactivation.

Figure 6.13 shows how prediction on top of TokenTLB scales better than on SnoopingTLB. Decreasing the predictor period on SnoopingTLB hurts execution time to a greater extent than on TokenTLB, nearly negating the benefits of deactivating coherence for a predictor period of 2,000 cycles and a 32-core system. Conversely, TokenTLB performs a more accurate classification, with read-only detection and full adaptivity, which reduces the overheads of prediction on TLBs, improving system performance by nearly 11% over the baseline for a 32-core system and a prediction period of 2,000 cycles.


Figure 6.15: Private-shared and read-only page classification. (a) Base usage predictor for TLBs; (b) forced-sharing usage predictor for TLBs.

Furthermore, TokenTLB reduces traffic compared to SnoopingTLB, especially when increasing the number of cores in the system. SnoopingTLB collects responses and replicates translations on TLB transfers after every TLB miss. Consequently, prediction-induced TLB invalidations dramatically increase traffic consumption as the predictor period decreases. Figure 6.14 shows how a predictor period of 2,000 cycles increases the total traffic issued by 49.7% with SnoopingTLB. The traffic increase is partially tackled with TokenTLB, where the traffic increases by 30% for the same predictor period.

Conclusion. Despite seemingly detecting more private pages in conjunction with SnoopingTLB, prediction over TokenTLB provides full adaptivity and read-only detection (see Chapter 5), ultimately benefiting execution time and traffic consumption when applied to data optimizations such as coherence deactivation. In what follows, the prediction-based classification analysis assumes TokenTLB as its base classification scheme.

6.5.2 Prediction Overheads and Forced-Sharing

This section analyzes how inaccurate TLB usage predictions entail an increase in the number of TLB misses, and thus a trade-off between the amount of private data detected and system performance. It also shows how forced-sharing allows us to alleviate the problem. UP is applied over a token-based page classification approach (i.e., TokenTLB).


Figure 6.16: Proportion of prediction-induced TLB misses. (a) Base usage predictor for TLBs; (b) forced-sharing usage predictor for TLBs.

Usage prediction increases the amount of private data detected by invalidating disused sharers. Figure 6.15 shows the pages considered as Private, Shared-ReadOnly, or Shared-Written with and without the prediction mechanism. Particularly, we show numbers for the idealized page live time introduced in Section 6.2.1 (Ideal), our base proposal without prediction (TokenTLB), and prediction-based classification (UP). If a page has been considered as shared at least once during the execution of the application, it is plotted as shared in the graph. First, employing Base (Figure 6.15a) increases private page detection from 66.4% to up to 77.4%, depending on the prediction period. Classification gets close to Ideal, which reports 79.2% of private pages. Conversely, when using the Forced-sharing approach (Figure 6.15b), the proportion of private pages detected decreases slightly, reaching up to 77% with the lowest predictor period. Note that thread migration is rare in the evaluated applications due to their short execution times.

Figure 6.16 shows the number of TLB misses normalized to TokenTLB (prior to applying usage prediction). As can be seen, lower predictor periods with the Base usage predictor (Figure 6.16a) lead to an increase in the number of TLB misses due to the invalidation of disused TLB entries, up to nearly 13 times more TLB misses on average with UP 2K, which definitely impacts system performance negatively. Conversely, when using the Forced-sharing modification (Figure 6.16b), this trend is much less harmful, halving the number of TLB misses compared to UP, but still generating 7.27 times more misses than TokenTLB on average with Forced-UP 2K. The negative impact of usage prediction is partly mitigated by the fast TLB miss resolution mechanism, since those extra TLB misses will probably find the address translation in the TLB of the core that caused the invalidation of the disused translation entry.


Figure 6.17: Base UP versus Forced-sharing UP. (a) Percentage of L1 misses due to the TLB-L1 inclusion policy; (b) total flits injected.

Figure 6.18: Average directory entries per cycle for Base-UP and Forced-UP.

Increasing the number of TLB misses entails an increase in L1 cache misses due to the TLB-L1 cache inclusion policy. Whenever the predictor invalidates a TLB entry during its dead time, few blocks of that page are likely to remain in the L1 cache, and those present are unlikely to be revisited. On premature invalidations, however, the flushed blocks cause L1 cache misses, negatively affecting system performance. Figure 6.17a shows that with UP nearly 24.7% of the L1 misses are due to the inclusion policy when employing a 2K predictor period, on average. Forced-UP partially reduces this increase; however, misses induced by the inclusion policy still represent 18.3% of all L1 misses with a 2K predictor period.

Furthermore, usage prediction also increases traffic. TLB transfers upon TLB misses and coherence requests issued after cache misses increase network consumption as the predictor becomes more aggressive, due to mispredicted invalidations. Figure 6.17b shows network flits issued normalized to TokenTLB without prediction (No-UP). Network traffic with UP reaches well over twice the traffic of No-UP with a 2K predictor period. Again, Forced-UP alleviates the traffic increase by avoiding some mispredicted invalidations, reducing traffic by 24% over UP, but still increasing it by 84% over No-UP with a 2K predictor period.

All reported results are obtained with coherence deactivation. The main goal of deactivating coherence is to alleviate directory pressure by avoiding the tracking of non-coherent blocks. Coherence deactivation is employed as a data optimization in order to observe the benefits of improving private detection. Figure 6.18 shows the average directory entries per cycle for UP and Forced-UP, normalized to a baseline without coherence deactivation. Directory usage is lowered as the predictor period decreases, down to just 17.6% directory occupancy with UP 2K, 18.6% fewer directory entries than with the No-UP classification. Forced-UP slightly lessens this directory usage reduction, by roughly 1% on average. Nonetheless, gains with prediction-based classification are notable with both approaches when deactivating coherence maintenance.


Figure 6.19: Execution time normalized to baseline. (a) Base usage predictor for TLBs; (b) forced-sharing usage predictor for TLBs.

Finally, Figure 6.19 shows the execution time of prediction-based classification approaches normalized to TokenTLB, a TLB-based classification approach without usage prediction. In some cases, especially with larger predictor periods (which induce fewer mispredicted invalidations), as in Raytrace-opt or MPGenc, the benefit obtained from increasing private detection when applied to coherence deactivation counterbalances the prediction overheads, and execution time is improved. On average, however, both UP and Forced-UP damage execution time to some extent compared to TokenTLB. Especially with lower prediction periods, the damage with the forced-sharing strategy is constrained, as ping-pong invalidations are avoided: whereas UP 2K degrades average execution time by 23.8%, Forced-UP 2K hurts it by only 13.8%. Furthermore, as seen in Chapter 5, most Coverage misses are already avoided with just TokenTLB; thus, although Figure 6.18 shows how UP reduces directory usage to a notably greater extent, the directory ceases to be a performance bottleneck. In scenarios with smaller directories, frequent thread migration, or larger TLBs, however, UP may introduce greater improvements in terms of execution time. Furthermore, alternative or additional private-based or read-only-based optimizations could be applied to benefit from the accuracy of a prediction-based approach.


Figure 6.20: Private/shared page classification with a distributed shared last-level TLB.

Figure 6.21: Total flits injected with SUP applied to coherence deactivation.

6.5.3 SUP: Prediction with Shared Last-level TLBs

This section evaluates a TLB-based classification for distributed shared L2 TLBs using the shared TLB usage predictor presented in Section 6.3. First, we show the effect of SUP on the classification characterization. Next, we apply SUP to coherence deactivation as our case study.

Classification accuracy. Figure 6.20 depicts how pages are classified into private and shared for different classification mechanisms in a scenario with a distributed shared L2 TLB. The evaluation compares DirectoryTLB, as detailed in Chapter 4, Section 4.4, and different predictor periods for our shared TLB usage predictor (SUP). DirectoryTLB does not discern read-only pages, but it performs an adaptive classification; Reclassified pages are depicted in the figure, representing the fraction of shared pages that are classified back to private at least once during the application's execution. On the one hand, DirectoryTLB classifies 59.4% of pages as Private and 14.6% as Reclassified, showing how precise its classification is at page level. On the other hand, SUP increases the proportion of Private pages up to 78.1%, reducing Reclassified pages to merely 2% of all accessed pages for a 2,000-cycle period.


Figure 6.22: Average directory entries stored per cycle with SUP.

Case Study: Coherence deactivation. Figure 6.21 shows the total normalized network usage and its breakdown into cache and TLB messages when the classification is applied to coherence deactivation. Base is a baseline system with the same overall configuration, including a distributed shared second-level TLB, but without data classification or coherence deactivation. Classification mechanisms for shared TLB structures reduce network traffic since they avoid the costly TLB-to-TLB broadcast transfers. For DirectoryTLB, the TLB traffic represents only 1.8% of the total. Furthermore, SUP prevents the TLB traffic increase attributed to lower predictor periods, since the shared L2 TLB acts as a filter for mispredicted invalidations. Even if SUP does prematurely invalidate a TLB entry, it only requires a message to the L2 TLB home tile. Consequently, SUP only increases TLB traffic to at most 2.3% of the total for the lowest considered timeout, while cache traffic is halved. As a result, SUP 2K reduces the total traffic issued by 15.7% compared to DirectoryTLB. All in all, SUP improves classification accuracy while significantly reducing traffic overhead, representing a far more scalable approach than either a predictor for purely-private TLB structures or DirectoryTLB without usage prediction.

Figure 6.22 shows the average number of required directory entries for Forced-UP and SUP, each normalized to a baseline with the same TLB structure (with private and shared last-level TLBs, respectively) and system configuration, but without coherence deactivation. Particularly, SUP reduces directory storage similarly to Forced-UP with a purely-private TLB structure. For lower predictor periods, however, directory storage requirements are reduced to 22.6% with SUP 2K, while Forced-UP further reduces directory usage to just 18.5%. Note that SUP relies on the shared L2 TLB as a filter for short-term reclassifications. Our proposal for shared TLB structures initiates a reclassification process only when the home L2 TLB tile is accessed and found in the Present No Sharers state (see Section 6.3.1). Even so, a reclassification process may still fail to transition to private again if a core reaccesses a disused page. Conversely, a TLB predictor for purely-private TLB structures is more dynamic and forceful, even with the forced-sharing strategy, which favors short-term reclassifications albeit possibly hurting system performance.

Finally, Figure 6.23 shows the execution time of different applications employing our TLB-based classification for shared TLB structures under coherence deactivation, normalized to a baseline without coherence deactivation. DirectoryTLB reduces execution time by 6.8% compared to the baseline. Note that the baseline already leverages inter-core sharing patterns with a shared last-level TLB, and therefore the performance gains are only those obtained from coherence deactivation. Additionally, SUP further contributes to better system performance, reducing execution time by up to 8.2%, even when using a 2,000-cycle predictor period. Improving execution time over DirectoryTLB proves how prediction overheads (e.g., mispredicted TLB invalidations) are almost completely avoided with SUP.


Figure 6.23: Execution time under coherence deactivation normalized to a shared TLB baseline with SUP.

6.6 Conclusions

This chapter evidences one of the main problems of TLB-based classification: the dependence of the classification accuracy on the size of the TLB structure. To avoid this drawback, we introduce a new family of prediction-based approaches, which allow TLB-based classification to be independent of TLB size by invalidating entries that are predicted not to be used in the near future, granting a more accurate private detection (closer to the ideal). However, mispredictions come with a cost, limiting the performance gains and discouraging the use of predictors. The forced-sharing strategy alleviates the problem, but fails to make prediction completely appealing.

Furthermore, we propose SUP, a predictor that avoids most mispredictions by leveraging a shared last-level TLB structure, which ultimately reduces traffic and improves execution time over UP, resulting in a more scalable choice. Thus, SUP makes prediction-based approaches attractive, showing the benefits obtained with a more precise private detection. However, SUP does not perform a fully-adaptive classification and does not discern read-only pages, which limits its ability to perform an accurate classification. Furthermore, SUP is limited to systems implementing a shared TLB level, which is not a common design choice nowadays.


Chapter 7

Cooperative TLB Page-Usage Prediction Mechanism

This chapter shows that it is possible to remove prediction overheads on purely-private TLB structures by using a novel and elegant token-based usage prediction scheme, while maintaining a high private data detection rate. To this end, we introduce Cooperative Usage Predictor (CUP), a mechanism that exploits TLB cooperation (i.e., TLBs willingly working together for a common purpose or benefit) in order to perform a system-wide page usage prediction.

7.1 Introduction

Usage prediction for TLBs (UP) is a promising extension for TLB-based classification, helping to achieve a more accurate classification and making characterization independent from TLB size. However, prediction always comes with a cost (as seen in Chapter 6).

The key observation is that invalidations caused by a usage predictor do not necessarily lead to higher detection of private pages, while still causing a major increase in the TLB miss rate. In fact, many TLB entry invalidations just reduce the number of sharers without any certainty of improving the amount of private data. Hence, UP is applied blindly, without considering how the classification status will evolve, despite being supposedly at its service. Moreover, although the forced-sharing UP strategy effectively reduces the predictor-induced damage to system performance, the solution is again somewhat dependent on TLB size. With smaller TLBs, invalidated TLB entries are evicted sooner and forced-sharing would not be triggered. With larger TLBs, many pages would be erroneously considered as prematurely invalidated, since translations remain longer in the TLBs, and prediction would be frequently ignored. Besides, further opportunities to reclassify as private may be lost if no other TLB misses occur during a favorable time period.

To illustrate this, Figure 7.1 shows the amount of TLB misses for different predictor periods, using the basic usage predictor for TLBs (Base-UP) and the modified forced-sharing predictor (Forced-UP), both normalized to a TLB-based classification mechanism without prediction. TLB misses are classified into 3C misses (compulsory, capacity, and conflict), prediction-induced (Predictor) misses, and Forced misses (i.e., predictor-induced TLB misses that trigger the forced-sharing strategy). As can be seen, Forced-UP avoids some prediction-induced TLB misses, especially using the lowest predictor period considered, for which 19% of all TLB misses trigger a Forced miss. Nevertheless, the number of TLB misses is still increased by more than 6.5 times due to Predictor misses, showing how inaccurate the TLB usage predictor is.


Figure 7.1: TLB misses classified by their cause.

These overheads discourage the use of prediction mechanisms for private TLB structures, despite improving the private data detection rate.

In this chapter we show that it is possible to remove prediction overheads by using a novel and elegant token-based usage prediction scheme for private TLB structures, while maintaining a high private data detection rate. To this end, we propose Cooperative Usage Predictor (CUP), a mechanism that exploits TLB cooperation (i.e., TLBs willingly working together for a common purpose or benefit) in order to perform a system-wide page usage prediction.

7.1.1 Potential Benefits

To understand the main benefits of CUP, Figure 7.2 depicts two examples of the global generation time of a given page in the TLBs of a 4-core CMP. Recall that the global generation time is defined as the time elapsed from the first allocation of the page in a TLB until the eviction of the page from the last TLB in the system. Every black line represents a page live time in a single TLB, and the grey dotted lines represent their disuse periods. The upper colored line shows how the page is classified by the corresponding usage predictor approach. Also, different time periods have been highlighted.

Figure 7.2a represents the behavior of a non-cooperative usage predictor approach (as seen in Chapter 6). C3's TLB misses in period 1 while the page is predicted not to be in use in C0 and C2. In response to the TLB miss request, the disused entries are invalidated, even though the page classification remains as shared, since C1 is still accessing the page. Consequently, C2 incurs an extra induced TLB miss afterwards (time instant 2), evidencing an inaccurate disuse prediction. Yet reclassification to private is only achieved in period 3, when a miss occurs in C0's TLB, by invalidating the disused entry in C3's TLB.

Figure 7.2b shows a similar example for our cooperative usage predictor. In this case, translations are invalidated only when a reclassification opportunity is revealed, thus avoiding the unnecessary invalidations and the extra TLB miss in period 1. Then, the cooperative usage predictor unveils an opportunity to reclassify as private in period 2 beforehand (i.e., without relying on a TLB miss), which succeeds. Finally, the disused entry is invalidated in period 3 to help the page remain private, just as with a non-cooperative predictor.


Figure 7.2: Prediction-based classification examples. (a) Non-cooperative usage predictor (UP); (b) cooperative usage predictor (CUP).

7.2 Cooperative Usage Predictor (CUP)

Cooperative Usage Predictor (CUP) is a mechanism designed to work in conjunction with a TLB-based classification approach. The twofold goal of our mechanism is to improve prediction accuracy while avoiding pointless TLB invalidations. To do so, TLB entries gather system-wide page usage information from other cores. This is accomplished through TLB cooperation, by sending notifications as soon as a given page is predicted not to be in use in the near future. Then, only if a shared page is hinted to be currently in use in a single core, disused TLB entries are invalidated through a reclassification process. Nonetheless, system-wide translation invalidation is only performed when it can positively contribute to the classification characterization.

7.2.1 Double Token Set: Implementation Details

We implement our Cooperative Usage Predictor on top of a token-counting TLB-based page classification mechanism (see Chapter 5). CUP can then be easily implemented by just adding an extra set of tokens associated with every page. This way, CUP employs two sets of tokens with different interpretations. First, a set of tokens, namely classification tokens (or tokensC), performs the page classification based on the token count. Similarly, a second set of tokens, called usage tokens (or tokensU), offers a hint about current system-wide page usage in order to reveal reclassification opportunities. In other words, CUP associates each page with a (#tokensC, #tokensU) pair, with N tokens per set for an N-core system.


Figure 7.3: TLB entry format for the cooperative usage predictor (extra fields in grey).

Figure 7.3 shows the extra fields that CUP requires in each TLB entry in darker grey. Specifically, our cooperative usage predictor requires an extra ⌈log2(N + 1)⌉-bit field to track from 0 to N usage tokens and keep a system-wide record of the page sharers' usage, and an extra 2-bit counter field in order to determine whether a page is currently being accessed or not (i.e., whether the field is saturated) in any given TLB. The counter field is also required by the predictors seen in the previous chapter (Section 6.2); it is increased after a fixed period and reset after every access to the corresponding entry. Therefore, a page falls into disuse after four predictor periods elapse without it being accessed. All in all, our double token set strategy requires 3 + 2 × ⌈log2(N + 1)⌉ total bits. Since the TLB entry data field often contains some unused bits [12] that are reserved, the hardware overhead may be avoided by taking advantage of them. In any case, the hardware overhead represents only 13 extra bits per entry for a 16-core CMP, or 25 bits for a 1024-core CMP. As a consequence, the area overhead for our strategy represents ∼15% or ∼25% of the L2 TLB area for 16- and 1024-core CMPs, respectively, according to CACTI.
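As a back-of-the-envelope check of these figures, the sketch below evaluates the overhead formula; only the formula itself is taken from the text, while the field names and the interpretation of the constant 3 (2-bit counter plus one additional status bit) are assumptions.

#include <cmath>
#include <iostream>

// Illustrative CUP per-entry additions: the translation fields stay unchanged,
// and the extra state is the two token counts plus the 2-bit usage counter.
struct CupTlbEntryState {
    unsigned tokensC;     // classification tokens, 0..N
    unsigned tokensU;     // usage tokens, 0..N
    unsigned useCounter;  // 2-bit saturating counter driven by the predictor period
};

// Extra bits per entry: 3 + 2 * ceil(log2(N + 1)), as stated in the text.
int extraBitsPerEntry(int numCores) {
    int tokenFieldBits = static_cast<int>(std::ceil(std::log2(numCores + 1)));
    return 3 + 2 * tokenFieldBits;
}

int main() {
    std::cout << extraBitsPerEntry(16)   << " bits\n";   // 13 bits for a 16-core CMP
    std::cout << extraBitsPerEntry(1024) << " bits\n";   // 25 bits for a 1024-core CMP
    return 0;
}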

CUP basic semantics

The presence of tokensU stored in a TLB entry signifies that the core is a potential page sharer. When a translation entry falls into disuse, its tokensU are transferred to some core that is currently accessing the related page. Nonetheless, a memory request can continue to freely access any valid page translation entry (i.e., one with at least 1 tokenC) even without tokensU. Tokens are exchanged (and tokensU may be reacquired) through TLB miss responses or non-silent evictions. As in the single token set classification strategy (see Chapter 5), in CUP only TLBs holding 2 or more classification tokens are allowed to answer a TLB request (i.e., token owners). Usage tokens are not required for the reply. However, if a page-owning TLB is also holding 2 or more tokensU, it responds with a (1,1) pair (one tokenC and one tokenU) to the requesting TLB alongside the page translation, and keeps the remaining tokens.

Furthermore, upon the reception of non-silent evictions, the token acquisition strategy is straightforwardly adopted for both token sets. Thus, a given TLB must be holding a valid page translation entry in order to store the tokens from an evicting TLB. Additionally, when the eviction message also contains any number of tokensU, the TLB entry has to be currently in use (i.e., either holding at least 1 tokenU or with an unsaturated prediction counter) in order to receive the incoming tokens.

Consequently, when a given TLB recovers all its tokensU, it is very likely the only core currently accessing the page. Thus, if the TLB entry is not in possession of all the page's tokensC (i.e., the page is currently shared), a reclassification process starts to try to acquire the remaining tokensC so the page may transition to private.
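The token bookkeeping on a TLB miss response and the condition that triggers a reclassification attempt can be sketched as follows; this is a simplified model under the rules just described, and the struct, function, and message names are illustrative rather than the thesis implementation.

// Illustrative per-page TLB state under CUP (translation fields omitted).
struct CupEntry {
    int  tokensC = 0;    // classification tokens held (0..N)
    int  tokensU = 0;    // usage tokens held (0..N)
    bool inUse   = true; // 2-bit predictor counter not saturated
};

struct TokenPair { int tokensC; int tokensU; };

// Only token owners (2 or more classification tokens) answer a TLB request.
bool canReply(const CupEntry& e) { return e.tokensC >= 2; }

// Tokens handed to the requester alongside the translation: one tokenC always,
// plus one tokenU only if the owner holds two or more usage tokens.
TokenPair replyTokens(CupEntry& e) {
    TokenPair sent{1, (e.tokensU >= 2) ? 1 : 0};
    e.tokensC -= sent.tokensC;
    e.tokensU -= sent.tokensU;
    return sent;
}

// A cruise-missile reclassification starts when this TLB holds every usage
// token (it appears to be the only current user) but the page is still shared.
bool shouldStartReclassification(const CupEntry& e, int N) {
    return e.tokensU == N && e.tokensC < N;
}

int main() {
    const int N = 16;
    CupEntry owner;
    owner.tokensC = 10; owner.tokensU = N; owner.inUse = true;
    if (canReply(owner)) replyTokens(owner);          // owner now holds (9, 15)
    return shouldStartReclassification(owner, N) ? 1 : 0;  // 0: a usage token was shared away
}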


Time-based obsolescence for TLB entries

CUP averts immediate invalidation of disused translations, since TLB misses are no longer considered reclassification opportunities. In other words, CUP shelves translation invalidation until the TLBs hint at a reclassification opportunity. Hence, TLB cooperation is carried out by announcing the disuse condition right after a TLB entry's counter field saturates.

Principle #1: keeping the page as private as long as possible. While a page is private, only one TLB holds its translation, and thus the disuse condition does not need to be revealed. A private TLB entry holds all (N,N) tokens, and the disuse condition is discovered upon the reception of a network TLB request, just as in non-cooperative usage prediction. When a TLB miss request is received and the local translation entry is not in use, all (N,N) tokens are sent to the requester alongside the page translation, invalidating the responder's TLB entry and favoring the page remaining private.

Note that the forced-sharing strategy is orthogonal to a cooperative TLB scheme, and can be easily adopted by it. Accordingly, if a forced-sharing request is received and the page is private, prediction is overridden (i.e., it does not take into account the counter field) and a (1,1) token pair is sent back to the requester, promoting the page to shared.

Principle #2: cooperating to detect shared-to-private opportunities precisely. Whenever the predictor counter field of a shared page (i.e., #tokensC < N) saturates, the associated TLB entry preserves only a single classification token, giving a total of (#tokensC − 1, #tokensU) tokens away; that is, only one tokenC is kept. Tokens are never destroyed, but optimistically sent to the network looking for a new holder. The process of giving usage tokens away is called a disuse announcement, and it follows the token path depicted in Figure 5.4.

A given TLB must be holding a valid, in-use page translation entry (i.e., with at least one tokenC and the 2-bit predictor counter not saturated) in order to acquire the tokens contained in disuse announcements. Unlike for eviction messages, acknowledging the token acquisition from a disuse announcement is not required, since the classification is not immediately updated after the announcement message is sent. That is, a valid page that does not hold any tokensU is not blocked and can continue being accessed. Naturally, as tokens are given away blindly looking for a new holder, a disuse announcement may traverse the network back to its initiator if the page entries in all other TLBs fall into disuse at the same time. In that situation, the source TLB retrieves its own tokens from the network regardless of the usage status of the page entry. The disuse announcement is then turned off, preserving all remaining tokens. Consequently, a shared, disused entry may have stored some tokensU that failed to find a new holder. Thus, after receiving the next TLB miss request, the TLB simply responds by sending (#tokensC − 1, #tokensU) tokens to the requesting TLB.
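A sketch of the token bookkeeping behind a disuse announcement follows; the virtual-ring forwarding itself is abstracted away, and the struct and function names are illustrative assumptions rather than the actual implementation.

// Illustrative per-page TLB state (same shape as in the previous sketch).
struct Entry {
    int  tokensC = 0;
    int  tokensU = 0;
    bool inUse   = false;   // predictor counter not saturated
};

struct Announcement { int tokensC; int tokensU; };

// The counter of a shared entry saturates: keep a single classification token
// and optimistically send the rest around the virtual ring.
Announcement announceDisuse(Entry& e) {
    Announcement a{e.tokensC - 1, e.tokensU};
    e.tokensC = 1;
    e.tokensU = 0;
    return a;
}

// A hop of the announcement: only a valid, currently in-use entry (at least one
// tokenC, counter not saturated) may absorb the tokens; no acknowledgment is needed.
bool tryAbsorb(Entry& e, const Announcement& a) {
    if (e.tokensC >= 1 && e.inUse) {
        e.tokensC += a.tokensC;
        e.tokensU += a.tokensU;
        return true;
    }
    return false;   // forward to the next TLB in the ring
}

// If the announcement comes back to its source, the tokens are simply retrieved,
// regardless of whether the local entry is still in use, and announcing stops.
void retrieveOwnTokens(Entry& e, const Announcement& a) {
    e.tokensC += a.tokensC;
    e.tokensU += a.tokensU;
}

int main() {
    Entry source;  source.tokensC = 5; source.tokensU = 3; source.inUse = false;
    Entry hop;     hop.tokensC = 1;    hop.tokensU = 0;    hop.inUse = true;
    Announcement a = announceDisuse(source);       // source keeps (1,0); a carries (4,3)
    if (!tryAbsorb(hop, a)) retrieveOwnTokens(source, a);
    return 0;
}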

Cruise-missile reclassification

Eventually, a given TLB entry for a shared page will collect all N tokensU, suggesting that it is the only TLB currently accessing the page while other TLBs still retain some tokensC. As a consequence, a reclassification process is initiated in order to attempt to retrieve all tokensC, which would entail the page transition to private. To do so, we employ an exploration technique, namely cruise-missile reclassification (CMR), which shares some similarities with cruise-missile-invalidates [14]. The cruise-missile reclassification process sends a single message across the virtual ring (the same path followed by evictions and disuse announcements). The CMR message carries a (#tokensC, #tokensU) pair, namely the token pool, which is initialized to (0, N − #tokensC). The first component stores the tokensC as they are being recovered, whereas the second represents the maximum number of tokensC to be recovered, which remain spread across other TLBs. Usage tokens in the CMR token pool also allow a reaccessed translation entry to reacquire tokensU from the CMR message. A CMR message traverses the virtual ring until the tokensC are fully recovered and the translation entries in other cores are invalidated.


Figure 7.4: Different outcomes for cruise-missile reclassifications initiated by the TLB in core 5. The grey dashed arrows depict the route followed by CMR messages across the virtual ring. (a) Case #1: cores 10 and 11 are in disuse when receiving the reclassification; the page is reclassified to private. (b) Case #2: core 10 is being used although it gave its tokensU away; reclassification is interrupted. (c) Case #3: a TLB misses while a reclassification tryout is ongoing; reclassification is interrupted.

It is important to highlight that memory accesses are not blocked in the initiator core for a page involved in a reclassification process. The classification evolves naturally when a CMR message returns to the initiator. This way, the negative impact on system performance of our cruise-missile strategy due to the path length is avoided. CMR simply delays reclassification, causing some temporarily misclassified shared accesses to otherwise possibly private data.

Lastly, reclassification rarely needs to completely traverse the ring, and some exploration optimizations may be applied, resulting in three possible outcomes, depicted in Figure 7.4 for a 16-core CMP:

Case #1: successful reclassification. On every hop of a CMR message, if a page translation is present, valid, and disused, the translation entry is invalidated, the blocks for that page in the L1 cache are flushed, and all of its tokensC are stored in the CMR token pool. Then, as long as #tokensC < #tokensU in the token pool, the CMR is forwarded to the next TLB in the virtual ring. Conversely, when #tokensC = #tokensU, the reclassification process succeeds (i.e., reclassification recovered all missing tokensC) and the tokens are straightforwardly sent back to the initiator.


In Figure 7.4a, C5's initial page sharing status is (14,16) tokens. Thus, a reclassification starts with (0,2) tokens in the CMR message (C5's TLB keeps (14,14) tokens). The CMR message misses on C9 and is forwarded to the next node in the virtual ring (transitions a and b). Next, the TLB in C10 invalidates its disused page translation, storing its single remaining tokenC in the CMR token pool. Then, the CMR is forwarded to the next node (transition c) with (1,2) tokens. The TLB entry in C11 is also disused, and the last tokenC is stored in the CMR. Finally, since the token pool contains the same amount of classification and usage tokens (i.e., (2,2) tokens), the CMR is sent back to the initiator (i.e., C5, in transition d). The result is that the page is successfully reclassified to private with (16,16) tokens.

Case #2: page restoration from the CMR token pool. TokensU in the CMR token pool allow translation restoration (i.e., the TLB counts again as a page sharer). In this sense, when a TLB entry receives a CMR message and has a recently-accessed, valid page translation, it reclaims as many tokensU from the token pool as tokensC it is currently holding, which restores the entry and aborts reclassification. The CMR is sent straightforwardly back to the initiator, since reclassification cannot be fulfilled.

Initially, C5's TLB entry holds (11,16) tokens in Figure 7.4b. Hence, the CMR token pool contains (0,5) tokens at the beginning of the reclassification. Since C9's TLB entry remains disused, its translation is invalidated, and its single classification token is stored in the token pool. Next, (1,5) tokens are forwarded within the CMR message to the next TLB in the virtual ring (transition b). However, the TLB entry in C10 contains 2 tokensC and has been recently accessed. Therefore, the translation is restored by taking 2 tokensU from the CMR token pool. The CMR is sent back to the initiator with (1,3) tokens (transition c). Finally, the TLB in C5 annotates the tokens received, collecting a total of (12,14) tokens. As a consequence, the reclassification to private fails. Note how the TLBs in other cores (e.g., 11 or 13) are not consulted by the CMR message although they may be in disuse. Restoring the entry in C10 disallows reclassification, allowing other cores to retain their tokens.

Case #3: racing TLB miss during a reclassification attempt. Finally, reclassification to private is canceled when a CMR message bumps into a TLB which either has recently obtained tokens (e.g., after resolving a TLB miss for the same page) or is currently involved in a TLB miss. Since a reclassification starts with all N tokensU stored in the initiator TLB entry, the presence of tokensU in a TLB along the route of a CMR message implies that tokens were recently obtained from the initiator TLB, and thus the missing tokensC will not be fully recoverable. As a special case, if a CMR message traverses a TLB which afterwards suffers a miss for the same page, the reclassification might return to the initiator TLB either as in case #1 or as in case #2. Nevertheless, reclassification still fails, since some tokensC were delivered to the TLB that missed.

Figure 7.4c shows how the TLB in C11 misses for a page that is involved in a reclassification process. The broadcast TLB request message (dashed arrows) reaches the page-owning TLB in C5, which kept (14,14) tokens after initiating the CMR. In response, C5's TLB sends (1,1) tokens alongside the page translation to the TLB in C11 (transition c) to resolve the TLB miss. Then, after having traversed C9 and C10, the CMR message reaches C11 in transition d, which is currently holding (1,1) tokens. Hence, the reclassification is canceled and the CMR is sent back to the initiator with (1,2) tokens (the tokenC was recovered from C10). The reclassification fails with (14,15) tokens in C5's TLB.
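The three outcomes can be condensed into a single per-hop handler, sketched below under the rules above; the message plumbing, the racing-miss detection of case #3, and the cache flush are abstracted away, and all names are illustrative.

#include <algorithm>

// Illustrative per-page TLB state at a hop of the virtual ring.
struct Entry {
    bool valid   = false;
    bool inUse   = false;   // predictor counter not saturated
    int  tokensC = 0;
    int  tokensU = 0;
};

// Token pool carried by the CMR message, initialized to (0, N - initiator tokensC).
struct Pool { int tokensC; int tokensU; };

enum class Hop { Forward, ReturnSuccess, ReturnAborted };

Hop cmrHop(Entry& e, Pool& pool) {
    if (e.valid && e.tokensU > 0)                  // case #3: tokens recently obtained
        return Hop::ReturnAborted;                 //   (e.g., a racing TLB miss)

    if (e.valid && e.inUse) {                      // case #2: restore from the pool
        int restored = std::min(e.tokensC, pool.tokensU);
        e.tokensU    += restored;
        pool.tokensU -= restored;
        return Hop::ReturnAborted;
    }

    if (e.valid) {                                 // case #1: disused entry
        pool.tokensC += e.tokensC;                 // invalidate it and collect its tokensC
        e = Entry{};                               // (its L1 blocks are also flushed)
        if (pool.tokensC == pool.tokensU)
            return Hop::ReturnSuccess;             // every missing tokenC recovered
    }
    return Hop::Forward;                           // keep traversing the virtual ring
}

int main() {
    // Figure 7.4a: initiator keeps (14,14); pool starts at (0,2); two disused hops.
    Pool pool{0, 2};
    Entry c10; c10.valid = true; c10.tokensC = 1;  // disused
    Entry c11; c11.valid = true; c11.tokensC = 1;  // disused
    cmrHop(c10, pool);                             // Forward, pool becomes (1,2)
    return cmrHop(c11, pool) == Hop::ReturnSuccess ? 0 : 1;   // pool (2,2): success
}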

Cruise-missile against broadcast reclassification. To sum up the main properties of our cruise-missile reclassification, we compare it against an alternative reclassification solution based on broadcasts. First, CMR is far more efficient in terms of traffic, as it avoids the requirement for multiple response messages while allowing many cases where the exploration ends early. Second, contrary to a broadcast solution, the early conclusion of CMR avoids system-wide invalidation of disused pages after every reclassification tryout. Finally, even though broadcast reclassification may parallelize invalidations and responses, allowing a faster transition to private, cruise-missile reclassification does not incur additional damage to system performance, as accessing the memory hierarchy is permitted during the whole process.


Figure 7.5: TLB size and usage prediction analysis. (a) Private time proportion; (b) TLB miss rate.

7.3 Experimental Results

In this section we compare CUP to the usage predictor (UP) with different prediction periods, ranging from 250K to 2K cycles. The system configuration is as detailed in Chapter 3, with private two-level TLBs. First, we present some detailed results to analyze how CUP operates. As an example of the potential benefits of TLB cooperation, we then show how CUP performs compared to forced-sharing UP when applied to coherence deactivation.

7.3.1 CUP Impact on TLB Classification

TLB size influence. The main purpose of employing a usage predictor is to make the classification accuracy of TLB-based classification mechanisms unaffected by TLB size. To illustrate this, Figure 7.5 shows how classification varies for TLB-based mechanisms with different TLB sizes (associativity is kept constant).

On the one hand, the average percentage of cycles per page spent as private is shown in Figure 7.5a. As expected, No-UP gradually diminishes the private time proportion per page as the TLB size increases. Similarly, although usage prediction increases the time spent as private, Forced-UP is still affected by TLB size variations, progressively losing accuracy with larger TLBs. Conversely, CUP remains unaffected, keeping pages 22% more time as private on average with CUP 250K, and up to 36.8% more with CUP 2K, both compared to No-UP with the largest TLB analyzed. Also, CUP 250K spends 8.1% more cycles as private compared to UP 250K.

On the other hand, Figure 7.5b shows how the TLB miss rate evolves for different TLB sizes. As can be expected, applying usage prediction techniques comes with a cost, since it entails increasing the TLB miss rate as the predictor period is reduced. Moreover, the larger the TLB size, the lower the TLB miss rate. Surprisingly, the TLB miss rate drops faster with Forced-UP, since many positive-feedback TLB invalidations are naturally and gradually prevented just by increasing the TLB size. Unfortunately, reducing the TLB miss rate potentially reduces private page detection for Forced-UP. In contrast, this classification accuracy loss can be prevented through TLB cooperation, as shown in Figure 7.5a, since reclassification opportunities are unveiled without relying on TLB misses.


Figure 7.6: Misprediction overhead analysis: UP against CUP. (a) Predictor-induced TLB invalidations normalized to Base-UP 2K; (b) number of TLB misses normalized to non-prediction; (c) TLB network flits injected normalized to non-prediction.

Prediction overheads analysis. CUP's aspiration, and one of its main contributions, is to smartly invalidate TLB entries only when a page may effectively transition to private, dramatically reducing the amount of predictor-induced invalidations and their consequences (e.g., mispredicted invalidations), as seen in Figure 7.6a (normalized to Base-UP with a 2K period). Forced-UP lessens the amount of predictor-induced invalidations by detecting positive-feedback situations, achieving up to a 22% reduction for the 2K predictor period compared to Base-UP. Unlike previous predictors, CUP nearly flattens the proportion of extra invalidations for all configurations. However, the most aggressive predictor period considered in the study still induces some additional invalidations, trading a little prediction accuracy for slightly better classification accuracy. Yet 4 out of 5 invalidations are still avoided compared to Base-UP.

Therefore, eliminating most mispredicted invalidations in the TLB implies reducing the amount of predictor-induced TLB misses, as shown in Figure 7.6b. TLB misses in the figure are normalized to a non-predicting token-counting classification. Specifically, CUP halves the amount of TLB misses compared to Forced-UP, particularly for smaller predictor periods.

Finally, the network TLB traffic issued to support prediction-based classification is dramatically increased as a consequence of predictor-induced TLB misses, since the mechanism tries to obtain tokens through network exploration. However, Figure 7.6c shows how the total TLB traffic increase is tackled by TLB cooperation, halving it compared to Forced-UP, and amply offsetting the extra TLB traffic issued by CMR messages.


Figure 7.7: CMR tryouts that successfully transition the page to private.

Figure 7.8: Average number of steps for CMR messages.

Cruise-Missile Reclassification analysis. System-wide usage prediction on CUP should be as accurate as possible to avoid premature invalidations. In this sense, Figure 7.7 shows the proportion of reclassification tryouts that successfully end up transitioning to private. Specifically, for any predictor period in the evaluation, the success rate for page reclassification to private through CMR messages is greater than 85%, proving the accuracy of our cooperative predictor.

A cruise-missile message progresses node by node through the network until the reclassification either succeeds or is aborted. Hence, CMR messages are not required to visit all system nodes in the common case. Figure 7.8 shows the average number of hops that a CMR message has to perform to finish a reclassification process. Regardless of the predictor period employed, a CMR message requires fewer than 8 hops on average in order to conclude a reclassification process. Thus, just half the maximum number of hops are taken for the 16-core CMP system considered in the evaluation. In the case of a 250,000-cycle predictor period, a CMR tryout is completed after just 5.7 hops on average.

Reclassification frequency is indicative of prediction aggressiveness. Unlike in a non-cooperative predictor, where each and every TLB miss is seen as an opportunity for reclassification, CUP smartly invalidates TLB entries by sending a CMR message only when a reclassification opportunity arises. Therefore, for comparison purposes, Figure 7.9 depicts the number of CMR tryouts per TLB miss of a forced-sharing UP. In particular, the CUP reclassification rates for the 250K, 50K, 10K, and 2K predictor periods are 0.03, 0.045, 0.065, and 0.09 tryouts per TLB miss, respectively.


Figure 7.9: CMR messages issued per TLB miss.

Conclusion. CUP improves the time spent as private per page by 22% over a non-predicting classification, effectively making the classification characterization independent of TLB size. Also, CUP constrains TLB traffic to a third compared to a non-cooperative predictor, as it avoids 4 out of 5 aimlessly induced TLB invalidations. Efficient CMR messages are sent exclusively when a reclassification opportunity is detected, with an over-85% success rate in private reclassification and fewer than 8 nodes visited per tryout.

7.3.2 Coherence Deactivation

We chose coherence deactivation as our case study to test the benefits of the proposed data classification scheme.

Coherence deactivation avoids some L1 cache Coverage misses induced by directory evictions. However, a TLB-based classification mechanism induces some extra cache misses as a consequence of conflicting TLB entries and the TLB-cache inclusion policy. Additionally, employing a usage predictor for TLBs may increase the amount of blocks forcefully flushed from the cache due to TLB invalidations and recoveries, and ultimately negate its potential benefits. Figure 7.10 depicts the average proportion of L1 cache misses caused by the TLB-cache inclusion policy. Specifically, it shows how CUP induces only up to an extra 5% of L1 cache misses for the smallest predictor period, whereas Forced-UP reaches up to 18.4% extra misses for the same period value, due to frequent, brute-force, system-wide invalidations (and consequent cache flushes) after every TLB miss.

Figure 7.10: Proportion of L1 cache flushing misses.

Figure 7.11 depicts the execution time of token-counting classification mechanisms, with and without usage prediction, all normalized to a baseline without coherence deactivation or TLB-to-TLB transfers. Note that the 10K and 2K predictor periods are not displayed for the sake of clarity: although they reduce directory usage to a greater extent, TLB and cache miss rates increase by 3.6% and 2.0%, respectively, so their application may be discouraged. Deactivating coherence improves execution time by 20% with No-UP. The usage predictors introduced in the previous chapters (e.g., Forced-UP) do not manage to overcome their own overheads with the benefits obtained from a better classification when applied to coherence deactivation. In fact, blindly invalidating translation entries gradually damages system performance compared to No-UP, by 3.1% and 4.7% for Forced-UP 250K and Forced-UP 50K, respectively. On the contrary, CUP squeezes coherence deactivation to its maximum while mostly eliminating prediction overheads, smartly invalidating translation entries through precise system-wide predictions. Altogether, CUP positively impacts execution time, not only avoiding performance penalization but improving execution time with CUP 250K by 5.8% compared to No-UP, 8.8% over Forced-UP 250K, and up to 25.8% over the baseline.


Figure 7.11: Execution time normalized to baseline without coherence deactivation.

Figure 7.12: Normalized dynamic energy consumption in the cache hierarchy.

Furthermore, improving the classification characterization and reducing prediction overheads also has a direct impact on the energy consumption of the cache hierarchy. Figure 7.12 shows how consumption is reduced by 16.9% with No-UP compared to Base, since coherence deactivation helps reduce the L1 cache miss rate. Nonetheless, Forced-UP increases consumption over No-UP, by 3.3% and 4.8% for the 250K and 50K predictor periods, respectively, due to the burden of the extra induced TLB and cache misses. Conversely, CUP 250K reduces dynamic consumption by 4.7% over No-UP and 8.0% with respect to Forced-UP, which demonstrates the effectiveness of the proposed cooperative prediction mechanism.


7.4 Discussion

This section discusses some design choices and architectural alternatives, and how they may affect our prediction-based classification approach.

7.4.1 Token Dependence

Although this chapter proposes a double token set strategy to achieve cooperative TLBs, it is just one design choice. The key observation is that a TLB entry invalidation must serve to improve the classification; otherwise, it must be averted to prevent performance degradation.

Thus, our proposed double token set implementation is just one way of realizing this key observation for TLB cooperation. Usage tokens emulate a distributed system-wide count of potential sharers, which could instead be centralized, using either a directory-like structure or a shared last-level TLB structure (e.g., SUP). Also, CUP may operate on top of any TLB-based classification approach, avoiding the need for classification tokens (e.g., SnoopingTLB). However, the analysis in Chapter 6 showed that TokenTLB grants full adaptivity, representing a more accurate and scalable classification solution. In conclusion, the double token set strategy offers an elegant general basis for an accurate and efficient classification scheme valid for any TLB organization, size, and/or associativity.
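
As an illustration only, the following C++ sketch models the double token set kept per TLB entry. The field names, the 16-core count, and the helper functions are our own simplifications, not the exact hardware encoding: classification tokens determine the private/shared status, usage tokens approximate how many TLBs are actively using the page, and a disuse announcement releases only the latter.

    #include <cstdint>

    constexpr uint32_t kCores = 16;    // assumed: one token of each set per core

    // Hypothetical per-entry state; the real hardware layout may differ.
    struct TlbEntry {
        uint64_t vpn = 0;              // virtual page number
        uint64_t ppn = 0;              // physical page number
        uint32_t class_tokens = 0;     // classification tokens held here
        uint32_t usage_tokens = 0;     // usage tokens held here
        bool     valid = false;
    };

    // A page is private for this TLB when it holds every classification token.
    inline bool is_private(const TlbEntry& e) {
        return e.valid && e.class_tokens == kCores;
    }

    // A disuse announcement releases the usage tokens but keeps the
    // classification tokens, so this entry no longer blocks a private
    // reclassification attempt by another core.
    inline void announce_disuse(TlbEntry& e, uint32_t& released_usage_tokens) {
        released_usage_tokens += e.usage_tokens;
        e.usage_tokens = 0;
    }
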

7.4.2 Virtual Caches

As discussed in Chapter 4, TLB-based classification is fully applicable to virtually indexed, virtually tagged (VIVT) caches (also known as virtual caches). However, since memory accesses may hit in the cache without accessing the TLB, usage prediction might not be directly applicable (neither UP nor CUP). The problem could be solved by moving the predictor to the cache (as in Cache Decay [40]) and periodically probing the presence of blocks from the TLB. Alternatively, the number of cached blocks per page can be tracked in the TLB, with an explicit notification sent after a block is cached or falls into disuse. When all the blocks of a page have been evicted from the cache or have fallen into disuse, the TLB entry can be invalidated.
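
A minimal sketch of the second alternative follows, assuming a simple software model of the bookkeeping (class and method names are hypothetical): the TLB tracks, per page, how many blocks of that page are resident in the virtual cache, and the translation entry becomes a candidate for invalidation once the count reaches zero.

    #include <cstdint>
    #include <unordered_map>

    // Per-page residency bookkeeping for a VIVT cache, kept alongside the TLB.
    class PageResidencyTracker {
    public:
        // Called when a block of page 'vpn' is filled into the cache.
        void on_block_fill(uint64_t vpn) { ++blocks_in_cache_[vpn]; }

        // Called when a block of page 'vpn' is evicted or falls into disuse.
        // Returns true when no block of the page remains in use, i.e., the
        // TLB entry for 'vpn' can be invalidated.
        bool on_block_release(uint64_t vpn) {
            auto it = blocks_in_cache_.find(vpn);
            if (it == blocks_in_cache_.end()) return true;
            if (--it->second == 0) {
                blocks_in_cache_.erase(it);
                return true;
            }
            return false;
        }

    private:
        std::unordered_map<uint64_t, uint32_t> blocks_in_cache_;
    };
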

7.5 Conclusions

In this chapter we have proposed the Cooperative Usage Predictor (CUP), a prediction mechanism designed to reduce the generation time of page translations in TLBs according to their usage, in order to provide a more accurate private classification. CUP is completely independent of the TLB size, eagerly unveils private pages, and invalidates TLB entries only when an opportunity for reclassification to private is detected. To do so, we introduce a double token set strategy: one set is used for page classification and the other represents the page usage throughout the system. Through disuse announcements, where usage tokens are released, the opportunity for reclassification is promptly discovered. Then, our proposed cruise-missile approach allows an efficient reclassification to private at a minimum cost in terms of performance and traffic. As a consequence, CUP enables optimizations for private accesses with minimum overhead. All in all, unlike the approaches in Chapter 6, CUP manages to make usage prediction appealing, promoting classification accuracy while effectively improving system performance for the TLB configurations employed in commodity systems.


Chapter 8

Conclusions

In this chapter we summarize the contributions and conclusions of the work carried out, the publications and the future work.

8.1 Contributions and Conclusions

Classifying memory accesses into private or shared data has become a fundamental approach to achieve efficiency and scalability in multi- and many-core systems. Since most memory accesses in both sequential and parallel applications are either private (accessed only by one core) or read-only (not written) data, devoting the full cost of coherence to every memory access results in sub-optimal performance and limits the scalability and efficiency of the multiprocessor.

Ideally, the classification performed by any mechanism should be as accurate as possible, with low overhead in terms of traffic, performance, and area. However, data classification usually requires a large amount of extra storage in order to track the data sharing status. Furthermore, some mechanisms cannot dynamically reclassify data from shared to private, thus missing the detection of temporarily-private data and limiting the potential performance benefits of data classification. Finally, other classification schemes are not applicable to all data optimizations or applications, imposing strict requirements that must be fulfilled for them to be a suitable choice.

The aim of this dissertation has been the design of a novel family of TLB-based classification approaches able to satisfy all the desirable properties for data classification, achieving an adaptive classification that accounts for temporarily-private pages and tolerates thread migration, with low storage requirements and fairly good scalability. A TLB-based classification mechanism is based on querying the other TLBs about the presence of the page translation on every TLB miss. This way, a TLB knows whether the page is shared or not when the miss is resolved.
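
The basic operation can be summarized with the following simplified C++ model, a sketch under the assumption of an idealized, instantaneous query (the types are illustrative and do not correspond to simulator code): on a TLB miss, the requester asks every remote TLB whether it holds the translation, and the page is classified as shared if any of them does.

    #include <cstdint>
    #include <unordered_set>
    #include <vector>

    enum class PageClass { Private, Shared };

    // Simplified model of a core's TLB: the set of virtual page numbers
    // whose translations it currently caches.
    struct Tlb {
        std::unordered_set<uint64_t> cached_vpns;
        bool holds(uint64_t vpn) const { return cached_vpns.count(vpn) != 0; }
    };

    // On a TLB miss, every remote TLB is asked whether it caches the
    // translation; real hardware does this with TLB-to-TLB messages.
    PageClass classify_on_tlb_miss(uint64_t vpn,
                                   const std::vector<const Tlb*>& remote_tlbs) {
        for (const Tlb* t : remote_tlbs) {
            if (t->holds(vpn)) return PageClass::Shared;
        }
        return PageClass::Private;
    }
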

A part of this thesis is dedicated to describing and thoroughly analyzing the proposed classification strategies, which improve classification accuracy and achieve new goals and properties as the dissertation progresses. The main contributions of this dissertation are:


• Snooping TLB-based classification (SnoopingTLB –Chapter 4). This mechanism represents the starting point of this dissertation. SnoopingTLB offers a runtime, adaptive classification based on inquiring the other cores' TLBs in the system (through TLB-to-TLB requests) on every TLB miss. The other TLBs reply to the requester indicating whether or not they are caching the page translation.

• TLB-based classification with shared last-level TLB (DirectoryTLB –Chapter 4). This mechanism leverages a shared last-level TLB in a NUCA-like structure in order to track the current page sharers and resolve TLB misses in the upper private TLB level, similar to an in-cache directory for coherence maintenance.

• Token-based classification (TokenTLB –Chapter 5). This mechanism is based on token counting in order to determine the current classification status. Tokens are stored in TLB entries and exchanged through TLB-to-TLB communication, allowing the classification to naturally evolve to and from shared.

• Usage Predictor for TLBs (UP –Chapter 6). This is the basic predictor from which this thesis started. Relying on the presence of page translations in system TLBs for classification makes classification accuracy strongly dependent on TLB size. A usage prediction mechanism for TLBs is essential to avoid this dependence. When a TLB entry is predicted not to be used and is probed with a TLB-to-TLB request, the entry is invalidated, greatly increasing the opportunity for private data detection.

• Forced-Sharing prediction strategy (Chapter 6). Lower predictor periods induce mispredicted TLB invalidations, causing more TLB misses, which in turn induce further TLB invalidations. This positive feedback is avoided by the forced-sharing strategy, where prediction is overridden when a premature invalidation of a TLB entry is detected.

• Shared Usage Predictor (SUP –Chapter 6). A shared TLB level acts as a filter for mispredicted invalidations, since TLB entries are only invalidated, by probing the upper-level TLBs, when the shared TLB detects a reclassification opportunity.

• Cooperative Usage Predictor (CUP –Chapter 7). This mechanism exploits TLB cooperation (i.e., TLBs willingly working together for a common purpose or benefit) in order to perform a system-wide page usage prediction in a private TLB structure. To do that, a dual token set strategy is employed: one set for classification and another for page usage across the system.

The family of TLB-based mechanisms proposed for classification shares some similarities with cache coherence strategies, and hence their names. However, since translation entries are not commonly modified directly in the TLB, the classification approaches require simpler design considerations than their coherence counterparts. Furthermore, on read misses in the TLB, TLB-based classification accelerates the translation process by detecting inter-core sharing patterns. We have put substantial effort in this thesis into measuring the positive impact on system performance, comparing our results against previous classification approaches. We have also measured the overheads, especially in terms of area and network consumption. Some important conclusions can be extracted from the evaluation of the proposed mechanisms.

The first conclusion is that TLB-based classification, unlike other classification approaches, significantly improves system performance regardless of the classification-based data optimization employed (e.g., coherence deactivation, VIPS, Reactive-NUCA, etc.), as it accelerates the page translation process. This improvement depends on the application's sharing degree and the size of the TLB structure. Specifically, system performance is improved by up to 26.4% with


64-set TLBs, according to our experiments. Consequently, as long as the other possible overheads (e.g., network traffic, additional cache or TLB misses, etc.) are minimized, TLB-based classification can be an attractive alternative to other classification approaches for any classification-based data optimization.

Concerning the classification overheads, we have shown that the amount of traffic attributable to the TLB classification protocol is not as harmful as it might seem, since the TLB miss rate is usually very low. TLB traffic barely reaches 10% of all network traffic issued in some applications and configurations, and represents 5.2% on average with SnoopingTLB.

However, relying entirely on broadcasts on larger systems might become too expensive. In order to reduce the traffic pressure, we proposed DirectoryTLB and TokenTLB. On the one hand, DirectoryTLB leverages a shared last-level TLB, which is in charge of storing the sharing information. On TLB misses, the NUCA-like home L2 TLB slice is accessed. This way, broadcasts are completely avoided, while inter-core sharing patterns are still exploited to accelerate the page translation process. Therefore, DirectoryTLB represents by far the most scalable approach in this thesis, while obtaining figures for private data detection similar to those of SnoopingTLB. Nonetheless, a shared last-level TLB is not a common design choice nowadays.

On the other hand, TokenTLB is conceived for the more common private TLB structure, halving the traffic attributable to the TLB by restricting TLB responses and translation replication upon TLB misses. Furthermore, TokenTLB greatly boosts the precision of the classification approach, which leads to the next conclusion.

The second conclusion is that adaptivity (i.e., the ability of the mechanism to identify shared-to-private transitions) is a decisive factor for the accuracy of the classification mechanism, and hence for system performance and for the benefits obtained from any classification-based data optimization. We claim that an OS-based mechanism is non-adaptive. Strictly speaking, when a page table entry is evicted, the classification information is lost and the page might later be brought back as private, which effectively results in a shared-to-private reclassification. However, an OS-based classification approach does not have any control whatsoever over these transitions: they only take place occasionally, mainly due to competition for physical space in main memory, which is not a frequent case.

Conversely, SnoopingTLB is an adaptive mechanism, capable of reclassifying a page as private from one global page generation to the next. Technically, SnoopingTLB does not control reclassification either, since an opportunity for reclassification requires the page to be evicted from a TLB and its classification status to be checked again afterwards. We still claim that SnoopingTLB is adaptive, because the generation time of a page in a TLB is far more dynamic than that of a page table entry in main memory, and reclassifications therefore happen more often. As a consequence, SnoopingTLB detects 61.8% of all accessed pages as private (nearly 18% more than an OS-based mechanism), and is able to reclassify 13.4% of all shared pages to private at least once.

In this thesis, the mechanism that really highlights the importance of adaptivity over other classification schemes is TokenTLB. Relying on token exchange, TokenTLB naturally reclassifies a page from shared to private when a TLB entry recovers all of the page tokens in the system. Thus, the page classification naturally evolves while the page translation remains stored in the TLB. We refer to this property as full adaptivity. TokenTLB is able to classify nearly 41% more blocks as private than SnoopingTLB (block classification depends on the classification of the page when the access misses in the L1 cache), consequently improving execution time and dynamic energy consumption by 20% and 21.9%, respectively, over our baseline system when the classification scheme is devoted to coherence deactivation.
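
The token-counting rule behind this full adaptivity can be illustrated with the following toy C++ model, a sketch assuming one token per core and invented helper names: a holder that gathers every token for a page classifies it as private, and tokens released by evicted or invalidated entries allow that state to be recovered naturally.

    #include <cstdint>

    // Toy per-page token state for one TLB entry.
    struct PageTokens {
        uint32_t total;   // tokens that exist for this page (assumed: one per core)
        uint32_t held;    // tokens currently held by this TLB entry

        // Private when this entry holds every token of the page.
        bool is_private() const { return held == total; }

        // Tokens arriving from other TLBs (e.g., released on their evictions),
        // possibly completing the set and reclassifying the page as private.
        void receive(uint32_t n) { held += n; }

        // Tokens released when this entry is evicted or invalidated, so that
        // another TLB can eventually gather them all.
        uint32_t release_all() {
            uint32_t released = held;
            held = 0;
            return released;
        }
    };
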


Finally, the third conclusion is that a usage predictor for TLBs is vital in order to increase the precision of a TLB-based classification approach with larger or deeper TLBs, since it makes classification accuracy less dependent on the size or associativity of the TLB structure. A predictor invalidates those TLB entries that are not being used when a TLB request is received, thus creating new opportunities for private classification. This way, the classification is determined by the page access pattern (i.e., page usage) and the disuse timeout period (i.e., the time after the last access at which a page is considered to have fallen into disuse), instead of depending on the page generation time.
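
A minimal sketch of such a disuse predictor is shown below, assuming a per-entry timestamp and a cycle-based period (both are illustrative simplifications; the actual predictor granularity and counters may differ): an entry probed by a remote request is invalidated only if no access has touched it within the disuse period.

    #include <cstdint>

    // Per-TLB-entry disuse state: the entry is predicted to be in disuse when
    // no access has touched it for 'disuse_period' cycles.
    struct UsagePredictorEntry {
        uint64_t last_access_cycle = 0;

        void on_access(uint64_t now) { last_access_cycle = now; }

        bool predicted_in_disuse(uint64_t now, uint64_t disuse_period) const {
            return now - last_access_cycle >= disuse_period;
        }
    };

    // On a remote classification request, the entry is invalidated only when
    // predicted to be in disuse, opening an opportunity for private detection.
    bool should_invalidate_on_probe(const UsagePredictorEntry& e,
                                    uint64_t now, uint64_t disuse_period) {
        return e.predicted_in_disuse(now, disuse_period);
    }
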

A Usage Predictor (UP) increases the amount of private data detected from 63.4% with TokenTLB up to 77.4%. However, poor prediction accuracy with UP may entail major overheads. A mispredicted disuse period may lead to the invalidation of a page that is actually being accessed, which will later induce a predictor-induced TLB miss. Furthermore, these additional misses may induce further premature invalidations, which ultimately produce more predictor-induced misses in the TLB structure. These positive-feedback invalidations degrade system performance and dramatically increase the TLB traffic produced when the classification mechanism tries to retrieve the sharing status after every TLB miss.

Some mechanisms have been proposed in order to restrain or completely avoid these positive-feedback invalidations. The forced-sharing strategy is a slight modification of the state machine: when a TLB misses on a page that has been recently invalidated (the translation entry is still present), a special miss request is sent, overriding any disuse prediction and avoiding further invalidations. This strategy prevents the predictor from attempting to bring the page classification back to private, and hence its name. Although with Forced-sharing UP the traffic increase is reduced by 24% and the performance loss is partially prevented, these overheads still represent a major issue and might prevent the application of prediction techniques to TLB-based classification schemes.
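
The forced-sharing override can be sketched as follows in C++ (the message fields and names are invented for illustration): if the local TLB still holds the recently invalidated translation when it misses again, the re-miss is assumed to be predictor-induced, and the new request carries a flag telling responders to ignore their own disuse predictions.

    #include <cstdint>

    // Hypothetical miss request carrying the forced-sharing override flag.
    struct MissRequest {
        uint64_t vpn;
        bool     forced_sharing;   // responders must ignore disuse predictions
    };

    // Local state observed when the miss occurs: the entry was invalidated by
    // the predictor but its translation has not been replaced yet.
    struct LocalTlbEntryState {
        bool translation_still_present;
    };

    MissRequest build_miss_request(uint64_t vpn, const LocalTlbEntryState& s) {
        return MissRequest{vpn, /*forced_sharing=*/s.translation_still_present};
    }
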

Notwithstanding, the two main proposed alternatives to UP are SUP and CUP. SUP is a predictor strategy for systems with a shared last-level TLB (DirectoryTLB). The main feature of SUP is that, unlike UP, disuse periods are announced to the shared TLB instead of waiting for another TLB to miss and invalidate the page entry. Moreover, invalidation occurs only when the shared TLB detects a transition to private, which triggers a reclassification process. This way, the shared TLB acts as a filter for premature invalidations. SUP completely avoids prediction overheads, improving execution time by 2% over DirectoryTLB (8.2% over the baseline) when classification is applied to coherence deactivation.

However, SUP again relies on a shared last-level TLB. Following a similar strategy in a purely private TLB structure, we proposed CUP, the Cooperative Usage Predictor. CUP inherits TokenTLB's full adaptivity and read-only detection, turning out to be the mechanism with the best classification accuracy in this thesis, while avoiding most drawbacks. CUP halves the amount of TLB misses and network traffic issued compared to UP with the forced-sharing strategy. This way, CUP improves execution time by 5.8% over TokenTLB and 8.8% over UP with Forced-sharing when applied to coherence deactivation. Consequently, CUP proves that prediction might become invaluable for future, deeper, private TLB structures by achieving high accuracy on both classification and prediction.

In summary, in this thesis we have proposed a new family of TLB-based classification protocols, representing an attractive new alternative for classification-based data optimizations by tackling all the desirable properties for classification. Furthermore, TLB-based classification approaches do not involve additional hardware resources; they only require some additional fields in the TLB, thus incurring a low area overhead. Our techniques achieve a precise, runtime, adaptive,

page-level classification that is based on page-access patterns and access predictions, and can detect read-only data regions, while removing most of the associated overheads. Since scaling to larger systems might increase end-to-end latency or network consumption (ultimately making TLB-based classification unattractive), the classification schemes in this thesis aim for small- and medium-scale CMPs.

8.2 Scientific Publications

Next, we list the articles published in relation to the work presented in this thesis.

Publications on National Conferences:

• “Temporal-Aware TLB-Based Private Page Classification in CMPs”. A. Esteve, A. Ros, M.E. Gómez, and A. Robles. In proceedings of the XXVI Jornadas de Paralelismo, pages 14-23, Córdoba (Spain), September 2015. Publisher: Pedro del Río Obejo (Copisterías Don Folio, S.L.). ISBN: 978-84-16017-52-2.

• “Mecanismo de clasificación de páginas basado en el paso de tokens entre TLBs”. A. Esteve, A. Ros, A. Robles, and M.E. Gómez. In proceedings of the XXVII Jornadas de Paralelismo, Jornadas SARTECO 2016, pages 471-480, Salamanca (Spain), September 2016. Publisher: Ediciones Universidad de Salamanca. ISBN: 978-84-9012-626-4.

• “CUP: Un Predictor de Uso de Páginas basado en Tokens”. A. Esteve, A. Ros, A. Robles, and M.E. Gómez. Accepted at the XXVIII Jornadas de Paralelismo (JP2017), Málaga (Spain), September 2017.

Publications on International Conferences:

• “TokenTLB: A Token-Based Page Classification Approach”. A. Esteve, A. Ros, M.E. Gómez, A. Robles, and J. Duato. In proceedings of the 2016 International Conference on Supercomputing (ICS), Article No. 26, Istanbul (Turkey), June 2016. Publisher: ACM Digital Library. ISBN: 978-1-4503-4361-9.

Publications on International Journals:

• “Efficient TLB-Based Detection of Private Pages in Chip Multiprocessors”. A. Esteve, A. Ros, A. Robles, M.E. Gómez, and J. Duato. IEEE Transactions on Parallel and Distributed Systems (TPDS), pages 748-761, March 2015. Publisher: IEEE. ISSN: 1045-9219.

• “TLB-Based Temporality-Aware Classification in CMPs with Multilevel TLBs”. A. Esteve, A. Ros, A. Robles, M.E. Gómez, and J. Duato. IEEE Transactions on Parallel and Distributed Systems (TPDS), pages 2401-2413, August 2017. Publisher: IEEE. ISSN: 1045-9219.

• “TokenTLB+CUP: A Token-Based Page Classification with Cooperative Usage Prediction”. A. Esteve, A. Ros, A. Robles, and M.E. Gómez. Submitted to IEEE Transactions on Parallel and Distributed Systems (TPDS), February 2017.


8.3 Future Work

Several research directions arise from this dissertation. The following is a brief list of possible future efforts:

• Extend TLB-based classification to alternative cache designs, such as hierarchical caches, where cores are grouped in clusters in order to reduce network congestion by localizing traffic within the different hierarchy levels. Furthermore, prediction-based classification needs to be reworked to make it compatible with VIVT caches.

• Study the potential benefits, and identify the main challenges, of implementing the proposed TLB-classification schemes in future heterogeneous systems, where on-chip GPUs and CPUs may share the virtual address space. This might lead to studying environments with frequent thread migration, which would help validate the properties of our classification scheme against other runtime classification proposals, and compare our solution against other techniques dealing with passive data sharing (e.g., PSCR).

• Employ an adaptive predictor period per TLB entry in order to accurately adjust the predictor period to the actual page inter-access period. Even though many predictor periods have been explored in this dissertation, a given period can be valuable for some applications but detrimental for others. Even within the same application, the best period may differ across pages or application phases.

• Analyze the effect of TLB-based classification on different data optimizations, which would yield different performance and power consumption results.

• Propose and analyze designs for different classification granularities, i.e., larger pages and, especially, sub-page granularities. Finer granularity can be considered another desirable property for classification approaches.

• Study the effect of a static-dynamic classification, where read-only data is managed by a compiler-based classification mechanism and privacy is detected by our TLB-based classification. This way, the overheads of read-only detection at runtime can be avoided.

Bibliography

[1] Advanced Micro Devices. http://www.amd.com. [Online; accessed Jan-2016].
[2] ARM Holdings plc. http://www.arm.com. [Online; accessed Jan-2016].
[3] Intel Corporation. http://www.intel.com. [Online; accessed Jan-2016].
[4] Sun Microsystems. http://www.sun.com. [Online; accessed Nov-2015].
[5] ARM Architecture Reference Manual ARMv7-A and ARMv7-R, 2012.
[6] ARM Architecture Reference Manual ARMv8-A, 2015.
[7] M. E. Acacio, J. González, J. M. García, and J. Duato. Owner prediction for accelerating cache-to-cache transfer misses in cc-NUMA multiprocessors. In ACM/IEEE Conf. on Supercomputing (SC), pages 1–12, Nov. 2002.
[8] N. Agarwal, T. Krishna, L.-S. Peh, and N. K. Jha. GARNET: A detailed on-chip network model inside a full-system simulator. In IEEE Int'l Symp. on Performance Analysis of Systems and Software (ISPASS), pages 33–42, Apr. 2009.
[9] N. Agarwal, L.-S. Peh, and N. K. Jha. In-Network Snoop Ordering (INSO): Snoopy coherence on unordered interconnects. In 15th Int'l Symp. on High-Performance Computer Architecture (HPCA), pages 67–78, Feb. 2009.
[10] A. R. Alameldeen, C. J. Mauer, M. Xu, P. J. Harper, M. M. Martin, D. J. Sorin, M. D. Hill, and D. A. Wood. Evaluating non-deterministic multi-threaded commercial workloads. In 5th Workshop on Computer Architecture Evaluation using Commercial Workloads (CAECW), pages 30–38, Feb. 2002.
[11] M. Alisafaee. Spatiotemporal coherence tracking. In 45th IEEE/ACM Int'l Symp. on Microarchitecture (MICRO), pages 341–350, Dec. 2012.
[12] AMD. AMD64 architecture programmer's manual volume 2: System programming. https://goo.gl/8E97El. [Online; accessed 6-Oct-2016].
[13] T. W. Barr, A. L. Cox, and S. Rixner. SpecTLB: A mechanism for speculative address translation. In 38th Int'l Symp. on Computer Architecture (ISCA), pages 307–318, June 2011.
[14] L. A. Barroso, K. Gharachorloo, R. McNamara, A. Nowatzyk, S. Qadeer, B. Sano, S. Smith, R. Stets, and B. Verghese. Piranha: A scalable architecture based on single-chip multiprocessing. In 27th Int'l Symp. on Computer Architecture (ISCA), pages 12–14, June 2000.


[15] A. Bhattacharjee, D. Lustig, and M. Martonosi. Shared last-level TLBs for chip multiprocessors. In 17th Int'l Symp. on High-Performance Computer Architecture (HPCA), pages 62–73, Feb. 2011.
[16] A. Bhattacharjee and M. Martonosi. Inter-core cooperative TLB for chip multiprocessors. In 15th Int'l Conf. on Architectural Support for Programming Language and Operating Systems (ASPLOS), pages 359–370, Mar. 2010.
[17] C. Bienia, S. Kumar, J. P. Singh, and K. Li. The PARSEC benchmark suite: Characterization and architectural implications. In 17th Int'l Conf. on Parallel Architectures and Compilation Techniques (PACT), pages 72–81, Oct. 2008.
[18] B. H. Bloom. Space/time trade-offs in hash coding with allowable errors. Communications of the ACM, 13(7):422–426, July 1970.
[19] J. F. Cantin, J. E. Smith, M. H. Lipasti, A. Moshovos, and B. Falsafi. Coarse-grain coherence tracking: RegionScout and region coherence arrays. IEEE Micro, 26(1):70–79, Jan. 2006.
[20] J. B. Chen, A. Borg, and N. P. Jouppi. A simulation based study of TLB performance. In 19th Int'l Symp. on Computer Architecture (ISCA), pages 114–123, May 1992.
[21] S. Cho and L. Jin. Managing distributed, shared L2 caches through OS-level page allocation. In 39th IEEE/ACM Int'l Symp. on Microarchitecture (MICRO), pages 455–465, Dec. 2006.
[22] B. Cuesta, A. Ros, M. E. Gómez, A. Robles, and J. Duato. Increasing the effectiveness of directory caches by deactivating coherence for private memory blocks. In 38th Int'l Symp. on Computer Architecture (ISCA), pages 93–103, June 2011.
[23] B. Cuesta, A. Ros, M. E. Gómez, A. Robles, and J. Duato. Increasing the effectiveness of directory caches by avoiding the tracking of non-coherent memory blocks. IEEE Transactions on Computers (TC), 62(3):482–495, Mar. 2013.
[24] M. Davari, A. Ros, E. Hagersten, and S. Kaxiras. The effects of granularity and adaptivity on private/shared classification for coherence. ACM Transactions on Architecture and Code Optimization (TACO), 12(3):26:1–26:21, Aug. 2015.
[25] M. Davari, A. Ros, E. Hagersten, and S. Kaxiras. An efficient, self-contained, on-chip directory: DIR1-SISD. In 24th Int'l Conf. on Parallel Architectures and Compilation Techniques (PACT), pages 317–330, Oct. 2015.
[26] L. Fang, P. Liu, Q. Hu, M. C. Huang, and G. Jiang. Building expressive, area-efficient coherence directories. In 22nd Int'l Conf. on Parallel Architectures and Compilation Techniques (PACT), pages 299–308, Sept. 2013.
[27] M. Ferdman, P. Lotfi-Kamran, K. Balet, and B. Falsafi. Cuckoo directory: A scalable directory for many-core systems. In 17th Int'l Symp. on High-Performance Computer Architecture (HPCA), pages 169–180, Feb. 2011.
[28] P. Foglia, R. Giorgi, and C. A. Prete. Reducing coherence overhead and boosting performance of high-end SMP multiprocessors running a DSS workload. J. Parallel Distrib. Comput., 65(3):289–306, Mar. 2005.
[29] A. García-Guirado, R. F. Pascual, and J. M. García. ICCI: In-cache coherence information. IEEE Trans. Computers, 64:995–1014, 2015.
[30] R. Giorgi and C. A. Prete. PSCR: A coherence protocol for eliminating passive sharing in shared-bus shared-memory multiprocessors. IEEE Trans. Parallel Distrib. Syst., 10(7):742–763, July 1999.


[31] N. Hardavellas, M. Ferdman, B. Falsafi, and A. Ailamaki. Reactive NUCA: Near-optimal block placement and replication in distributed caches. In 36th Int'l Symp. on Computer Architecture (ISCA), pages 184–195, June 2009.
[32] E. Herrero, J. González, and R. Canal. Elastic cooperative caching: An autonomous dynamically adaptive memory hierarchy for chip multiprocessors. In 37th Int'l Symp. on Computer Architecture (ISCA), pages 419–428, 2010.
[33] H. Hossain, S. Dwarkadas, and M. C. Huang. POPS: Coherence protocol optimization for both private and shared data. In 20th Int'l Conf. on Parallel Architectures and Compilation Techniques (PACT), pages 45–55, Oct. 2011.
[34] Intel Xeon Phi Coprocessor. http://software.intel.com/en-us/mic-developer, Apr. 2013.
[35] A. Jimborean, J. Waern, P. Ekemark, S. Kaxiras, and A. Ros. Automatic detection of extended data-race-free regions. In Int'l Symp. on Code Generation and Optimization (CGO), pages 14–26, Austin, TX (USA), Feb. 2017. IEEE Computer Society.
[36] I. Kadayif, P. Nath, M. Kandemir, and A. Sivasubramaniam. Reducing data TLB power via compiler-directed address generation. IEEE Trans. Comp.-Aided Des. Integ. Cir. Sys., 26(2):312–324, Feb. 2007.
[37] A. B. Kahng, B. Li, L.-S. Peh, and K. Samadi. ORION 2.0: A fast and accurate NoC power and area model for early-stage design space exploration. In Design, Automation, and Test in Europe (DATE), pages 423–428, Apr. 2009.
[38] Kalray Inc. MPPA2-256. https://goo.gl/jJ7wlf, 2012.
[39] M. Kambadur, K. Tang, and M. A. Kim. Harmony: Collection and analysis of parallel block vectors. In 39th Int'l Symp. on Computer Architecture (ISCA), pages 452–463, June 2012.
[40] S. Kaxiras, Z. Hu, and M. Martonosi. Cache decay: Exploiting generational behavior to reduce cache leakage power. In 28th Int'l Symp. on Computer Architecture (ISCA), pages 240–251, June 2001.
[41] C. Kim, D. Burger, and S. W. Keckler. An adaptive, non-uniform cache structure for wire-delay dominated on-chip caches. In 10th Int'l Conf. on Architectural Support for Programming Language and Operating Systems (ASPLOS), pages 211–222, Oct. 2002.
[42] C. Kim, D. Burger, and S. W. Keckler. Nonuniform cache architectures for wire-delay dominated on-chip caches. IEEE Micro, 23(6):99–107, Nov. 2003.
[43] D. Kim, J. Ahn, J. Kim, and J. Huh. Subspace snooping: Filtering snoops with operating system support. In 19th Int'l Conf. on Parallel Architectures and Compilation Techniques (PACT), pages 111–122, Sept. 2010.
[44] D. Kim, H. Kim, and J. Huh. Virtual snooping: Filtering snoops in virtualized multicores. In 43rd IEEE/ACM Int'l Symp. on Microarchitecture (MICRO), pages 459–470, Dec. 2010.
[45] M.-L. Li, R. Sasanka, S. V. Adve, Y.-K. Chen, and E. Debes. The ALPBench benchmark suite for complex multimedia applications. In Int'l Symp. on Workload Characterization (IISWC), pages 34–45, Oct. 2005.
[46] Y. Li, A. Abousamra, R. Melhem, and A. K. Jones. Compiler-assisted data distribution for chip multiprocessors. In 19th Int'l Conf. on Parallel Architectures and Compilation Techniques (PACT), pages 501–512, Sept. 2010.


[47] Y. Li, R. Melhem, and A. K. Jones. PS-TLB: Leveraging page classification information for fast, scalable and efficient translation for future CMPs. ACM Transactions on Architecture and Code Optimization (TACO), 9(4):28:1–25:21, Jan. 2013.
[48] Y. Li, R. G. Melhem, and A. K. Jones. Practically private: Enabling high performance CMPs through compiler-assisted data classification. In 21st Int'l Conf. on Parallel Architectures and Compilation Techniques (PACT), pages 231–240, Sept. 2012.
[49] C. Liu, A. Sivasubramaniam, and M. Kandemir. Organizing the last line of defense before hitting the memory wall for CMPs. In 11th Int'l Symp. on High-Performance Computer Architecture (HPCA), pages 176–185, Feb. 2004.
[50] P. S. Magnusson, M. Christensson, J. Eskilson, D. Forsgren, G. Hallberg, J. Hogberg, F. Larsson, A. Moestedt, and B. Werner. Simics: A full system simulation platform. IEEE Computer, 35(2):50–58, Feb. 2002.
[51] M. M. Martin. Token Coherence. PhD thesis, University of Wisconsin-Madison, Dec. 2003.
[52] M. M. Martin, P. J. Harper, D. J. Sorin, M. D. Hill, and D. A. Wood. Using destination-set prediction to improve the latency/bandwidth tradeoff in shared-memory multiprocessors. In 30th Int'l Symp. on Computer Architecture (ISCA), pages 206–217, June 2003.
[53] M. M. Martin, M. D. Hill, and D. A. Wood. Token coherence: Decoupling performance and correctness. In 30th Int'l Symp. on Computer Architecture (ISCA), pages 182–193, June 2003.
[54] M. M. Martin, D. J. Sorin, B. M. Beckmann, M. R. Marty, M. Xu, A. R. Alameldeen, K. E. Moore, M. D. Hill, and D. A. Wood. Multifacet's general execution-driven multiprocessor simulator (GEMS) toolset. Computer Architecture News, 33(4):92–99, Sept. 2005.
[55] M. R. Marty, J. D. Bingham, M. D. Hill, A. J. Hu, M. M. Martin, and D. A. Wood. Improving multiple-CMP systems using token coherence. In 11th Int'l Symp. on High-Performance Computer Architecture (HPCA), pages 328–339, Feb. 2005.
[56] D. Molka, D. Hackenberg, R. Schone, and M. S. Muller. Memory performance and cache coherency effects on an Intel Nehalem multiprocessor system. In 18th Int'l Conf. on Parallel Architectures and Compilation Techniques (PACT).
[57] A. Moshovos. RegionScout: Exploiting coarse grain sharing in snoop-based coherence. In 32nd Int'l Symp. on Computer Architecture (ISCA), pages 234–245, June 2005.
[58] M.-M. Papadopoulou, X. Tong, A. Seznec, and A. Moshovos. Prediction-based superpage-friendly TLB designs. In 21st Int'l Symp. on High-Performance Computer Architecture (HPCA), pages 210–222, Feb. 2015.
[59] M.-M. Papadopoulou, X. Tong, A. Seznec, and A. Moshovos. Prediction-based superpage-friendly TLB designs. In 21st Int'l Symp. on High-Performance Computer Architecture (HPCA), Feb. 2015.
[60] B. Pham, A. Bhattacharjee, Y. Eckert, and G. H. Loh. Increasing TLB reach by exploiting clustering in page translations. In 20th Int'l Symp. on High-Performance Computer Architecture (HPCA), pages 558–567, Feb. 2014.
[61] B. Pham, V. Vaidyanathan, A. Jaleel, and A. Bhattacharjee. CoLT: Coalesced large-reach TLBs. In 45th IEEE/ACM Int'l Symp. on Microarchitecture (MICRO), pages 258–269, Dec. 2012.


[62] S. H. Pugsley, J. B. Spjut, D. W. Nellans, and R. Balasubramonian. SWEL: Hardware cache coherence protocols to map shared data onto shared caches. In 19th Int'l Conf. on Parallel Architectures and Compilation Techniques (PACT), pages 465–476, Sept. 2010.
[63] B. F. Romanescu, A. R. Lebeck, D. J. Sorin, and A. Bracy. UNified Instruction/Translation/Data (UNITD) coherence: One protocol to rule them all. In 16th Int'l Symp. on High-Performance Computer Architecture (HPCA), pages 1–12, Feb. 2010.
[64] A. Ros, M. E. Acacio, and J. M. García. A novel lightweight directory architecture for scalable shared-memory multiprocessors. In 11th Int'l Euro-Par Conference, pages 582–591, Aug. 2005.
[65] A. Ros, M. E. Acacio, and J. M. García. DiCo-CMP: Efficient cache coherency in tiled CMP architectures. In 22nd Int'l Parallel and Distributed Processing Symp. (IPDPS), pages 1–11, Apr. 2008.
[66] A. Ros, M. E. Acacio, and J. M. García. Dealing with traffic-area trade-off in direct coherence protocols for many-core CMPs. In 8th Int'l Conf. on Advanced Parallel Processing Technologies (APPT), pages 11–27, Aug. 2009.
[67] A. Ros, B. Cuesta, R. Fernández-Pascual, M. E. Gómez, M. E. Acacio, A. Robles, J. M. García, and J. Duato. EMC2: Extending Magny-Cours coherence for large-scale servers. In 17th Int'l Conf. on High Performance Computing (HiPC), pages 1–10, Dec. 2010.
[68] A. Ros, B. Cuesta, M. E. Gómez, A. Robles, and J. Duato. Temporal-aware mechanism to detect private data in chip multiprocessors. In 42nd Int'l Conf. on Parallel Processing (ICPP), pages 562–571, Oct. 2013.
[69] A. Ros and A. Jimborean. A dual-consistency cache coherence protocol. In 29th Int'l Parallel and Distributed Processing Symp. (IPDPS), pages 1119–1128, May 2015.
[70] A. Ros and A. Jimborean. A hybrid static-dynamic classification for dual-consistency cache coherence. IEEE Transactions on Parallel and Distributed Systems (TPDS), PP(99), Feb. 2016.
[71] A. Ros and S. Kaxiras. Complexity-effective multicore coherence. In 21st Int'l Conf. on Parallel Architectures and Compilation Techniques (PACT), pages 241–252, Sept. 2012.
[72] A. Ros and S. Kaxiras. Callback: Efficient synchronization without invalidation with a directory just for spin-waiting. In 42nd Int'l Symp. on Computer Architecture (ISCA), pages 427–438, June 2015.
[73] A. Ros and S. Kaxiras. Racer: TSO consistency via race detection. In 49th IEEE/ACM Int'l Symp. on Microarchitecture (MICRO), pages 1–13, Oct. 2016.
[74] D. Sanchez and C. Kozyrakis. SCD: A scalable coherence directory with flexible sharer set encoding. In 18th Int'l Symp. on High-Performance Computer Architecture (HPCA), pages 129–140, Feb. 2012.
[75] A. Saulsbury, F. Dahlgren, and P. Stenström. Recency-based TLB preloading. SIGARCH Comput. Archit. News, 28(2):117–127, May 2000.
[76] A. Sembrant, E. Hagersten, and D. Black-Schaffer. The Direct-to-Data (D2D) cache: Navigating the cache hierarchy with a single lookup. In 41st Int'l Symp. on Computer Architecture (ISCA), pages 133–144, 2014.
[77] A. Sembrant, E. Hagersten, and D. Black-Schaffer. A split cache hierarchy for enabling data-oriented optimizations. In Int'l Symp. on High-Performance Computer Architecture (HPCA), 2017.


[78] T. Sherwood, E. Perelman, and B. Calder. Basic block distribution analysis to find periodic behavior and simulation points in applications. In 10th Int'l Conf. on Parallel Architectures and Compilation Techniques (PACT), pages 3–14, Sept. 2001.
[79] A. Singh, S. Narayanasamy, D. Marino, T. Millstein, and M. Musuvathi. End-to-end sequential consistency. In 39th Int'l Symp. on Computer Architecture (ISCA), pages 524–535, June 2012.
[80] I. Singh, A. Shriraman, and W. W. L. Fung. Cache coherence for GPU architectures. In 19th Int'l Symp. on High-Performance Computer Architecture (HPCA), pages 578–590, Feb. 2013.
[81] S. Srikantaiah and M. Kandemir. Synergistic TLBs for high performance address translation in chip multiprocessors. In 43rd IEEE/ACM Int'l Symp. on Microarchitecture (MICRO), pages 313–324, Dec. 2010.
[82] M. Talluri and M. D. Hill. Surpassing the TLB performance of superpages with less operating system support. In 6th Int'l Conf. on Architectural Support for Programming Language and Operating Systems (ASPLOS), pages 171–182, Oct. 1994.
[83] S. Thoziyoor, N. Muralimanohar, J. H. Ahn, and N. P. Jouppi. CACTI 5.1. Technical Report HPL-2008-20, HP Labs, Apr. 2008.
[84] C. Villavieja, V. Karakostas, L. Vilanova, Y. Etsion, A. Ramirez, A. Mendelson, N. Navarro, A. Cristal, and O. S. Unsal. DiDi: Mitigating the performance impact of TLB shootdowns using a shared TLB directory. In 20th Int'l Conf. on Parallel Architectures and Compilation Techniques (PACT), pages 340–349, Oct. 2011.
[85] S. C. Woo, M. Ohara, E. Torrie, J. P. Singh, and A. Gupta. The SPLASH-2 programs: Characterization and methodological considerations. In 22nd Int'l Symp. on Computer Architecture (ISCA), pages 24–36, June 1995.
[86] D. A. Wood, M. D. Hill, and R. Kessler. A model for estimating trace-sample miss ratios. In ACM SIGMETRICS Int'l Conf. on Measurement and Modeling of Computer Systems, pages 79–89, 1991.
[87] Y. Yao, G. Wang, Z. Ge, T. Mitra, W. Chen, and N. Zhang. Efficient timestamp-based cache coherence protocol for many-core architectures. In Int'l Conf. on Supercomputing (ICS), pages 19:1–19:13, 2016.
[88] J. Zebchuk, B. Falsafi, and A. Moshovos. Multi-grain coherence directories. In 46th IEEE/ACM Int'l Symp. on Microarchitecture (MICRO), pages 359–370, Dec. 2013.
[89] J. Zebchuk, V. Srinivasan, M. K. Qureshi, and A. Moshovos. A tagless coherence directory. In 42nd IEEE/ACM Int'l Symp. on Microarchitecture (MICRO), pages 423–434, Dec. 2009.
