Msc THESIS Performance Estimation of a Processor Module in the Real-Time Motion Control Platform of an ASML Lithostepper
Total Page:16
File Type:pdf, Size:1020Kb
Computer Engineering 2018 Mekelweg 4, 2628 CD Delft The Netherlands http://ce.et.tudelft.nl/ MSc THESIS Performance Estimation of a Processor Module in the Real-Time Motion Control Platform of an ASML Lithostepper Joachim Adriaan Voskes Abstract At the core of state of the art microelectronic industry's drive for better technology, lies the continuing advancement in the develop- ment of Integrated Circuits using highly complex lithography ma- chines, known as lithosteppers, which embed complex mechanical sub-systems performing intricate motions. These systems are con- trolled by means of custom real-time computing platforms contain- CE-MS-2018-12 ing off-the-shelf and specialized hardware components, and are opti- mized to keep pace with the continuing growing trend in performance requirements. At the ASML Twinscan lithostepper's heart resides the Control Architecture Reference Model (CARM) motion control platform which manages, among others, the wafer-stage, a multiple degree of freedom module, able to position a 15 kg heavy wafer-table with nanometer accuracy at extremely high acceleration and veloc- ity. As the industry requirements for feature-size, overlay accuracy, and throughput keep increasing, the ASML lithosteppers, and the CARM platform in particular, should anticipate these demands by making early changes and upgrades with respect to computation- al performance and accuracy. Given that the current lithostepper configurations are not capable of sustaining the anticipated updates, which requires the increase of the control loop execution frequency from 20kHz to 40kHz, an early evaluation of potential CARM High Performance Process Controller (HPPC) successors has been performed. This indicated that the NXP-Freescale T4240 processor can potentially ful- fill the expected requirements, however, the evaluation lacks accuracy as it was performed on a benchmark code not reflecting the actual CARM workload. To circumvent this problem, in this thesis, we introduce a more accurate evaluation methodology, which relies on the actual motion control application running on the HPPC and is able to capture aspects as scheduling, parallelism, and processor resource usage. To this end we develop a set of custom performance benchmarks able to emulate the CARM environment and evaluate the Freescale T4240 processor in this new context. Our results indicate that the T4240 is able to deliver enough computation power to fulfill the control loop execution frequency upscaling requirement from 20kHz to 40kHz. Additionally, we demonstrate that due to its clustered hardware architecture one T4240 can sustain 20kHz loop execution frequency for the workload of three current HPPCs, which suggest that its utilization in current lithosteppers can be beneficial. Faculty of Electrical Engineering, Mathematics and Computer Science Performance Estimation of a Processor Module in the Real-Time Motion Control Platform of an ASML Lithostepper THESIS submitted in partial fulfillment of the requirements for the degree of MASTER OF SCIENCE in EMBEDDED SYSTEMS by Joachim Adriaan Voskes born in Amsterdam, The Netherlands Computer Engineering Department of Electrical Engineering Faculty of Electrical Engineering, Mathematics and Computer Science Delft University of Technology Performance Estimation of a Processor Module in the Real-Time Motion Control Platform of an ASML Lithostepper by Joachim Adriaan Voskes Abstract At the core of state of the art microelectronic industry's drive for better technology, lies the continuing advancement in the development of Integrated Circuits using highly complex lithogra- phy machines, known as lithosteppers, which embed complex mechanical sub-systems performing intricate motions. These systems are controlled by means of custom real-time computing plat- forms containing off-the-shelf and specialized hardware components, and are optimized to keep pace with the continuing growing trend in performance requirements. At the ASML Twinscan lithostepper's heart resides the Control Architecture Reference Model (CARM) motion control platform which manages, among others, the wafer-stage, a multiple degree of freedom module, able to position a 15 kg heavy wafer-table with nanometer accuracy at extremely high accelera- tion and velocity. As the industry requirements for feature-size, overlay accuracy, and throughput keep increasing, the ASML lithosteppers, and the CARM platform in particular, should anticipate these demands by making early changes and upgrades with respect to computational performance and accuracy. Given that the current lithostepper configurations are not capable of sustaining the anticipated updates, which requires the increase of the control loop execution frequency from 20kHz to 40kHz, an early evaluation of potential CARM High Performance Process Controller (HPPC) successors has been performed. This indicated that the NXP-Freescale T4240 processor can potentially fulfill the expected requirements, however, the evaluation lacks accuracy as it was performed on a benchmark code not reflecting the actual CARM workload. To circumvent this problem, in this thesis, we introduce a more accurate evaluation methodology, which relies on the actual motion control application running on the HPPC and is able to capture aspects as scheduling, parallelism, and processor resource usage. To this end we develop a set of custom per- formance benchmarks able to emulate the CARM environment and evaluate the Freescale T4240 processor in this new context. Our results indicate that the T4240 is able to deliver enough computation power to fulfill the control loop execution frequency upscaling requirement from 20kHz to 40kHz. Additionally, we demonstrate that due to its clustered hardware architecture one T4240 can sustain 20kHz loop execution frequency for the workload of three current HPPCs, which suggest that its utilization in current lithosteppers can be beneficial. Laboratory : Computer Engineering Codenumber : CE-MS-2018-12 Committee Members : Advisor: Dr.ir. S.D. Cotofana, CE, TU Delft Chairperson: Prof. Koen Bertels, CE, TU Delft i Member: Dr. ir. T.G.R.M van Leuken, CAS, TU Delft Member: Ing. J van de Ven , Motion Control and Subsystem Facilities, ASML Netherlands ii Dedicated to my family and friends iii iv Contents List of Figures viii List of Tables ix List of Acronyms xii Acknowledgements xiii 1 Introduction 1 1.1 Nano Lithographic Challenges . .2 1.2 Problem Statement . .2 1.3 Thesis Contributions . .4 1.4 Thesis Organization . .5 2 Background and Preliminaries 7 2.1 ASML Lithostepper . .7 2.2 CARM Motion Control Platform . .9 2.2.1 Software Platform and Controller Modeling . 10 2.2.2 Hardware Platform . 12 2.3 CARM Hardware Limitations and Upgrade . 12 2.4 T4240 Processor . 14 2.5 Conclusion . 15 3 CARM Application Scheduling and Execution 17 3.1 Motion Control Application . 17 3.1.1 WorkerBlocks . 18 3.1.2 Sequences and Schedule . 19 3.1.3 Synchronization and Communication . 20 3.2 Previous Work . 21 3.2.1 Performance Estimation . 21 3.2.2 Experiments . 22 3.2.3 Results . 23 3.3 Conclusion . 25 4 CARM Custom Performance Estimation 27 4.1 Multi-core Performance Estimation and Benchmark Creation . 27 4.2 Round Trip Latency . 28 4.2.1 Implementation . 29 4.3 Parallel Sequences . 30 4.3.1 Implementation . 31 v 4.3.2 Experiments . 34 4.4 Experimental Evaluation Platform . 35 4.4.1 Baseline Validation . 36 4.5 Results . 37 4.5.1 Round Trip Latency . 37 4.5.2 Parallel Sequences . 39 4.6 Conclusion . 41 5 Conclusions and Future Work 43 5.1 Summary . 43 5.2 Future Work . 44 5.3 Research Implications . 45 5.3.1 CARM Team and ASML . 45 5.3.2 Industry . 45 Bibliography 48 A Appendix 49 B Appendix 53 vi List of Figures 1.1 Schematic representation of the photolithography process . .1 2.1 An ASML Twinscan lithography machine. .8 2.2 Abstract representation of a Long-Stroke, Short-Stroke wafer- positioning stage in H-configuration, where the actuators act on the wafer-table in a lithostepper. .9 2.3 Abstract representation of a control loop . .9 2.4 Organization of the CARM layers . 11 2.5 CARM ATCA compute stack. 13 2.6 Abstract representation of the T4240 processor architecture. Figure taken from Freescale T4240 processor factsheet [1] . 14 2.7 Abstract block diagram of the T4240's E6500-core pipeline architecture. Figure taken from the Freescale T4240 Reference Manual [2] . 15 3.1 Blockdiagram representation of an example of the control loop for the SS-controller application [3]. 17 3.2 Abstract overview of the deployment of an application on the CARM hardware (Host and HPPCs). 18 3.3 Schematic representation of the critical and non critical part of the sam- ple. ...................................... 20 3.4 Visual representation of a schedule calculated for an application running on 7 cores of an HPPC. The red (left) and green (right) sections represent time critical and non-time critical parts respectively, while the arrows represent a communication or synchronization between WorkerBlocks running on separate cores. [4] . 21 4.1 Schematic representation of the round trip latency experiment via the different cache levels. 29 4.2 Schematic representation of the round trip latency performance tests. 31 4.3 Abstract diagram of a block template used for the WorkerBlocks . 32 4.4 Block diagram for the stand alone sequence test. 34 4.5 Block diagram for the start-finish synchronized sequence test. 34 4.6 Block diagram for the fully synchronized sequence test. 35 4.7 Schematic representation of the execution of the benchmark running parallel sequences. 35 4.8 Abstract block diagram of the hardware setup for T4240 performance testing and validation. 36 4.9 Round Trip Latency test results for the 1 to 1 tests in ns. 37 4.10 Round Trip Latency test result for the 1 to N tests in ns. 38 vii A.1 Abstract block diagram of the T4240 core-complex architecture. As can be seen, the clusters of 4 E6500 cores are share the L2 cache, and are connected to each-other via the CoreNet Coherency Fabric and L3 cache. Figure taken from Freescale T4240 Reference Manual[2] .