Computational Astrophysics Computer Architectures Alexander Knebe, Universidad Autonoma De Madrid Computational Astrophysics Computer Architectures

Computational Astrophysics Computer Architectures Alexander Knebe, Universidad Autonoma De Madrid Computational Astrophysics Computer Architectures

Computational Astrophysics Computer Architectures Alexander Knebe, Universidad Autonoma de Madrid Computational Astrophysics Computer Architectures ! architectures ! real machines ! computing concepts ! parallel programming Computational Astrophysics Computer Architectures ! architectures ! real machines ! computing concepts ! parallel programming Computational Astrophysics architectures Computer Architectures ! serial machine = CPU RAM Computational Astrophysics architectures Computer Architectures ! serial machine = CPU RAM CPU: ! very primitive commands, obtained from compilers or interpreters of higher-level languages Computational Astrophysics architectures Computer Architectures ! serial machine = CPU RAM CPU: ! very primitive commands, obtained from compilers or interpreters of higher-level languages ! cycle chain: • fetch – get instruction and/or data from memory • decode – store instruction and/or data in register • execute – perform instruction Computational Astrophysics architectures Computer Architectures ! serial machine = CPU RAM CPU: ! very primitive commands, obtained from compilers or interpreters of higher-level languages ! cycle chain: • fetch – get instruction and/or data from memory • decode – store instruction and/or data in register • execute – perform instruction arithmetical/logical instructions: +, -, *, /, bitshift, if Computational Astrophysics architectures Computer Architectures ! serial machine = CPU RAM CPU: ! very primitive commands, obtained from compilers or interpreters of higher-level languages ! cycle chain: • fetch – get instruction and/or data from memory • decode – store instruction and/or data in register • execute – perform instruction ! some CPU allow multi-threading, i.e. already fetch next instruction while still executing Computational Astrophysics architectures Computer Architectures ! serial machine = CPU RAM CPU: ! execution time: t = ni x CPI x tc number of instructions time per cycle cycles per instruction (e.g., ‘+’ requires less cycles than ‘*’) Computational Astrophysics architectures Computer Architectures ! serial machine = CPU RAM CPU: ! execution time: t = ni x CPI x tc number of instructions time per cycle cycles per instruction speed-ups: improve your algorithm to require less instructions example: a factor like “3/(8piG)” inside a for-loop should be avoided; define FAC=3/(8piG) outside the loop and use FAC inside the loop instead… Computational Astrophysics architectures Computer Architectures ! serial machine = CPU RAM CPU: ! execution time: t = ni x CPI x tc number of instructions time per cycle cycles per instruction (how many cycles does your instruction require) speed-ups: improve your algorithm to use more adequate instructions example: avoid at all costs pow(), log(), etc., e.g. pow(x,2) should be replaced with x*x Computational Astrophysics architectures Computer Architectures ! serial machine = CPU RAM CPU: ! execution time: t = ni x CPI x tc number of instructions time per cycle cycles per instruction speed-ups: buy a machine with higher clock-frequency Computational Astrophysics architectures Computer Architectures ! serial machine = CPU RAM CPU: ! execution time: t = ni x CPI x tc number of instructions time per cycle cycles per instruction speed-ups: improve your algorithm! Computational Astrophysics architectures Computer Architectures ! serial machine = CPU RAM CPU: ! execution time: t = ni x CPI x tc number of instructions time per cycle cycles per instruction speed-ups: ...or wait for technology to advance ;-) Computational Astrophysics architectures Computer Architectures ! serial machine = CPU RAM CPU – evolution during the past years: Computational Astrophysics architectures Computer Architectures ! serial machine = CPU RAM CPU – evolution during the past years: clock speed: saturation level reached! Computational Astrophysics architectures Computer Architectures ! serial machine = CPU RAM CPU – evolution during the past years: single thread: saturation approaching! clock speed: saturation level reached! Computational Astrophysics architectures Computer Architectures ! serial machine = CPU RAM RAM: ! Random Access Memory, i.e. read and write ! storage in binary system: • 1 bit = 0 or 1 • 8 bits = 1 byte • 4 bytes = 1 float (=32 bits, standard for 32-bit architectures) • 8 bytes = 1 double (=64 bits, standard for 64-bit architectures) Computational Astrophysics architectures Computer Architectures ! serial machine = CPU RAM RAM: ! Random Access Memory, i.e. read and write ! storage in binary system: • 1 bit = 0 or 1 • 8 bits = 1 byte • 4 bytes = 1 float (=32 bits, standard for 32-bit architectures) • 8 bytes = 1 double (=64 bits, standard for 64-bit architectures) ! latency = time for memory access (bus width also relevant) Computational Astrophysics architectures Computer Architectures ! serial machine = CPU RAM RAM: ! Random Access Memory, i.e. read and write ! storage in binary system: • 1 bit = 0 or 1 • 8 bits = 1 byte • 4 bytes = 1 float (=32 bits, standard for 32-bit architectures) • 8 bytes = 1 double (=64 bits, standard for 64-bit architectures) ! latency = time for memory access (bus width also relevant) ! speed-ups: • multi-threading CPU’s • larger bus width Computational Astrophysics architectures Computer Architectures ! serial machine = CPU RAM RAM: ! Random Access Memory, i.e. read and write ! storage in binary system: • 1 bit = 0 or 1 • 8 bits = 1 byte • 4 bytes = 1 float (=32 bits, standard for 32-bit architectures) • 8 bytes = 1 double (=64 bits, standard for 64-bit architectures) ! latency = time for memory access (bus width also relevant) ! speed-ups: • multi-threading CPU’s • larger bus width: ! ‘80s 8-bit wide ! ‘90s 16-bit wide (internal ‘highway’ for data transfer) ! ‘00s 32-bit wide ! today 64-bit wide Computational Astrophysics architectures Computer Architectures ! serial machine = CPU RAM RAM: ! Random Access Memory, i.e. read and write ! storage in binary system: • 1 bit = 0 or 1 • 8 bits = 1 byte • 4 bytes = 1 float (=32 bits, standard for 32-bit architectures) • 8 bytes = 1 double (=64 bits, standard for 64-bit architectures) ! latency = time for memory access (bus width also relevant) ! speed-ups: • multi-threading CPU’s • larger bus width • clever usage of Cache Computational Astrophysics architectures Computer Architectures ! serial machine = CPU Cache RAM RAM: ! Random Access Memory, i.e. read and write ! storage in binary system: • 1 bit = 0 or 1 • 8 bits = 1 byte • 4 bytes = 1 float (=32 bits, standard for 32-bit architectures) • 8 bytes = 1 double (=64 bits, standard for 64-bit architectures) ! latency = time for memory access (bus width also relevant) ! speed-ups: • multi-threading CPU’s • larger bus width • clever usage of Cache Computational Astrophysics architectures Computer Architectures ! serial machine = CPU Cache RAM Cache: ! Random Access Memory, i.e. read and write ! built into motherboard next to CPU ! when ‘fetch a[i]’ also ‘fetch a[i+1]’ into Cache (in fact, full lines or pages are “cached”) ! nowadays multiple Cache levels ! bad programming will lead to “Cache misses”: • t = f x tCache + ( 1- f ) x tRAM Cache hit rate Cache access time RAM access time Computational Astrophysics architectures Computer Architectures ! serial machine = CPU Cache RAM Cache: ! Random Access Memory, i.e. read and write ! built into motherboard next to CPU ! when ‘fetch a[i]’ also ‘fetch a[i+1]’ into Cache (in fact, full lines or pages are “cached”) ! nowadays multiple Cache levels ! bad programming will lead to “Cache misses”: • t = f x tCache + ( 1- f ) x tRAM Cache hit rate Cache access time RAM access time f 0.1, t 10ns, t 100ns 91ns example: = Cache = RAM = ⇒ f = 0.9, tCache =10ns, tRAM =100ns ⇒ 19ns € Computational Astrophysics architectures Computer Architectures ! serial machine = CPU Cache RAM Cache: ! Random Access Memory, i.e. read and write ! built into motherboard next to CPU ! when ‘fetch a[i]’ also ‘fetch a[i+1]’ into Cache (in fact, full lines or pages are “cached”) ! nowadays multiple Cache levels ! bad programming will lead to “Cache misses”: • t = f x tCache + ( 1- f ) x tRAM Cache hit rate Cache access time RAM access time f 0.1, t 10ns, t 100ns 91ns example: = Cache = RAM = ⇒ f = 0.9, tCache =10ns, tRAM =100ns ⇒ 19ns € Computational Astrophysics architectures Computer Architectures ! serial machine = CPU Cache RAM Cache: ! Random Access Memory, i.e. read and write ! built into motherboard next to CPU ! when ‘fetch a[i]’ also ‘fetch a[i+1]’ into Cache (in fact, full lines or pages are “cached”) ! nowadays multiple Cache levels ! bad programming will lead to “Cache misses”: • t = f x tCache + ( 1- f ) x tRAM Cache hit rate Cache access time RAM access time f 0.1, t 10ns, t 100ns 91ns example: = Cache = RAM = ⇒ f = 0.9, tCache =10ns, tRAM =100ns ⇒ 19ns € Computational Astrophysics architectures Computer Architectures ! serial machine = CPU Cache RAM Cache: ! Random Access Memory, i.e. read and write ! built into motherboard next to CPU ! when ‘fetch a[i]’ also ‘fetch a[i+1]’ into Cache (in fact, full lines or pages are “cached”) ! nowadays multiple Cache levels ! bad programming will lead to “Cache misses”: • t = f x tCache + ( 1- f ) x tRAM Cache hit rate Cache access time RAM access time f = 0.1, t =10ns, t =100ns ⇒ 91ns example: Cache RAM factor of 4.5! f = 0.9, tCache =10ns, tRAM =100ns ⇒ 19ns € Computational Astrophysics architectures Computer Architectures ! serial machine = CPU Cache RAM Cache: ! Random Access Memory, i.e. read and write ! built into motherboard next to CPU ! when ‘fetch

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    151 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us