Fault Tolerance and Security

Heechul Yun

Safety Failures in CPS

Therac-25
  • Computer-controlled medical X-ray treatments
  • Six people died or were injured due to massive overdoses (1985-1987)
  • Caused by synchronization mistakes

Ariane 5
  • 7 billion dollar rocket was destroyed about 40 seconds into flight (June 4, 1996)
  • “caused by the complete loss of guidance and attitude information”
  • Caused by a 64-bit floating-point to 16-bit integer conversion

Safety Failures in CPS

  • http://www.nytimes.com/2015/01/28/us/white-house-drone.html
  • http://petapixel.com/2015/12/23/crashing-camera-drone-narrowly-misses-top-skiier/
  • http://rochester.nydatabases.com/map/domestic-drone-accidents
  • http://www.nytimes.com/interactive/2016/07/01/business/inside-tesla-accident.html

Failures in CPS have consequences.

Safety Threats in CPS

  • Cyber system (i.e., computer)
    – Software bugs
      • Logical, temporal
    – Hardware bugs
      • Permanent, transient
  • Physical system (i.e., plant)
    – Sensor inaccuracies
    – Actuator malfunctions/physical damage

Safety Challenges: Software

  • Increasing complexity
    – E.g., Linux: > 15M SLOC (https://www.quora.com/How-many-lines-of-code-are-in-the-Linux-kernel)
  • Concurrency
    – Multithreading is hard
  • Timing unpredictability
    – Shared resource contention affects timing

Software bugs are hard to weed out.

Safety Challenges: Hardware

  • Hardware bugs
    – Pentium floating-point bug (FDIV bug), circa 1994
    – Intel CPU bugs in 2015: http://danluu.com/cpu-bugs/
      • “Certain Combinations of AVX Instructions May Cause Unpredictable System Behavior”
      • “Processor May Experience a Spurious LLC-Related Machine Check During Periods of High Activity”
      • …
  • Transient hardware faults (soft errors)
    – Single event upset (SEU) in SRAM and logic
      • Due to alpha particles and cosmic radiation
    – Manifest as software failures
      • Crashes, wrong output: silent data corruption
    – Bigger problem in advanced CPUs
      • Increased density and frequency lead to a higher soft error rate

http://www.cotsjournalonline.com/articles/view/102279

Safety Challenges: Hardware

  • SRAM soft error rate (SER) vs. technology scaling
    – Per-bit SER decreases
    – Per-chip SER increases (due to higher density)

Ibe et al., “Scaling Effects on Neutron-Induced Soft Error in SRAMs Down to 22nm Process” (Hitachi)

Security Challenges: Software

  • Insecure software in CPS leads to safety hazards
  • Stuxnet: first reported case of cyber warfare, targeting Iranian nuclear plants (destroying centrifuges)
  • Vermont power grid hack by Russia
  • Remote hack into cars (Jeep)
  • Police drone hacking
  • Sensor hacking: GPS spoofing, IMU spoofing

Security Challenges: Hardware

  • Disturbance errors in DRAM (*)
    – a.k.a. the Row Hammer bug
    – Repeatedly opening and closing a DRAM row can cause bit flips in adjacent rows
    – Found in more than 80% of DRAM modules from 2010-2013
  • Google demonstrated a successful attack that exploits the bug (**)
    – Manipulates page tables from user level

(*) Yoongu Kim et al., “Flipping Bits in Memory Without Accessing Them: An Experimental Study of DRAM Disturbance Errors,” ISCA'14
(**) Google Project Zero, “Exploiting the DRAM rowhammer bug to gain kernel privileges,” 2015

[Figure: DRAM chip showing rows of cells and wordlines, with an aggressor row that is repeatedly opened and closed (wordline voltage toggling high/low) between two victim rows.]
Repeatedly opening and closing a row induces disturbance errors in adjacent rows.
This slide is from Dr. Yoongu Kim's ISCA 2014 presentation.

How to Improve Safety of a System?

  • Correct by design
    – Software development based on formal methods
      • Difficult for a complex system
    – Use reliable hardware
      • E.g., radiation-hardened processors
      • Expensive and low performance
  • Deal with failures
    – Run-time monitoring and redundancy

This Week: Fault Tolerance

  • A Simplex Architecture for Intelligent and Safe Unmanned Aerial Vehicles, RTCSA'16
  • Application and System-Level Software Fault Tolerance Through Full System Restarts, ICCPS'17 (optional)
  • SAFER: System-level Architecture for Failure Evasion in Real-time Applications, RTSS'12
  • ROS: an open-source Robot Operating System, ICRA OSS'09 (optional)

Next Week: Security

  • Flipping Bits in Memory Without Accessing Them: An Experimental Study of DRAM Disturbance Errors, ISCA'14
  • Drammer: Deterministic Rowhammer Attacks on Mobile Platforms, CCS'16 [blog] (optional)
  • Drone hack: Spoofing attack demonstration on a civilian unmanned aerial vehicle, GPS World, August 2012
  • Comprehensive Experimental Analyses of Automotive Attack Surfaces, USENIX Security'11 (optional)
  • Rocking Drones with Intentional Sound Noise on Gyroscopic Sensors, USENIX Security'15 (optional)

Fault Tolerance

  • Goal: logical correctness
  • Threats
    – Computer systems
      • Software bugs
      • Hardware bugs
    – Physical systems
      • Sensor inaccuracies
      • Actuator malfunctions
  • Approaches
    – Redundancy
      • TMR, n-versioning
    – Hardening

RowHammer
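The redundancy approach named on the Fault Tolerance slide (TMR, triple modular redundancy) can be illustrated with a small majority voter. The sketch below is not taken from the slides or the assigned papers; the names `run_tmr`, `majority_vote`, `square`, `faulty_square`, and `crashing_square` are hypothetical, and it assumes replicas return hashable, non-None values that can be compared for exact equality.

```python
from collections import Counter
from typing import Callable, Optional, Sequence

_FAILED = object()  # sentinel marking a replica that crashed (assumption of this sketch)


def majority_vote(outputs: Sequence[object], n_replicas: int) -> Optional[object]:
    """Return the value agreed on by a strict majority of all replicas, else None."""
    votes = Counter(o for o in outputs if o is not _FAILED)
    if not votes:
        return None
    value, count = votes.most_common(1)[0]
    return value if count > n_replicas // 2 else None


def run_tmr(replicas: Sequence[Callable[[int], int]], x: int) -> Optional[int]:
    """Run every replica on the same input and vote on the results."""
    outputs = []
    for replica in replicas:
        try:
            outputs.append(replica(x))
        except Exception:
            # A crashed replica casts no vote but still counts toward the total.
            outputs.append(_FAILED)
    return majority_vote(outputs, len(replicas))


# Hypothetical replicas: one correct, one with a wrong-output bug, one that crashes.
def square(x: int) -> int:
    return x * x


def faulty_square(x: int) -> int:
    return x * x + 1  # models silent data corruption


def crashing_square(x: int) -> int:
    raise RuntimeError("replica crashed")


if __name__ == "__main__":
    print(run_tmr([square, square, faulty_square], 4))           # one wrong output is masked
    print(run_tmr([square, crashing_square, square], 4))         # one crash is masked
    print(run_tmr([square, faulty_square, crashing_square], 4))  # no majority: None
```

With three replicas the voter masks any single fault, whether a crash or a wrong output, but it cannot pick a winner once two replicas fail in different ways. A real system would also need the replicas to fail independently, which is the motivation for n-versioning.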
