Integer Arithmetic and Undefined Behavior in C Brad Karp UCL Computer Science

Carnegie Mellon Integer Arithmetic and Undefined Behavior in C Brad Karp UCL Computer Science CS 3007 23rd January 2018 (lecture notes derived from material from Eddie Kohler, John Regehr, Phil Gibbons, Randy Bryant, and Dave O’Hallaron) 1 Carnegie Mellon Outline: Integer Arithmetic and Undefined Behavior in C ¢ C primitive data types ¢ C integer arithmetic § Unsigned and signed (two’s complement) representations § Maximum and minimum values; conversions and casts § Perils of C integer arithmetic, unsigned and especially signed ¢ Undefined behavior (UB) in C § As defined in the C99 language standard § Consequences for program behavior § Consequences for compiler and optimizer behavior § Examples § Prevalence of UB in real-world Linux application-level code ¢ Recommendations for defensive programming in C 2 Carnegie Mellon Example Data Representations C Data Type Typical 32-bit Typical 64-bit x86-64 char 1 1 1 signed short (default) 2 2 2 and int 4 4 4 unsigned variants long 4 8 8 float 4 4 4 double 8 8 8 pointer 4 8 8 3 Carnegie Mellon Portable C types with Fixed Sizes C Data Type all archs {u}int8_t 1 {u}int16_t 2 {u}int32_t 4 {u}int64_t 8 uintptr_t 4 or 8 Type definitions available in #include <stdint.h> 4 Carnegie Mellon Shift Operations ¢ Left Shift: x << y Argument x 01100010 § Shift bit-vector x left y positions << 3 00010000 – Throw away extra bits on left Log. >> 2 00011000 § Fill with 0’s on right Arith. >> 2 00011000 ¢ Right Shift: x >> y § Shift bit-vector x right y positions Argument 10100010 § Throw away extra bits on right x § Logical shift << 3 00010000 § Fill with 0’s on left Log. >> 2 00101000 § Arithmetic shift Arith. >> 2 11101000 § Replicate most significant bit on left ¢ UnDefineD Behavior (on which more shortly…) § Shift amount < 0 or ≥ worD size 5 Carnegie Mellon Shift Operations ¢ Left Shift: x << y Argument x 01100010 § Shift bit-vector x left y positions << 3 00010000 – Throw away extra bits on left Log. >> 2 00011000 § Fill with 0’s on right Arith. >> 2 00011000 ¢ Right Shift: x >> y § Shift bit-vector x right y positions Argument 10100010 § Throw away extra bits on right x § Logical shift << 3 00010000 § Fill with 0’s on left Log. >> 2 00101000 § Arithmetic shift Arith. >> 2 11101000 § Replicate most significant bit on left ¢ UnDefineD Behavior (on which more shortly…) § Shift amount < 0 or ≥ worD size 6 Carnegie Mellon Shift Operations ¢ Left Shift: x << y Argument x 01100010 § Shift bit-vector x left y positions << 3 00010000 – Throw away extra bits on left Log. >> 2 00011000 § Fill with 0’s on right Arith. >> 2 00011000 ¢ Right Shift: x >> y § Shift bit-vector x right y positions Argument 10100010 § Throw away extra bits on right x § Logical shift << 3 00010000 § Fill with 0’s on left Log. >> 2 00101000 § Arithmetic shift Arith. >> 2 11101000 § Replicate most significant bit on left ¢ UnDefineD Behavior (on which more shortly…) § Shift amount < 0 or ≥ worD size 7 Carnegie Mellon Shift Operations ¢ Left Shift: x << y Argument x 01100010 § Shift bit-vector x left y positions << 3 00010000 – Throw away extra bits on left Log. >> 2 00011000 § Fill with 0’s on right Arith. >> 2 00011000 ¢ Right Shift: x >> y § Shift bit-vector x right y positions Argument 10100010 § Throw away extra bits on right x § Logical shift << 3 00010000 § Fill with 0’s on left Log. >> 2 00101000 § Arithmetic shift Arith. >> 2 11101000 § Replicate most significant bit on left ¢ UnDefineD Behavior (on which more shortly…) § Shift amount < 0 or ≥ worD size 8 Carnegie Mellon Shift Operations ¢ Left Shift: x << y Argument x 01100010 § Shift bit-vector x left y positions << 3 00010000 – Throw away extra bits on left Log. >> 2 00011000 § Fill with 0’s on right Arith. >> 2 00011000 ¢ Right Shift: x >> y § Shift bit-vector x right y positions Argument 10100010 § Throw away extra bits on right x § Logical shift << 3 00010000 § Fill with 0’s on left Log. >> 2 00101000 § Arithmetic shift Arith. >> 2 11101000 § Replicate most significant bit on left ¢ UnDefineD Behavior (on which more shortly…) § Shift amount < 0 or ≥ worD size 9 Carnegie Mellon Shift Operations ¢ Left Shift: x << y Argument x 01100010 § Shift bit-vector x left y positions << 3 00010000 – Throw away extra bits on left Log. >> 2 00011000 § Fill with 0’s on right Arith. >> 2 00011000 ¢ Right Shift: x >> y § Shift bit-vector x right y positions Argument 10100010 § Throw away extra bits on right x § Logical shift << 3 00010000 § Fill with 0’s on left Log. >> 2 00101000 § Arithmetic shift Arith. >> 2 11101000 § Replicate most significant bit on left ¢ UnDefineD Behavior (on which more shortly…) § Shift amount < 0 or ≥ worD size 10 Carnegie Mellon Shift Operations ¢ Left Shift: x << y Argument x 01100010 § Shift bit-vector x left y positions << 3 00010000 – Throw away extra bits on left Log. >> 2 00011000 § Fill with 0’s on right Arith. >> 2 00011000 ¢ Right Shift: x >> y § Shift bit-vector x right y positions Argument 10100010 § Throw away extra bits on right x § Logical shift << 3 00010000 § Fill with 0’s on left Log. >> 2 00101000 § Arithmetic shift Arith. >> 2 11101000 § Replicate most significant bit on left ¢ UnDefineD Behavior (on which more shortly…) § Shift amount < 0 or ≥ worD size 11 Carnegie Mellon Shift Operations ¢ Left Shift: x << y Argument x 01100010 § Shift bit-vector x left y positions << 3 00010000 – Throw away extra bits on left Log. >> 2 00011000 § Fill with 0’s on right Arith. >> 2 00011000 ¢ Right Shift: x >> y § Shift bit-vector x right y positions Argument 10100010 § Throw away extra bits on right x § Logical shift << 3 00010000 § Fill with 0’s on left Log. >> 2 00101000 § Arithmetic shift Arith. >> 2 11101000 § Replicate most significant bit on left ¢ Undefined Behavior (on which more shortly…) § Shift amount < 0 or ≥ word size 12 Carnegie Mellon Shift Operations ¢ Left Shift: x << y Argument x 01100010 § Shift bit-vector x left y positions << 3 00010000 – Throw away extra bits on left Log. >> 2 00011000 § Fill with 0’s on right Arith. >> 2 00011000 ¢ Right Shift: x >> y § Shift bit-vector x right y positions Argument 10100010 § Throw away extra bits on right x § Logical shift << 3 00010000 § Fill with 0’s on left Log. >> 2 00101000 § Arithmetic shift Arith. >> 2 11101000 § Replicate most significant bit on left ¢ Undefined Behavior (on which more shortly…) § Shift amount < 0 or ≥ word size 13 Carnegie Mellon Shift Operations ¢ Left Shift: x << y Argument x 01100010 § Shift bit-vector x left y positions << 3 00010000 – Throw away extra bits on left Log. >> 2 00011000 § Fill with 0’s on right Arith. >> 2 00011000 ¢ Right Shift: x >> y § Shift bit-vector x right y positions Argument 10100010 § Throw away extra bits on right x § Logical shift << 3 00010000 § Fill with 0’s on left Log. >> 2 00101000 § Arithmetic shift Arith. >> 2 11101000 § Replicate most significant bit on left ¢ Undefined Behavior (on which more shortly…) § Shift amount < 0 or ≥ word size 14 Carnegie Mellon Shift Operations ¢ Left Shift: x << y Argument x 01100010 § Shift bit-vector x left y positions << 3 00010000 – Throw away extra bits on left Log. >> 2 00011000 § Fill with 0’s on right Arith. >> 2 00011000 ¢ Right Shift: x >> y § Shift bit-vector x right y positions Argument 10100010 § Throw away extra bits on right x § Logical shift << 3 00010000 § Fill with 0’s on left Log. >> 2 00101000 § Arithmetic shift Arith. >> 2 11101000 § Replicate most significant bit on left ¢ Undefined Behavior (on which more shortly…) § Shift amount < 0 or ≥ word size 15 Carnegie Mellon Shift Operations ¢ Left Shift: x << y Argument x 01100010 § Shift bit-vector x left y positions << 3 00010000 – Throw away extra bits on left Log. >> 2 00011000 § Fill with 0’s on right Arith. >> 2 00011000 ¢ Right Shift: x >> y § Shift bit-vector x right y positions Argument 10100010 § Throw away extra bits on right x § Logical shift << 3 00010000 § Fill with 0’s on left Log. >> 2 00101000 § Arithmetic shift Arith. >> 2 11101000 § Replicate most significant bit on left ¢ Undefined Behavior (on which more shortly…) § Shift amount < 0 or ≥ word size 16 Carnegie Mellon Shift Operations ¢ Left Shift: x << y Argument x 01100010 § Shift bit-vector x left y positions << 3 00010000 – Throw away extra bits on left Log. >> 2 00011000 § Fill with 0’s on right Arith. >> 2 00011000 ¢ Right Shift: x >> y § Shift bit-vector x right y positions Argument 10100010 § Throw away extra bits on right x § Logical shift << 3 00010000 § Fill with 0’s on left Log. >> 2 00101000 § Arithmetic shift Arith. >> 2 11101000 § Replicate most significant bit on left ¢ Undefined Behavior (on which more shortly…) § Shift amount < 0 or ≥ word size 17 Carnegie Mellon Integer Numeric Ranges 18 Carnegie Mellon Integer Numeric Ranges ¢ Unsigned Values § UMin = 0 000…0 § UMax = 2w – 1 111…1 19 Carnegie Mellon Integer Numeric Ranges ¢ Unsigned Values ¢ Two’s Complement Values § UMin = 0 § TMin = –2w–1 000…0 100…0 § UMax = 2w – 1 § TMax = 2w–1 – 1 111…1 011…1 § negative 1 111…1 20 Carnegie Mellon Integer Numeric Ranges ¢ Unsigned Values ¢ Two’s Complement Values § UMin = 0 § TMin = –2w–1 000…0 100…0 § UMax = 2w – 1 § TMax = 2w–1 – 1 111…1 011…1 § negative 1 111…1 Values for word size W = 16 Decimal Hex Binary UMax 65535 FF FF 11111111 11111111 TMax 32767 7F FF 01111111 11111111 TMin -32768 80 00 10000000 00000000 -1 -1 FF FF 11111111 11111111 0 0 00 00 00000000 00000000 21 Carnegie Mellon Integer Ranges for Different Word Sizes W 8 16 32 64 UMax 255 65,535 4,294,967,295 18,446,744,073,709,551,615 TMax 127 32,767 2,147,483,647 9,223,372,036,854,775,807 TMin -128 -32,768 -2,147,483,648 -9,223,372,036,854,775,808 ¢ Observations ¢ C Programming § |TMin | = TMax + 1 § #include <limits.h> § Asymmetric range § Declares constants, e.g., § UMax = 2 * TMax + 1 § ULONG_MAX § LONG_MAX § LONG_MIN, etc.

Integer Arithmetic and Undefined Behavior in C Brad Karp UCL Computer Science

Bit Shifts Bit Operations, Logical Shifts, Arithmetic Shifts, Rotate Shifts

Create Union Variables Difference Between Unions and Structures

Undefined Behaviour in the C Language

Defining the Undefinedness of C

Data Structures, Buffers, and Interprocess Communication

The Art, Science, and Engineering of Fuzzing: a Survey

Bits, Bytes and Integers ‘

Fixed-Size Integer Types Bit Manipulation Integer Types

Advanced Programming Techniques (Amazing C++) Bartosz Mindur Applied Physics and Computer Science Summer School '20

Malloc & Sizeof

Mallocsample from Your SVN Repository

A Cross-Platform Programmer's Calculator