
Table 1: Values Represented by Bit Patterns in IEEE Single Format

Single-Format Bit Pattern                                 Value
0 < e < 255                                               (-1)^s × 2^(e-127) × 1.f (normal numbers)
e = 0; f ≠ 0 (at least one bit in f is nonzero)           (-1)^s × 2^(-126) × 0.f (subnormal numbers)
e = 0; f = 0 (all bits in f are zero)                     (-1)^s × 0.0 (signed zero)
s = 0; e = 255; f = 0 (all bits in f are zero)            +INF (positive infinity)
s = 1; e = 255; f = 0 (all bits in f are zero)            -INF (negative infinity)
s = u; e = 255; f ≠ 0 (at least one bit in f is nonzero)  NaN (Not-a-Number)

Bit Patterns in Single-Storage Format and their IEEE Values

Common Name                        Bit Pattern (Hex)  Decimal Value
+0                                 00000000           0.0
-0                                 80000000           -0.0
1                                  3f800000           1.0
2                                  40000000           2.0
maximum normal number              7f7fffff           3.40282347e+38
minimum positive normal number     00800000           1.17549435e-38
maximum subnormal number           007fffff           1.17549421e-38
minimum positive subnormal number  00000001           1.40129846e-45
+∞                                 7f800000           Infinity
−∞                                 ff800000           -Infinity
Not-a-Number                       7fc00000           NaN

Table 2: Values Represented by Bit Patterns in IEEE Double Format

Double-Format Bit Pattern                                  Value
0 < e < 2047                                               (-1)^s × 2^(e-1023) × 1.f (normal numbers)
e = 0; f ≠ 0 (at least one bit in f is nonzero)            (-1)^s × 2^(-1022) × 0.f (subnormal numbers)
e = 0; f = 0 (all bits in f are zero)                      (-1)^s × 0.0 (signed zero)
s = 0; e = 2047; f = 0 (all bits in f are zero)            +INF (positive infinity)
s = 1; e = 2047; f = 0 (all bits in f are zero)            -INF (negative infinity)
s = u; e = 2047; f ≠ 0 (at least one bit in f is nonzero)  NaN (Not-a-Number)

Bit Patterns in Double-Storage Format and their IEEE Values

Common Name                    Bit Pattern (Hex)  Decimal Value
+0                             00000000 00000000  0.0
-0                             80000000 00000000  -0.0
1                              3ff00000 00000000  1.0
2                              40000000 00000000  2.0
max normal number              7fefffff ffffffff  1.7976931348623157e+308
min positive normal number     00100000 00000000  2.2250738585072014e-308
max subnormal number           000fffff ffffffff  2.2250738585072009e-308
min positive subnormal number  00000000 00000001  4.9406564584124654e-324
+∞                             7ff00000 00000000  Infinity
−∞                             fff00000 00000000  -Infinity
Not-a-Number                   7ff80000 00000000  NaN

Table 3: Double-Extended Bit Pattern () Value j = 0, 0

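#include <stdio.h>

int main(void) {
    float y, z;  /* assumed float: the output below shows single-precision rounding */

    y = 838861.2; z = 1.3;

    printf("y: %18.11f\n", y);
    printf("z: %18.11f\n", z);

    return 0;
}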

The output from this program should be similar to:

y: 838861.18750000000
z:      1.29999995232

Range and Precision of Storage Formats

Format                   Sig Digits (Binary)  Smallest Pos          Largest Pos           Sig Digits (Decimal)
single                   24                   1.175... × 10^-38     3.402... × 10^+38     6-9
double                   53                   2.225... × 10^-308    1.797... × 10^+308    15-17
double extended (x86)    64                   3.362... × 10^-4932   1.189... × 10^+4932   18-21
double extended (SPARC)  113                  3.362... × 10^-4932   1.189... × 10^+4932   33-36

Standards: POSIX, BSD 4.3, ISO 9899

acos   arccosine, returns value in [0, π]
asin   arcsine, returns value in [−π/2, π/2]
atan   arctangent, returns value in [−π/2, π/2]
atan2  takes y and x to break degeneracy in atan(y/x)
ceil   smallest integral value not less than x
cos    Cosine
cosh   Hyperbolic cosine
exp    Exponentiate
fabs   absolute value of floating-point number
floor  largest integral value not greater than x
fmod   floating-point remainder
frexp  convert floating-point number to fractional and integral components
ldexp  multiply floating-point number by integral power of 2
log    Natural log
log10  Log base ten
modf   extract signed integral and fractional values from floating-point number
pow    Raise number to a power
sin    Sine
sinh   Hyperbolic sine
sqrt   Square root of a number
tan    Tangent
tanh   Hyperbolic tangent