COMS 6500 Notes

Ashlin Harris
15 September 2016

1 FORTRAN Homework

1.1 Quiz

1. Name the five intrinsic data types.

   integer - a number without a fractional or imaginary component
   real - a floating-point number, stored as the nearest machine number
   complex - a number with real and imaginary parts
   character - interpreted as an ASCII symbol
   logical - Boolean; can equal .true. or .false.

2. Write a simple program that makes an integer named answer, assigns it the value 42, and prints it to the screen.

    program answer_to_life
      implicit none
      integer :: answer
      answer = 42
      print *, answer
    end program answer_to_life

3. Write the command to compile your program.

    gfortran file_name.f90 -o executable_file

4. Write the command to execute your program.

    ./executable_file

2 Reading Assignment: Code Complete

Dr. Carroll intentionally chose not to give a quiz today.

3 Pseudocode

Development is never a straightforward process. After all, design is a wicked problem. It involves checking and redesign, even after code is written, so its steps are not sequential.

Code can be constructed from individual routines, parts of a program that each do one set task. As you construct a routine, consider the following questions [1]:

• What does it do?
• What shouldn't it do?
• What is its interface?
• What are its time and space conditions?
• What does it depend on?
• What does it affect?

Notice that all these questions are best answered in plain English. When writing a routine, it is recommended to write pseudocode first, before writing in a programming language. Pseudocode refers to a high-level description of the principles or portions of a computer program. Writing an algorithm in pseudocode facilitates good design principles from the start.

Rather than writing the specific machine instructions, you should express concepts in an informal, natural, human-readable language. Phrases in plain English are preferred, since these will represent the intended behavior rather than a potentially faulty implementation.
Unless you were raised with a computer language as your native tongue, this will reduce errors of translation.

Pseudocode can be implemented in a variety of languages. The level of detail in the pseudocode is increased until implementation in code can be done directly from the pseudocode. Pseudocode should not include actual code. This makes the routine more portable, since what is written can be implemented in any of a number of computer languages. It also ensures changes are made when the project is in its most malleable state, which avoids wasted effort. Moreover, designers may develop an emotional attachment to their written code, even if their work isn't the best solution, so it's best to start right.

Pseudocode is easier to read and maintain than code and can be understood without the need to learn a particular computer language. It enforces a top-down approach to coding. Since concepts must be written down in plain language, it allows a routine to be specified in a clear way and thus reduces the opportunity for errors to arise in the implementation. Finally, pseudocode can be repurposed as comments in the final program.

So, pseudocode can describe at a high level what a developer wants to achieve. It is condensed, so one sentence could stand for a page of code. It should be refined and kept up to date; this tends to be an easier task than modifying the actual code on its own.

[1] As a general rule, when writing and testing code, write the test first. If you know what is expected, you can more easily craft a solution.

4 FORTRAN

4.1 General rules

• Begin your programs with implicit none. This disables implicit typing, under which an undeclared variable gets its type from the first letter of its name (names beginning with i through n become integers, all others reals). Unless you know which variables are affected, implicit typing may lead to unwanted behavior in your program. [2] For instance, what is intended to be a real number may be typed as an integer and be subject to unwanted truncation.
• Initialize every variable you use. It's just good defensive programming.
  Some languages reuse memory locations without clearing out the contents, so reading an uninitialized variable can lead to undefined behavior.
• Multiple variables of the same type can be declared on the same line.
• Use integer, parameter for a constant integer. If your variable shouldn't change, make changing it impossible.
• Use descriptive variable names to improve readability. Single-letter variable names should only be used for mathematical objects and one-off iterators.

[2] A friend of Dr. Carroll once discovered a $92 trillion credit in a PayPal account. This may have been caused by an unintended negative number represented as an unsigned number. Whether this is an example of unwanted program behavior is a matter of perspective.

4.2 Variable storage

RAM was at a premium when FORTRAN was developed, so the programmer has fine control over the size of variables. Integers are the easiest to understand and assign space in memory. Reals can be assigned space using the FORTRAN built-in function selected_real_kind(man, exp), in which man is the required number of significant decimal digits in the mantissa and exp is the required decimal exponent range. FORTRAN returns the kind of the smallest real type that meets both requirements, and that kind determines how many bytes are needed to store the variable.

4.3 Intrinsic data types

FORTRAN is a statically typed language, in contrast to dynamically typed languages you have already seen such as Python and Bash. The results of operations may differ depending on how variables are declared. For instance, consider the following program:

    program integer_vs_real
      implicit none
      integer :: i
      real :: r
      i = 10.5
      r = 10.5
      print *, i
      print *, r
      print *, i/3
      print *, r/3
    end program integer_vs_real

Obviously, an integer type cannot store a fractional component. So, when i is assigned a value of 10.5, the fraction is truncated and i has a value of 10. More surprising is the second set of results, unless you are familiar with integer division. In FORTRAN, as in many languages, when two integers are divided, the result is another integer. So, 10/3 returns a value of 3.
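The effect of integer division, and one way around it, can be sketched as follows. This is only a minimal illustration: the program name division_demo is hypothetical, and real() is the standard intrinsic that converts its argument to a real.

```fortran
program division_demo
  implicit none
  integer :: i
  real :: r

  i = 10
  r = 10.5

  ! Integer divided by integer: the fractional part is discarded.
  print *, i / 3          ! 3

  ! Converting one operand to real forces real division.
  print *, real(i) / 3    ! approximately 3.3333333

  ! A real operand already gives a real result.
  print *, r / 3          ! 3.5
end program division_demo
```

When one operand is real and the other is an integer, the integer is converted to real before the division, which is why the last two lines keep their fractional parts.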

4.3.1 Integer

Integers are numbers with no fractional or imaginary component. They are typically 4 or 8 bytes in size.

4.3.2 Real

The IEEE Standard for Floating-Point Arithmetic (IEEE 754) sets the most widely used floating-point standard, one that originally struck a balance between the ease of its hardware implementation and the usefulness of the number representation. The machine representation of a floating-point number is divided into 3 parts:

   sign bit - 0 if the number is positive, 1 if negative
   significand - the most significant digits of the number; corresponds to the mantissa
   exponent - indicates the location of the binary point (hence the name floating-point)

A single precision floating-point number occupies 4 bytes, or 32 bits; the bit sequence is allotted as 1 sign bit, followed by 8 exponent bits, then 23 significand bits. A double precision floating-point number occupies 8 bytes, or 64 bits; the bit sequence is allotted as 1 sign bit, followed by 11 exponent bits, then 52 significand bits. Both types store real numbers in a similar way.

As an example, take the number $\pi$. The integer portion $\lfloor \pi \rfloor = 3$ is easily expressed in binary ($3_{10} = 11_2$). In base 10, digits after the decimal point represent multiples of $10^{-1}, 10^{-2}, 10^{-3}, \ldots$. Similarly, digits after the binary point represent multiples of $2^{-1}, 2^{-2}, 2^{-3}, \ldots$ in base 2. So, $\pi$ can be expressed as

    $11.001001000011111101101010100010\ldots_2$

The number is positive, so there is no negative sign. The sign bit is therefore 0.

Numbers are normalized (as in scientific notation) so that the mantissa is at least 1 but less than $10_2$. So $\pi$ can be normalized as

    $1.1001001000011111101101010100010\ldots_2 \times 10_2^{1}$

In binary, the only digits possible are 0 and 1, so any nonzero number can be normalized with 1 at the front without loss of significant digits. To save space and thereby increase precision, the first 1 is assumed in the mantissa and is not included in the significand. Still, the significand only has room for 23 bits, so the value is rounded at the final bit.
Most real numbers cannot be represented exactly, so they are represented as the nearest machine number. So, the significand of $\pi$ is actually recorded as

    10010010000111111011011

As for the exponent (1 in this case), it must also be recorded. The exponent for a normalized number could potentially be negative, but there is no sign bit for the exponent. [3] Instead, the exponent is biased by $127_{10}$, or $01111111_2$. So, an exponent field of 00000001 is interpreted as $-126_{10}$, 01111111 is interpreted as 0, and in our case, the exponent 1 is recorded as 10000000. The fields 00000000 and 11111111 are set aside for the special numbers described below. As a whole, $\pi$ is stored as sign bit, exponent, then significand:

    0 10000000 10010010000111111011011

If you want a real that matches IEEE 754 single precision, you could declare it using a FORTRAN built-in function like so:

    integer, parameter :: sp = selected_real_kind(6, 37)
    real (kind=sp) :: var

The first line requests the smallest real kind with at least 6 significant decimal digits and a decimal exponent range of at least 37, which is single precision on compilers that use IEEE 754. Asking for more than that, for example 7 digits, would select double precision on such compilers. The constant sp holds the kind value chosen by FORTRAN; with gfortran, this value equals the size of the variable in bytes.

Under IEEE 754, FORTRAN reserves some reals as special numbers. These include +0 and -0, denormalized numbers and NaN, and Inf and -Inf. These reserved values help reduce program failure and make error handling more consistent.

[3] The bias makes comparison and conversion of exponents simpler for the machine. In general, the bias is $2^{k-1} - 1$, where $k$ is the number of bits in the exponent field.

4.3.3 Character

Character variables can store one or more ASCII characters. They can be declared with lines such as

    character (len=7) :: lastname = 'Jones'

in which len is the maximum length of the string in characters.
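The difference between the declared length and the stored text can be sketched with the declaration above. This is a minimal illustration under assumptions: the program name character_demo is hypothetical, while len, len_trim, and trim are standard intrinsics.

```fortran
program character_demo
  implicit none
  character (len=7) :: lastname = 'Jones'

  ! len reports the declared length; shorter values are padded with blanks.
  print *, len(lastname)                  ! 7

  ! len_trim ignores the trailing blanks.
  print *, len_trim(lastname)             ! 5

  ! Brackets make the padding visible.
  print *, '[' // lastname // ']'         ! [Jones  ]
  print *, '[' // trim(lastname) // ']'   ! [Jones]
end program character_demo
```

A value longer than len characters would be cut off on assignment, so it is worth choosing len with the longest expected string in mind.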
