Comp.211{} Computer Application

Contents
Introduction to Computers
Components of Computer
System Concept
Classification of Computers
  1. Classification based on Computer Generations
  2. Classification based on Computer Size
  3. Classification based on Computer Technology
  4. Classification based on Purpose
Types of Memories
Input Output Devices
  1) Output devices
    1.1) Screen
      1.1.1) CRT (Cathode Ray Tube)
      1.1.2) LCD (Liquid Crystal Display) screen
    1.2) Printers
      1.2.1) Character printers
      1.2.2) Line printers
      1.2.3) Page printers – LASER Printers
    1.3) Plotters
  2) Input devices
    2.1) Keyboard
    2.2) Mouse
    2.3) Scanners
    2.4) Magnetic Ink Character Reader (MICR)
Storage and representation of numbers
  sign-and-magnitude
  ones' complement
  two's complement
  EXCESS-
  Calculating two's complement
  Alternative conversion process
INFORMATION STORAGE CODES
  ASCII
  Binary-coded decimal
  EBCDIC
Programming Languages
  Machine Language
  Assembly language
  High level languages


Introduction to Computers
The name comes from the word "compute", which reflects the original goal behind the development of computers: to build a very fast calculating machine.

Def. A computer is an electronic device that operates upon information. It can store, process and retrieve data as and when required. What a computer does is limited to the program stored in it. It is also known as a data processor.

Characteristics of Computers:
1. Speed: The units used to measure the speed of a computer are microseconds (10^-6 s), nanoseconds (10^-9 s) and picoseconds (10^-12 s). A computer can perform around 3-4 million simple operations per second.

2. Accuracy: The accuracy of a computer is very high, and the degree of accuracy depends on its design (hardware and software design). Errors, if any, are generally caused by faulty programming, input/output errors or erroneous data entered by humans.

3. Diligence: Computers are very hardworking and do not suffer from monotony, tiredness or lack of coordination. A computer can work for hours on the same task or on different tasks.

4. Versatility: A computer can perform any task that can be reduced to logical steps. It can do multiple tasks in such a short time that a person cannot tell when each one was completed.

5. Power of remembering: A computer can store and recall any type of information that can be represented as binary data, whether important or general. Because of secondary storage devices, the information can be kept for as long as required and retrieved at any time, and the amount of storage available is practically unlimited.

6. No I.Q.: A computer cannot do anything on its own. It can perform only those tasks which it is programmed to do.

7. No feelings: The emotional feelings are absent in computers.

Components of Computer
There are five basic components of a computer. The classification is based on the basic operations performed by the computer.
1. Input: Enter data and instructions.
2. Storing: Save data and instructions so that they can be retrieved when required.
3. Processing: Perform arithmetic and logical operations on data to produce information.
4. Output: Present information in human readable form.
5. Controlling: Direct the manner or sequence in which all operations are performed.
The architectural designs differ between types of computers (such as Intel or Macintosh), but the organization of all computers is generally similar. There are five units:
1. Input Unit: Converts data from human readable form to machine readable form. This is done by the input interface.
2. Output Unit: Converts data from binary (machine readable) form to human readable form. This is done by the output interface.
3. Memory Unit: This unit stores data. It is responsible for storing all data to be processed and the instructions to be executed. It also stores intermediate results and the final results before they are sent to the output device.
4. Arithmetic and Logical Unit: All the arithmetic and logical operations are performed in this unit.


5. Control Unit: The control unit coordinates and controls all the operations that are performed by the computer. It doesn't perform any actual processing. It acts as the central nervous system.
ALU + CU = Central Processing Unit

System Concept: A system is a group of integrated parts that work together to achieve common goals. A system has three characteristics:
1. There is more than one element.
2. All elements are logically related to, and interdependent on, each other.
3. All elements are controlled in such a way that the system goal is achieved.

Classification of Computers
Computers can be classified according to:
1. Computer generations.
2. Size of computers.
3. Technology type.
4. Purpose.

1. Classification based on Computer Generations
• 1st Generation Computers
  - Machines that were developed on the John von Neumann architecture.
  - 1942-1955
  - Vacuum tubes were used.
  - Programming language was machine language or assembly language.
  - Slow speed.
  - High power consumption.
  - Huge space required.
  - Poor programming capabilities.
  - ENIAC
• 2nd Generation Computers
  - Transistors were used instead of vacuum tubes.
  - 1950-1960.
  - Introduction of stored program concept.
  - A new industry, the software industry, was emerging.
  - Memory was around 100 KB.
  - High level languages like COBOL and FORTRAN were developed.
  - IBM 1620, IBM 1401
• 3rd Generation Computers
  - Use of Integrated Circuits, hence the size was reduced.
  - 1964-1975
  - The concept of an operating system was emerging.
  - Memory was up to 4 MB.
  - They were also called MINI computers.
  - Intelligent compilers and multiprogramming concepts were implemented.
  - Solid state memories (diodes and transistors) were used.
  - IBM 360.
• 4th Generation Computers
  - Very Large Scale Integration (VLSI): around 50,000 transistors on a single chip.
  - 1975 onwards.
  - Improvements in , memories and compilers.
  - Creation of user friendly applications like word processing and spreadsheets.
  - Improved multiprocessing capabilities.
  - PCs were introduced in the 1980s for use in homes and schools.
  - Later years marked the emergence of the concept of networking.
• 5th Generation Computers
  - AI, expert systems etc.; these are yet to come.

2. Classification based on Computer Size
Basic parameters are:
• Speed: The number of instructions executed in one second. It is of the order of kilo instructions per second (kips) or million instructions per second (mips).
• Word size or word length: The basic unit in which information is stored. It is usually measured in bits or bytes. The longer the word length, the larger the computer.
• Size of main memory: The addressable part of main memory, i.e. the part that is available to the user. It is expressed in kilobytes (KB) or megabytes (MB).
• Peripheral devices: The input and output devices used.
• Software support: The programs and applications supported.

Microcomputer
  Speed: 100 kips
  Word length: 8 to 16 bits
  Memory size: 64 KB - 640 KB
  Peripherals: Floppy, keyboard, video terminals
  Application areas: Word processing, graphics etc.
  Machines: IBM PC, Apple

Mini computers
  Speed: 500 kips
  Word length: 16 bits
  Memory size: 256 KB - 1 MB
  Peripherals: Hard disk, magnetic tapes, plotters and line printers
  Application areas: Engg. & scientific research, process control in factories
  Machines: IBM SYS/3

Mainframe computers
  Speed: 1 mips
  Word length: 32 to 60 bits
  Memory size: 2 - 128 MB
  Peripherals: 1000 MB hard disk, faster line printers, mini computers as front-end processors
  Application areas: Central host computers in distributed systems, complex engg. designs
  Machines: IBM 370/168

Super computers
  Speed: 100 mips
  Word length: 64-96 bits
  Memory size: 8 - 512 MB
  Peripherals: 1000 MB hard disk, faster laser printers, mini computers as front-end processors
  Application areas: Weather prediction, design of complex machines, space research
  Machines: CRAY-II, CYBER-205

3. Classification based on Computer Technology
• Digital Computers
  - Work with digits and hence are counting devices.
  - Binary language is used and the circuits are designed to interpret binary.
  - Information is fed in as discrete electrical signals.
• Analog Computers
  - Work by measuring voltage and current rather than counting.
  - Take continuous input and produce continuous output.
  - Used for solving any problem that can be represented as a differential equation. Their accuracy is limited but they are very fast.
• Hybrid Computers
  - Combine the best features of both technologies, e.g. MODEM.
  - Capable of processing both analog and digital data.
  - Generally used for scientific research and industrial control applications.

4. Classification based on Purpose
• Special Purpose Computers
  - The computer is designed for a particular type of application only. These special purpose computer systems are meant to handle a very narrow category of data processing activities, e.g. process control computers in industries, air traffic controllers, robots etc. A special purpose computer is therefore defined as a stored program digital computer whose architecture is oriented towards one or more specific applications.
• General Purpose Computers
  - The computer is designed to solve a wide variety of problems from different fields. The hardware and the software of such systems are adaptable to totally different environments. Most of the computers used today are general purpose computers.


Types of Memories
Storage units are ranked according to the following criteria:
Access time: The time required to access the contents of the memory. Generally it is desired that the access time should be minimal.
Storage capacity: The amount of data that can be stored in the storage unit. A large capacity is desired.
Cost per bit of storage: This cost should be minimal.

Based on these criteria, memory is classified as primary and secondary. Primary memory has a shorter access time, a smaller storage capacity and a higher cost per bit of storage. The classification of the various types of primary memory generally depends on the ability to write to the memory, not on the mode of accessing its contents. This is because all primary memories are random access memories and do not involve any mechanical motion to access the data. Random access refers to the property that the time to access the nth location is independent of n. In other words, the time to access the 1st memory location is the same as that for any other location. (Compare this with an audio cassette: is the time required to reach the first song the same as that for the 4th song?) Different types of primary memories are:

RAM: Random Access Memory: Temporary storage; the contents are lost when the power goes off. It is writable memory and can be easily erased.

ROM: Read Only Memory: It is a permanent memory and its contents cannot be altered in any way. It is not erased when the power is switched off. Since the contents of this memory are not lost, it is used to store micro programs. These are very small programs that perform low level machine operations.

PROM: Programmable Read Only Memory: ROM chips are used to store micro programs, but they cannot be altered. One modification is to give the user the option of customizing the micro programs (or data). In other words, a one-time write facility is provided in these chips. Once written, they function in the same way as ROM chips.

EPROM: Erasable & Programmable Read Only Memory: The information in a ROM or a PROM chip cannot be altered. There is, however, one type of chip whose contents can be erased. Memories of this type, which are permanent in nature and yet can be erased, are called EPROM. Generally they are erased by exposing the chip to ultraviolet light.

Cache Memory: The speed of the CPU is very high compared to that of memory. This mismatch causes the CPU to work at a reduced speed (approximately that of memory). A workaround is to have memory whose speed matches that of the CPU. This memory is called cache memory and is very expensive. Caches are generally a few kilobytes in size. The cache is used internally by the system and acts as a very high speed buffer. It is not possible for the user to use this memory (or view its contents) from a program.
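The buffering role of a cache can be illustrated with a toy model. This is only a sketch under stated assumptions: the class name CachedMemory, the four-entry capacity, the "evict the oldest entry" rule and the sample addresses are all invented for illustration, and the notes do not say how a real cache decides what to keep.

```python
from collections import OrderedDict


class CachedMemory:
    """Toy model: slow main memory fronted by a small, fast cache."""

    def __init__(self, main_memory, cache_size=4):
        self.main_memory = main_memory   # list standing in for main memory
        self.cache_size = cache_size     # number of locations the cache holds
        self.cache = OrderedDict()       # address -> value, oldest entry first
        self.hits = 0                    # reads served by the cache
        self.misses = 0                  # reads that had to go to main memory

    def read(self, address):
        if address in self.cache:
            self.hits += 1
            return self.cache[address]
        self.misses += 1
        value = self.main_memory[address]
        if len(self.cache) >= self.cache_size:
            # Assumed policy: drop the entry that was loaded first.
            self.cache.popitem(last=False)
        self.cache[address] = value
        return value


memory = CachedMemory(main_memory=list(range(100, 116)), cache_size=4)
for address in [0, 1, 2, 0, 1, 2, 0, 1, 2]:
    memory.read(address)
print("hits:", memory.hits, "misses:", memory.misses)
```

Only the first read of each address goes to the slow memory; repeated reads of the same few locations are served from the cache, which is why a small cache can hide much of the speed mismatch.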

Registers: Data and instructions move from memory to the CPU and vice versa. This is done with the help of special memory called registers, from which the CPU takes its data and instructions. Registers are not part of the main memory and retain information on a temporary basis. The length of a register is equal to the number of bits it can store; 8-, 16- and 32-bit registers are common. There are specific registers for specific purposes:
Memory Address Register (MAR): It contains the address of the current memory location.
Memory Buffer Register (MBR): It holds the contents of the current memory location, just read or about to be written.
Program Counter (PC): This register contains the address of the next program instruction. It is normally incremented as the program executes sequentially, but transfer of control (jump) statements modify its contents.
Accumulator Register (AR): This register holds the initial (or intermediate) data used in various calculations. E.g. if we are calculating result = a + b * c, AR will hold the intermediate result of b * c, to which a will subsequently be added.
Instruction Register (IR): This holds the current instruction being executed by the CPU.
Input-Output Register (I/O Register): It holds the data read from or to be written to the I/O devices.
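The way these registers cooperate can be sketched with a very small simulated machine that evaluates result = a + b * c. Everything specific here is an assumption made for illustration: the instruction names (LOAD, MUL, ADD, STORE, JUMP, HALT), the (opcode, address) instruction format and the memory layout do not come from any particular processor.

```python
def run(memory):
    """Execute the program stored in `memory` and return the memory afterwards."""
    pc = 0   # Program Counter: address of the next instruction
    ar = 0   # Accumulator Register: initial and intermediate results
    while True:
        # Fetch: MAR holds the address, MBR the word read from that address.
        mar = pc
        mbr = memory[mar]
        ir = mbr          # Instruction Register: instruction being executed
        pc += 1           # sequential execution: move on to the next instruction

        opcode, operand = ir
        if opcode == "HALT":
            return memory
        if opcode == "JUMP":
            pc = operand  # transfer of control overwrites the PC
            continue

        mar = operand     # address of the data word
        if opcode == "STORE":
            mbr = ar      # word about to be written
            memory[mar] = mbr
            continue

        mbr = memory[mar]  # word just read
        if opcode == "LOAD":
            ar = mbr
        elif opcode == "MUL":
            ar = ar * mbr
        elif opcode == "ADD":
            ar = ar + mbr
        else:
            raise ValueError(f"unknown opcode: {opcode}")


program = [
    ("LOAD", 6),     # 0: AR = b
    ("MUL", 7),      # 1: AR = b * c   (intermediate result)
    ("ADD", 5),      # 2: AR = a + b * c
    ("STORE", 8),    # 3: result = AR
    ("HALT", None),  # 4: stop
    2,               # 5: a
    3,               # 6: b
    4,               # 7: c
    0,               # 8: result
]
final = run(program)
print("result =", final[8])
```

With a = 2, b = 3 and c = 4, the accumulator holds 12 after the MUL step and 14 after the ADD step, matching the description of AR above.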


Input Output Devices

1) Output devices
Various output devices are:
1.1) Screen
A screen is the most common output device. It is also known as the standard output device. There are two types of screens based on the technology.

1.1.1) CRT (Cathode Ray Tube)
A beam of electrons emitted by an electron gun passes through focusing and deflection systems that direct the beam toward specified positions on the phosphor-coated screen. The phosphor then emits a small spot of light at each position contacted by the electron beam. Because the light emitted by the phosphor fades very rapidly, some method is needed for maintaining the screen picture; this is done by redrawing the picture repeatedly, quickly directing the electron beam back over the same points. This type of display is called a refresh CRT.

Different kinds of phosphors are available for use in a CRT. Besides color, a major difference between phosphors is persistence: how long they continue to emit light after the electron beam is removed. Persistence is defined as the time it takes the emitted light from the screen to decay to one tenth of its original intensity. This means that high persistence phosphors need lower refresh rates, and vice versa.

Cutaway rendering of a color CRT (source http://en.wikipedia.org/wiki/Cathode_ray_tube):
1. Electron guns
2. Electron beams
3. Focusing coils
4. Deflection coils
5. Anode connection
6. Mask for separating beams for red, green, and blue part of displayed image
7. Phosphor layer with red, green, and blue zones
8. Close-up of the phosphor-coated inner side of the screen

The maximum number of points that can be displayed without overlap on a CRT is referred to as the resolution. A more precise definition is the number of points per centimeter that can be plotted horizontally and vertically, although it is often stated as the total number of points in each direction. Another property of CRT monitors is the aspect ratio. This number gives the ratio of vertical points to horizontal points necessary to produce lines of equal length in both directions on the screen. An aspect ratio of 3/4 means that a vertical line plotted with three points has the same length as a horizontal line plotted with four points.

Color monitors use a combination of phosphors that emit different-colored light. By combining the emitted light from different phosphors, a range of colors can be generated.

Two different approaches for implementing color CRT are:

Beam Penetration method: Two layers of phosphor, usually red and green, are coated onto the inside of the CRT screen, and the displayed color depends on how deep the electron beam penetrates into the phosphor layers. A beam of slow electrons excites only the outer red layer. A beam of faster electrons penetrates the red layer and excites the inner green layer. Intermediate beam speeds produce orange and yellow. This is a simple and inexpensive method of producing colors, but its disadvantage is that only four colors are possible. It was widely used in earlier monitors.

Shadow Mask: A shadow mask CRT has three phosphor dots at each pixel position. One phosphor dot emits red light, another green and the third blue. The CRT has three electron guns, one for each color, and a shadow mask grid just behind the phosphor screen. When the three beams pass through a hole in the shadow mask, they activate a dot triangle which appears as a small color spot on the screen. The phosphor dots in the triangles are arranged so that each electron beam can activate only its corresponding color dot when it passes through the shadow mask.

1.1.2) LCD (Liquid Crystal Display) screen
LCDs are non-emissive devices that produce a picture by passing polarized light from the surroundings or from an internal light source through a liquid-crystal material that can be aligned to either block or transmit the light. The term liquid crystal refers to the fact that these compounds have a crystalline arrangement of molecules yet flow like a liquid.

1.2) Printers
Printers are among the most commonly used output devices and can be found attached to almost all computers. They are the primary output devices for producing documents in human readable form as hard copy. There are several types of printers designed for different applications. Depending on their speed and approach to printing, they are classified as:

1.2.1) Character printers
These printers print only one character at a time. A typewriter is an example of a character printing device. Different character printers are:

1.2.1.1) Daisy Wheel Printer: These printers use a printwheel font known as a daisy wheel. Each petal of the daisy wheel has a character embossed on it. A motor spins the daisy wheel at a rapid rate. When the desired character spins to the correct position, a print hammer strikes it to produce the output. Speed: 10-50 cps (characters per second). By the mid-1980s daisy wheel technology was rapidly becoming obsolete due to the growing spread of affordable laser and inkjet machines, and daisy wheel machines soon disappeared except for the small remaining typewriter market.

1.2.1.2) Dot-Matrix Printer: These printers print characters as a pattern of dots. The print head comprises a matrix of tiny needles, typically 9 × 7, which hammers out characters in the form of patterns of tiny dots. The print quality of a dot-matrix printer (DMP) is inferior to that of a daisy wheel printer, but DMPs are generally faster. Speed: 40-250 cps.

They are also less expensive and are capable of printing low quality graphics in addition to text. The font size is also not fixed here.
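The idea of forming a character from a pattern of dots can be sketched as follows. The glyph below is a hand-drawn letter "A" on an assumed grid of 9 rows by 7 columns; both the shape and the names GLYPH_A, render and dot_count are hypothetical and are not taken from any real printer's character set.

```python
# "#" marks a position where a needle strikes the ribbon, "." a position left blank.
GLYPH_A = [
    "..###..",
    ".#...#.",
    "#.....#",
    "#.....#",
    "#######",
    "#.....#",
    "#.....#",
    "#.....#",
    "#.....#",
]


def render(glyph):
    """Return the printed appearance of a glyph, one text line per row of dots."""
    return "\n".join(row.replace("#", "o").replace(".", " ") for row in glyph)


def dot_count(glyph):
    """Count how many dots must be struck to print the glyph."""
    return sum(row.count("#") for row in glyph)


print(render(GLYPH_A))
print("dots struck:", dot_count(GLYPH_A))
```

Because a character is just a table of dots, changing the table changes the shape or size of the character, and the same mechanism can print simple graphics.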

1.2.1.3) Ink-Jet Printers: These are non-impact character printers that print characters by spraying small drops of ink onto paper. Droplets of a special ink are electrically charged after leaving a nozzle and are then guided by electrically charged deflection plates. These printers produce high quality output because the characters are formed by dozens of tiny ink dots. They don't make noise and are also capable of printing in different colors.
a) Thermal Inkjet (BubbleJet): These inkjet printers use print cartridges with a series of tiny electrically heated chambers constructed in accordance with the image to be printed. To produce an image, the printer runs a pulse of current through the heating elements, causing a steam explosion in the chamber to form a bubble, which propels a droplet of ink onto the paper (hence Canon's tradename of Bubblejet for its inkjets). The ink's surface tension, as well as the condensation and thus contraction of the vapour bubble, pulls a further charge of ink into the chamber through a narrow channel attached to an ink reservoir.
b) Piezoelectric inkjets: Most commercial and industrial inkjet printers use a piezoelectric material in an ink-filled chamber behind each nozzle instead of a heating element. When a voltage is applied, the piezoelectric material changes shape or size, which generates a pressure pulse in the fluid, forcing a droplet of ink from the nozzle. This is essentially the same mechanism as the thermal inkjet but generates the pressure pulse using a different physical principle. Piezoelectric inkjet allows a wider variety of inks than thermal or continuous inkjet, but the print heads are more expensive.

1.2.2) Line printers
These are impact printers used with computers that have a high volume of printing. Their printing speeds are such that, to an observer, they appear to print a line at a time. Drum printers and chain printers are the two most commonly used line printers. Speed: 300-2500 lines per minute.

1.2.2.1) Drum Printers
A drum printer consists of a solid cylindrical drum that has raised characters in bands on its surface. There are as many bands as there are printing positions. The drum rotates at a rapid speed. A hammer is located behind the paper. The hammer strikes the paper, along with the inked ribbon, against the drum, which prints the character. One revolution of the drum is required to print an entire line. This means that the characters in a line are not all printed at the same instant, but the printing is so fast that the printer appears to print the entire line at once.

1.2.2.2) Chain Printers
These use a rapidly moving chain called the print chain. Each link of the chain is a character font. A print hammer located behind the chain strikes it, pressing the chain against the paper with the ribbon in between. The chain rotates, and when the appropriate character appears in front of the hammer, the hammer strikes it. To increase the speed of the printer, the character patterns are repeated several times along the chain so that it is not necessary to wait for the chain to make a complete revolution.


1.2.3) Page printers – LASER Printers
These are very high speed non-impact printers which can produce documents at speeds of over 20000 lines per minute. A familiar machine based on the same technique is the photocopier: these printers use the electro-photographic techniques that were used in the design of paper copier machines. A laser beam projects an image of the page to be printed onto an electrically charged rotating drum coated with selenium. Photoconductivity removes charge from the areas exposed to light. Dry ink (toner) particles are then electrostatically picked up by the drum's charged areas. The drum then prints the image onto paper by direct contact and heat, which fuses the ink to the paper. The laser printer was invented at Xerox in 1969 by researcher Gary Starkweather. The steps in printing are:

1) Raster image processing (generating the raster image data): Each horizontal strip of dots across the page is known as a raster line or scan line. Creating the image to be printed is done by a Raster Image Processor (RIP), typically built into the laser printer. The RIP uses the page description language (Adobe PostScript or Printer Command Language, PCL) to generate a bitmap of the final page in the raster memory. Once the entire page has been rendered in raster memory, the printer is ready to begin sending the rasterized stream of dots to the paper in a continuous stream.

2) Charging: A corona wire (in older printers) or a primary charge roller projects an electrostatic charge onto the photoreceptor (otherwise named the photoconductor unit), a revolving photosensitive drum or belt, which is capable of holding an electrostatic charge on its surface while it is in the dark.

3) Exposing: The laser is aimed at a rotating polygonal mirror, which directs the laser beam through a system of lenses and mirrors onto the photoreceptor. The beam sweeps across the photoreceptor at an angle to make the sweep straight across the page; the cylinder continues to rotate during the sweep and the angle of sweep compensates for this motion. The stream of rasterized data held in memory turns the laser on and off to form the dots on the cylinder. Lasers are used because they generate a narrow beam over great distances. The laser beam neutralizes (or reverses) the charge on the white parts of the image, leaving a static electric negative image on the photoreceptor surface to lift the toner particles.


4) Developing: The surface with the latent image is exposed to toner, fine particles of dry plastic powder mixed with carbon black or coloring agents. The toner particles are given a negative charge, and are electrostatically attracted to the photoreceptor where the laser wrote the latent image. Because like charges repel, the negatively charged toner will not touch the drum where light has not removed the negative charge. The overall darkness of the printed image is controlled by the high voltage charge applied to the supply toner. Once the charged toner has jumped the gap to the surface of the drum, the negative charge on the toner itself repels the supply toner and prevents more toner from jumping to the drum. If the voltage is low, only a thin coat of toner is needed to stop more toner from transferring. If the voltage is high, then a thin coating on the drum is too weak to stop more toner from transferring to the drum. More supply toner will continue to jump to the drum until the charges on the drum are again high enough to repel the supply toner. At the darkest settings the supply toner voltage is high enough that it will also start coating the drum where the initial unwritten drum charge is still present, and will give the entire page a dark shadow.

5) Fusing (Melting toner into the paper using heat and pressure): The paper passes through rollers in the fuser assembly where heat and pressure (up to 200 Celsius) bond the plastic powder to the paper. One roller is usually a hollow tube (heat roller) and the other is a rubber backing roller (pressure roller). A radiant heat lamp is suspended in the center of the hollow tube, and its infrared energy uniformly heats the roller from the inside. For proper bonding of the toner, the fuser roller must be uniformly hot. The fuser accounts for up to 90% of a printer's power usage. The heat from the fuser assembly can damage other parts of the printer, so it is often ventilated by fans to move the heat away from the interior. The primary power saving feature of most copiers and laser printers is to turn off the fuser and let it cool. Resuming normal operation requires waiting for the fuser to return to operating temperature before printing can begin. If paper moves through the fuser more slowly, there is more roller contact time for the toner to melt, and the fuser can operate at a lower temperature. Smaller, inexpensive laser printers typically print slowly, due to this energy- saving design, compared to large high speed printers where paper moves more rapidly through a high-temperature fuser with a very short contact time.


[Figure: fuser assembly, showing the hollow roller, the radiant heat lamp and the backing roller]

6) Cleaning: When the print is complete, an electrically neutral soft plastic blade cleans any excess toner from the photoreceptor and deposits it into a waste reservoir, and a discharge lamp removes the remaining charge from the photoreceptor.

1.3) Plotters
A plotter produces hard copy of graphs and designs. There are two types of plotters: drum and flat-bed. In a drum plotter, the paper on which the output is to appear is placed over a drum that rotates back and forth to produce vertical motion. The mechanism also consists of one or more pen holders mounted horizontally over the drum. Multi-colored pens can produce output in more than one color. A flat-bed plotter plots on paper that is spread and fixed over a rectangular flat-bed table. In this type, the paper normally doesn't move and the pen-holding mechanism is designed to provide all the motion. More than one pen can be inserted to get output in different colors. The plot size is restricted by the area of the bed. Plotters are very slow in printing.

2) Input devices
Various input devices are:

2.1) Keyboard
The keyboard is an online device used for entering data directly into a computer. Data is entered by pressing a set of available keys. A few points worth mentioning are:
a) It is a character input device.
b) No intermediate storage is required, as the keyboard enters data directly into the computer.
c) It allows the computer to be used in interactive mode, where data can be entered while the program is still running.
d) Keyboards are economical for handling small and irregular volumes of data.

2.2) Mouse
A mouse is a pointing device. It is used in a graphical user environment where the user can point to various menus and icons. There are two or three buttons on the mouse, which are used to perform various functions like selection and displaying options.

2.3) Scanners
Scanners are input devices that are capable of recognizing marks or characters. They are used for direct data entry into the computer, and for scanning photographs, printouts or hand-made drawings. Optical character recognition (OCR) and optical mark recognition (OMR) are the main applications of scanners.

2.3.1) OCR
An OCR attempts to detect the characters and numbers written on paper. This eliminates the need to type the text again.


2.3.2) OMR
These scanners are capable of recognizing a pre-specified type of mark made by pencil or pen. OMR is generally used in the evaluation of multiple choice examinations, where a student indicates the answer to a question by marking a box. The scanner checks for the marks and matches them against a predefined pattern.

2.3.3) Bar Codes
Data coded in the form of light and dark lines or bars are known as bar codes. They are used by the retail trade for labeling goods. A bar code reader is a laser beam scanner that emits a laser beam onto a bar code and reads the reflections. These are mapped to a code and the data is interpreted. Different codes are used as bar codes, but the most common one is UPC (Universal Product Code). The bars are decoded as a ten digit number. The first five digits identify the manufacturer and the last five digits identify the specific product of that manufacturer.

2.4) Magnetic Ink Character Reader (MICR)
These were developed to assist the banking industry in processing the tremendous volume of cheques being written every day. The cheque information is written with a special magnetic ink in human readable form. This way the information is both machine readable and human readable. The major MICR fonts used around the world are E-13B and CMC-7. Almost all US, Canadian, UK and Indian cheques (checks) now include MICR characters at the bottom of the paper in the E-13B font. Some countries, including France, use CMC-7. The characters are first magnetized in the plane of the paper with a north pole on the right of each MICR character. Then they are usually read with a MICR read head, a device similar in nature to the playback head in an audio tape recorder, and the letterforms' bulbous shapes ensure that each letter produces a unique waveform, so that the character recognition system gives a reliable result.
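As a small illustration of the UPC layout described under 2.3.3 (five manufacturer digits followed by five product digits), the decoded number can be split as shown below. The function name split_upc and the sample number are hypothetical; the sketch covers only the ten-digit split described in these notes, not the full symbol printed on real products.

```python
def split_upc(digits: str) -> tuple[str, str]:
    """Split a decoded ten-digit code into (manufacturer, product) parts."""
    if len(digits) != 10 or not (digits.isascii() and digits.isdigit()):
        raise ValueError("expected exactly ten decimal digits")
    # First five digits: manufacturer; last five digits: product.
    return digits[:5], digits[5:]


# Hypothetical sample value, not a real product code.
manufacturer, product = split_upc("0123456789")
print("manufacturer:", manufacturer)
print("product:", product)
```

Keeping the parts as strings preserves leading zeros, which would be lost if the digits were converted to integers.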


Storage and representation of numbers
In mathematics, negative numbers in any base are represented in the usual way, by prefixing them with a "−" sign. On a computer, however, there are various ways of representing a number's sign. Generally there are four methods of extending the binary numeral system to represent signed numbers: sign-and-magnitude, ones' complement, two's complement and excess-N.

sign-and-magnitude
One may first approach the problem of representing a number's sign by allocating one sign bit to represent the sign: set that bit (often the most significant bit) to 0 for a positive number, and to 1 for a negative number. The remaining bits in the number indicate the magnitude (or absolute value). Hence in a byte with only 7 bits (apart from the sign bit), the magnitude can range from 0000000 (0) to 1111111 (127). Thus you can represent numbers from $-127_{10}$ to $+127_{10}$ once you add the sign bit (the eighth bit). A consequence of this representation is that there are two ways to represent zero, 00000000 (0) and 10000000 (−0). Decimal −43 encoded in an eight-bit byte this way is 10101011. This approach is directly comparable to the common way of showing a sign (placing a "+" or "−" next to the number's magnitude). Some early binary computers (e.g. IBM 7090) used this representation.

ones' complement
A system known as ones' complement can be used to represent negative numbers. The ones' complement form of a negative binary number is the bitwise NOT applied to its positive counterpart. Like sign-and-magnitude representation, ones' complement has two representations of 0: 00000000 (+0) and 11111111 (−0). As an example, the ones' complement form of 00101011 (43) becomes 11010100 (−43). The range of signed numbers using ones' complement in a conventional eight-bit byte is $-127_{10}$ to $+127_{10}$. To add two numbers represented in this system, one does a conventional binary addition, but it is then necessary to add any resulting carry back into the resulting sum.
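The two encodings above, and the carry rule for ones' complement addition, can be sketched in a few lines of Python. This is only an illustration under the assumption of an 8-bit byte; the function names are made up for this example.

```python
def sign_magnitude(value, bits=8):
    """Return the sign-and-magnitude bit string of value."""
    magnitude = abs(value)
    if magnitude >= 1 << (bits - 1):
        raise ValueError("magnitude does not fit in the available bits")
    sign = "1" if value < 0 else "0"          # sign bit: 1 means negative
    return sign + format(magnitude, f"0{bits - 1}b")


def ones_complement(value, bits=8):
    """Return the ones' complement bit string of value."""
    if abs(value) >= 1 << (bits - 1):
        raise ValueError("value does not fit in the available bits")
    mask = (1 << bits) - 1
    # A negative number is the bitwise NOT of its positive counterpart.
    pattern = value if value >= 0 else ~(-value) & mask
    return format(pattern, f"0{bits}b")


def ones_complement_add(a, b, bits=8):
    """Add two ones' complement bit patterns, adding the carry back in."""
    mask = (1 << bits) - 1
    total = a + b
    if total > mask:                          # a carry left the top bit
        total = (total & mask) + 1            # end-around carry
    return total & mask


print(sign_magnitude(-43))                    # 10101011
print(ones_complement(-43))                   # 11010100
print(format(ones_complement_add(0b11111110, 0b00000010), "08b"))  # 00000001
```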
To see why this is necessary, consider the addition of −1 (11111110) to +2 (00000010):

  binary      decimal
  11111110      -1
+ 00000010      +2
----------
1 00000000       0   <-- not the correct answer
         1      +1   <-- add carry
----------
  00000001       1   <-- correct answer

two's complement
The problems of multiple representations of 0 and the need for the end-around carry are circumvented by a system called two's complement. In two's complement, negative numbers are represented by the bit pattern which is one greater (in an unsigned sense) than the ones' complement of the positive value. In two's complement, there is only one zero (00000000). Negating a number (whether negative or positive) is done by inverting all the bits and then adding 1 to that result. Addition of a pair of two's-complement integers is the same as addition of a pair of unsigned numbers (except for detection of overflow, if that is done). For instance, a two's-complement addition of 127 and −128 gives the


same binary bit pattern as an unsigned addition of 127 and 128: 01111111 + 10000000 = 11111111, which reads as −1 in two's complement and as 255 unsigned.

An easier method to get the two's complement of a number is as follows:

                                                      Example 1   Example 2
1. Starting from the right, find the first '1'        0101001     0101100
2. Invert all of the bits to the left of that one     1010111     1010100

EXCESS-N
Excess-N, also called biased representation, uses a pre-specified number N as a biasing value. A value is represented by the unsigned number which is N greater than the intended value. Thus 0 is represented by N, and −N is represented by the all-zeros bit pattern. This representation is now primarily used within floating-point numbers. The IEEE floating-point standard defines the exponent field of a single-precision (32-bit) number as an 8-bit Excess-127 field. The double-precision (64-bit) exponent field is an 11-bit Excess-1023 field. An easy way to convert a signed integer to Excess-127 is to add 127 to the desired number. For example, −37 + 127 = 90, so −37 is written the same way as unsigned 90, i.e. 01011010. (Likewise, 90 is written as 90 + 127 = 217.)

MODERN COMPUTERS TYPICALLY USE THE TWO'S-COMPLEMENT FOR INTEGER REPRESENTATION
The two's complement of a binary number is defined as the value obtained by subtracting the number from a large power of two (specifically, from $2^N$ for an N-bit two's complement). A two's-complement system or two's-complement arithmetic is a system in which negative numbers are represented by the two's complement of the absolute value. This system is the most common method of representing signed integers on computers. In such a system, a number is negated (converted from positive to negative or vice versa) by computing its two's complement. An N-bit two's-complement numeral system can represent every integer in the range $-2^{N-1}$ to $+2^{N-1}-1$. The two's-complement system has the advantage of not requiring that the addition and subtraction circuitry examine the signs of the operands to determine whether to add or subtract.
This property makes the system both simpler to implement and capable of easily handling higher precision arithmetic. Also, zero has only a single representation, removing the problems associated with negative zero, which exists in ones'-complement systems.
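As a rough illustration of the excess-N idea and of the two's-complement range formula, the following Python sketch may help. The helper names are invented for this example, and the 8-bit Excess-127 default simply mirrors the exponent field mentioned in the text.

```python
def to_excess(value, bias=127, bits=8):
    """Store value as the unsigned number (value + bias)."""
    stored = value + bias
    if not 0 <= stored < (1 << bits):
        raise ValueError("value is outside the representable range")
    return format(stored, f"0{bits}b")


def from_excess(pattern, bias=127):
    """Recover the intended value from an excess-N bit string."""
    return int(pattern, 2) - bias


def twos_complement_range(bits):
    """Smallest and largest integers of an N-bit two's-complement system."""
    return -(1 << (bits - 1)), (1 << (bits - 1)) - 1


print(to_excess(-37))             # 01011010, the same pattern as unsigned 90
print(to_excess(0))               # 01111111, zero is stored as the bias itself
print(from_excess("01011010"))    # -37
print(twos_complement_range(8))   # (-128, 127)
```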

Calculating two's complement
In two's complement notation, a positive number is represented by its ordinary binary representation, using enough bits that the high bit, the sign bit, is 0. The two's complement operation is the negation operation, so negative numbers are represented by the two's complement of the representation of the absolute value. In finding the two's complement of a binary number, the bits are inverted, or "flipped", by using the bitwise NOT operation; the value of 1 is then added to the resulting value. Bit overflow is ignored, which is the normal case with zero. For example, beginning with the signed 8-bit binary representation of the decimal value 5, using subscripts to indicate the base of a representation needed to interpret its value:

$00000101_2 = 5_{10}$
The most significant bit is 0, so the pattern represents a non-negative value. To convert to −5 in two's-complement notation, the bits are inverted; 0 becomes 1, and 1 becomes 0:


11111010
At this point, the numeral is the ones' complement of the decimal value 5. To obtain the two's complement, 1 is added to the result, giving:

$11111011_2 = -5_{10}$
The result is a signed binary number representing the decimal value −5 in two's-complement form. The most significant bit is 1, so the value represented is negative. The two's complement of a negative number is the corresponding positive value. For example, inverting the bits of −5 (above) gives:
00000100
And adding one gives the final value:

$00000101_2 = 5_{10}$
The value of a two's-complement binary number can be calculated by adding up the power-of-two weights of the "one" bits, but with a negative weight for the most significant (sign) bit; for example:
$11111011_2 = -128 + 64 + 32 + 16 + 8 + 0 + 2 + 1 = (-2^7 + 2^6 + \dots) = -5$
Note that the two's complement of zero is zero: inverting gives all ones, and adding one changes the ones back to zeros (the overflow is ignored). Also, the two's complement of the most negative number representable (a one as the sign bit and all other bits zero) is itself. Hence, there appears to be an 'extra' negative number. A more formal definition of a two's-complement negative number (denoted by $N^*$ in this example) is derived from the equation $N^* = 2^n - N$, where N is the corresponding positive number and n is the number of bits in the representation. For example, to find the 4-bit representation of −5:

$N = 5_{10}$, therefore $N = 0101_2$ and $n = 4$.
Hence:
$N^* = 2^n - N = 2^4 - 5_{10} = 10000_2 - 0101_2 = 1011_2$
The calculation can be done entirely in base 10, converting to base 2 at the end:
$N^* = 2^n - N = 2^4 - 5 = 11_{10} = 1011_2$
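The worked examples above can be checked with a short Python sketch. It assumes bit patterns are held as non-negative Python integers; the function names are chosen only for this illustration.

```python
def negate(pattern, bits=8):
    """Two's complement: invert every bit, add 1, ignore the overflow."""
    mask = (1 << bits) - 1
    return ((pattern ^ mask) + 1) & mask


def negate_by_subtraction(pattern, bits=8):
    """The formal definition N* = 2^n - N, kept to n bits."""
    return ((1 << bits) - pattern) & ((1 << bits) - 1)


def signed_value(pattern, bits=8):
    """Sum the bit weights, giving the sign bit a negative weight."""
    sign_weight = 1 << (bits - 1)
    return (pattern & (sign_weight - 1)) - (pattern & sign_weight)


print(format(negate(0b00000101), "08b"))                        # 11111011
print(format(negate(0b11111011), "08b"))                        # 00000101
print(format(negate_by_subtraction(0b0101, bits=4), "04b"))     # 1011
print(signed_value(0b11111011))                                 # -5
```

Both negation routes give the same pattern for every input, which is why the "invert and add one" recipe and the $2^n - N$ definition can be used interchangeably.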

Alternative conversion process
A shortcut to manually convert a binary number into its two's complement is to start at the least significant bit (LSB), and copy all the zeros (working from LSB toward the most significant bit) until the first 1 is reached; then copy that 1, and flip all the remaining bits. This shortcut allows a person to convert a number to its two's complement without first forming its ones' complement. For example, the two's complement of "0011 1100" is "1100 0100", where the last three digits (100) are unchanged by the copying operation. In computer circuitry, this method is no faster than the "complement and add one" method; both methods require working sequentially from right to left, propagating logic changes. The method of complementing and adding one can be sped up by a standard carry look-ahead adder circuit; the alternative method can be sped up by a similar logic transformation.
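The shortcut can be written directly on bit strings. The sketch below is one possible rendering in Python (the function name is made up, and the strings are written without the space used in the example above).

```python
def twos_complement_shortcut(bit_string):
    """Copy bits from the right up to and including the first 1, flip the rest."""
    first_one = bit_string.rfind("1")         # position of the rightmost 1
    if first_one == -1:                       # all zeros: zero negates to zero
        return bit_string
    flipped = "".join("1" if b == "0" else "0" for b in bit_string[:first_one])
    return flipped + bit_string[first_one:]


print(twos_complement_shortcut("00111100"))   # 11000100
print(twos_complement_shortcut("0101001"))    # 1010111
print(twos_complement_shortcut("0101100"))    # 1010100
```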


INFORMATION STORAGE CODES

ASCII Like other character representation computer codes, ASCII specifies a correspondence between digital bit patterns and the symbols/glyphs of a written language, thus allowing digital devices to communicate with each other and to process, store, and communicate character-oriented information. The ASCII [1] — or a compatible extension (see below) — is used on nearly all common computers, especially personal computers and workstations. The preferred MIME name for this encoding is "US-ASCII".[2] ASCII is, strictly, a seven-bit code, meaning that it uses the bit patterns representable with seven binary digits (a range of 0 to 127 decimal) to represent character information. At the time ASCII was introduced, many computers dealt with eight-bit groups (bytes or, more specifically, octets) as the smallest unit of information; the eighth bit was commonly used as a parity bit for error checking on communication lines or other device-specific functions. Machines which did not use parity typically set the eighth bit to zero,[3] though some systems such as Prime machines running PRIMOS set the eighth bit of ASCII characters to one. ASCII only defines a relationship between specific characters and bit sequences; aside from reserving a few control codes for line-oriented formatting, it does not define any mechanism for describing the structure or appearance of text within a document. Such concepts are within the realm of other systems such as the markup languages. History ASCII developed from telegraphic codes and first entered commercial use as a seven-bit code promoted by Bell data services in 1963. The Bell System had previously planned to use a six-bit code, derived from Fieldata, that added punctuation and lower-case letters to the earlier five-bit Baudot teleprinter code, but was persuaded instead to join the ASA subcommittee that had started to develop ASCII. 
Baudot helped in the automation of sending and receiving telegraphic messages, and took many features from Morse code; however, unlike Morse code, Baudot used constant-length codes. Compared to earlier telegraph codes, the proposed Bell code and ASCII both underwent re-ordering for more convenient sorting (especially alphabetization) of lists, and added features for devices other than teleprinters. Bob Bemer introduced features such as the 'escape sequence'. His British colleague Hugh McGregor Ross helped to popularize this work, as Bemer said, "so much so that the code that was to become ASCII was first called the Bemer-Ross Code in Europe". The American Standards Association (ASA, later to become ANSI) first published ASCII as a standard in 1963. ASCII-1963 lacked the lowercase letters, and had an up-arrow (↑) instead of the caret (^) and a left-arrow (←) instead of the underscore (_). The 1967 version added the lowercase letters, changed the names of a few control characters and moved the two controls ACK and ESC from the lowercase letters area into the control codes area. ASCII has also become embedded in its probable replacement, Unicode, as the 'lowest' 128 characters. In terms of mere adoption, ASCII is one of the most successful software standards ever.

ASCII printable characters
Code 32, the "space" character, denotes the space between words, as produced by the large space-bar of a keyboard. Codes 33 to 126, known as the printable characters, represent letters, digits, punctuation marks, and a few miscellaneous symbols.

Binary      Dec  Hex  Glyph
0010 0000   32   20   (blank)
0010 0001   33   21   !
0010 0010   34   22   "
0010 0011   35   23   #
0010 0100   36   24   $
0010 0101   37   25   %
0010 0110   38   26   &


Binary      Dec  Hex  Glyph
0100 0001   65   41   A
0100 0010   66   42   B
0100 0011   67   43   C
0100 0100   68   44   D
0100 0101   69   45   E
0100 0110   70   46   F
0100 0111   71   47   G
0100 1000   72   48   H
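Rows like these can be regenerated with Python's built-in ord(), which returns a character's code. The sketch below is only an illustration; ascii_row is a made-up helper name.

```python
def ascii_row(char):
    """Format one table row: binary, decimal, hexadecimal, glyph."""
    code = ord(char)
    if code > 127:
        raise ValueError("not a 7-bit ASCII character")
    bits = format(code, "08b")                # eighth (top) bit is zero
    return f"{bits[:4]} {bits[4:]}  {code:>3}  {code:02X}  {char}"


for ch in "ABCDEFGH":
    print(ascii_row(ch))
```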

Binary-coded decimal

In computing and electronic systems, binary-coded decimal (BCD) is an encoding for decimal numbers in which each digit is represented by its own binary sequence. Its main virtue is that it allows easier conversion to decimal digits for printing or display. Its drawbacks are the complexity of the circuits needed to implement mathematical functions and the space wasted in the encoding: 6 of the 16 four-bit patterns go unused for every digit. Even though the importance of BCD has diminished, it is still encountered. In BCD, a digit is usually represented by four bits which, in general, represent the digits 0-9. Other combinations are sometimes used for a sign or other indications.

Technical details
To BCD-encode a decimal number using the common encoding, each digit is encoded using its four-bit binary pattern. For example, the number 127 would be:
0001 0010 0111
Since most computers store data in eight-bit bytes, there are two common ways of storing four-bit BCD digits in those bytes:
1. Each digit is stored in one byte, and the other four bits are set to all zeros, all ones (as in the EBCDIC code), or 0011 (as in the ASCII code).
2. Two digits are stored in each byte.
Unlike pure binary encodings, large numbers can easily be displayed by splitting up the nibbles and sending each to a different display character, with the logic for each display being a simple mapping function. Converting from pure binary to decimal for display is much harder, involving integer multiplication or division. The BIOS in many PCs keeps the date and time in BCD format, probably for historical reasons (it avoided the need for binary to ASCII conversion).

The following table gives the 4-bit BCD equivalents of the decimal digits:

Decimal Digit   BCD Equivalent
0               0000
1               0001
2               0010
3               0011
4               0100


5               0101
6               0110
7               0111
8               1000
9               1001
Not used        1010
Not used        1011
Not used        1100
Not used        1101
Not used        1110
Not used        1111
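A minimal Python sketch of the encoding in the table, covering the plain digit-by-digit form and the "two digits per byte" storage option. It assumes a non-negative integer as input, and the function names are hypothetical.

```python
def to_bcd(number):
    """Encode each decimal digit as its own 4-bit group."""
    return " ".join(format(int(digit), "04b") for digit in str(number))


def pack_bcd(number):
    """Store two BCD digits per byte, padding with a leading zero digit."""
    digits = str(number)
    if len(digits) % 2:
        digits = "0" + digits
    return bytes(
        int(digits[i]) << 4 | int(digits[i + 1])
        for i in range(0, len(digits), 2)
    )


print(to_bcd(127))               # 0001 0010 0111
print(pack_bcd(127).hex())       # 0127
print(chr(0b0011_0000 | 7))      # 7  (zone 0011 + BCD digit gives the ASCII digit)
```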

The patterns 1010 (decimal 10) to 1111 (decimal 15) are not used, because the values 10 to 15 are written in BCD with two digits (for example, 10 is 0001 0000). A 4-bit BCD code can therefore represent only decimal digits, not characters. To solve this problem, a 6-bit code is used instead of the 4-bit code, with two zone bits placed in front of the four digit bits:

Char  Zone Digit   Char  Zone Digit   Char  Zone Digit   Char  Zone Digit
A     11   0001    J     10   0001    S     01   0010    1     00   0001
...                ...                ...                ...
I     11   1001    R     10   1001    Z     01   1001    0     00   1010
OCTAL 61-71        OCTAL 41-51        OCTAL 22-31        OCTAL 01-12

The English word "BASE" is represented as:
B: 110010
A: 110001
S: 010010
E: 110101
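The 6-bit code can be rebuilt from the zone/digit pattern and the octal ranges given above. The Python sketch below does this under the assumption that the rows elided by "..." continue in sequence; the names SIX_BIT and encode_six_bit are made up for this example.

```python
SIX_BIT = {}

# Zones 11 and 10 hold A-I and J-R with digit values 1 to 9.
for zone, letters in (("11", "ABCDEFGHI"), ("10", "JKLMNOPQR")):
    for digit, letter in enumerate(letters, start=1):
        SIX_BIT[letter] = zone + format(digit, "04b")

# Zone 01 holds S-Z with digit values 2 to 9 (octal 22-31).
for digit, letter in enumerate("STUVWXYZ", start=2):
    SIX_BIT[letter] = "01" + format(digit, "04b")

# Zone 00 holds the numerals; 0 is stored with digit bits 1010 (octal 12).
for numeral in "123456789":
    SIX_BIT[numeral] = "00" + format(int(numeral), "04b")
SIX_BIT["0"] = "001010"


def encode_six_bit(text):
    """Return the 6-bit BCD code of every character in text."""
    return [SIX_BIT[ch] for ch in text]


print(encode_six_bit("BASE"))    # ['110010', '110001', '010010', '110101']
```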

EBCDIC

EBCDIC (Extended Binary Coded Decimal Interchange Code) is an 8-bit character encoding used on IBM mainframe operating systems, such as z/OS, OS/390, VM and VSE, as well as IBM minicomputer operating systems such as OS/400 and i5/OS. It is also employed on various non-IBM platforms such as Fujitsu-Siemens' BS2000/OSD, HP MPE/iX, and Unisys MCP. It descended from punched cards and the corresponding six-bit binary-coded decimal code that most of IBM's computer peripherals of the late 1950s and early 1960s used.

History

EBCDIC was devised in 1963 and 1964 by IBM and was announced with the release of the IBM System/360 line of mainframe computers. It was created to extend the Binary-Coded Decimal that existed at the time. EBCDIC was developed separately from ASCII. EBCDIC is an 8-bit encoding, versus the 7-bit encoding of ASCII.

Technical details

EBCDIC code pages and ASCII-based code pages are incompatible with each other. Since computers only understand numbers, these codepages assign a character to these numbers. The same byte values are


interpreted as different characters depending on the codepage used. Data stored in EBCDIC require a code page conversion before the text can be viewed on ASCII based machines, like a personal computer.

A single EBCDIC byte occupies eight bits, which are divided into two halves or nibbles. The first four bits are called the zone and represent the category of the character, whereas the last four bits are called the digit and identify the specific character.

The first 64 code points (00-3F) are control characters. 33 of these codes have ASCII equivalents. One notable difference between the two sets is that ASCII has carriage return (CR) and linefeed (LF) codes, which are generally used as end-of-line indicators within ASCII text files, whereas EBCDIC has an additional newline (NL) code. The other 31 control codes are used for various terminal and device controls, mostly specific to IBM hardware.

The additional two bits (compared with the 6 bits of BCD) are used as ZONE bits, extending the zone to 4 bits. In this configuration we are able to address $2^8 = 256$ characters, instead of 64 in BCD. Further, an 8-bit code can easily be split into two 4-bit groups, each of which is easily represented by one hexadecimal digit.

EBCDIC CODE

Char  Zone Digit   Char  Zone Digit   Char  Zone Digit   Char  Zone Digit
A     1100 0001    J     1101 0001    S     1110 0010    0     1111 0000
...                ...                ...                ...
I     1100 1001    R     1101 1001    Z     1110 1001    9     1111 1001
HEX C1-C9          HEX D1-D9          HEX E2-E9          HEX F0-F9
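Python ships with codecs for several EBCDIC code pages, so the zone and digit nibbles can be inspected directly. The sketch below assumes code page 037 (`cp037`), which is only one of many EBCDIC variants; zone_digit is a made-up helper name.

```python
def zone_digit(char, codec="cp037"):
    """Split the EBCDIC byte of char into its zone and digit nibbles."""
    byte = char.encode(codec)[0]
    return format(byte >> 4, "04b"), format(byte & 0x0F, "04b")


for ch in "AIJRSZ09":
    zone, digit = zone_digit(ch)
    hex_code = ch.encode("cp037").hex().upper()
    print(ch, zone, digit, "HEX", hex_code)
```

In this code page the letters S to Z occupy E2 to E9 and the numerals 0 to 9 occupy F0 to F9.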

For a numeric value, a sign indicator code is used to represent the sign of the number: C for a plus sign, D for a minus sign and F for an unsigned number. WHENEVER A NUMBER IS REPRESENTED IN EBCDIC, THE SIGN INDICATOR IS USED IN THE ZONE POSITION OF THE RIGHTMOST DIGIT. So only the zone bits of the rightmost digit change; all other digits always have zone F (see table above). This makes it possible to pack decimal numbers in EBCDIC: instead of writing the zone bits for every digit in a number, we can write just the digits (without the F) followed by the sign indicator. Examples are:

Numeric   EBCDIC    Sign indicator                          Packed format
345       F3F4F5    F for unsigned (no sign is mentioned)   345F
+345      F3F4C5    C for positive                          345C
-345      F3F4D5    D for negative                          345D
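The zoned and packed forms in the examples can be produced with a few lines of Python. This sketch works on hexadecimal strings rather than real bytes, follows the notes' "digits then sign" packed form, and uses invented function names. Real packed-decimal fields occupy whole bytes, so a number with an even digit count would also need a leading zero, which is not handled here.

```python
SIGN = {"+": "C", "-": "D", "": "F"}   # sign indicators from the text


def split_sign(text):
    """Separate an optional leading sign from the digits."""
    sign = text[0] if text[0] in "+-" else ""
    return sign, text[len(sign):]


def zoned(text):
    """Zone F on every digit except the rightmost, which carries the sign."""
    sign, digits = split_sign(text)
    zones = ["F"] * (len(digits) - 1) + [SIGN[sign]]
    return "".join(zone + digit for zone, digit in zip(zones, digits))


def packed(text):
    """Digits without their zones, followed by the sign indicator."""
    sign, digits = split_sign(text)
    return digits + SIGN[sign]


for number in ("345", "+345", "-345"):
    print(number, zoned(number), packed(number))
```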

Character representations in EBCDIC are a direct mapping from the table above.

UNICODE
Unicode is a computing industry standard for the consistent encoding, representation and handling of text expressed in most of the world's writing systems. Developed in conjunction with the Universal Character Set standard and published in book form as The Unicode Standard, the latest version of Unicode consists of a repertoire of more than 109,000 characters covering 93 scripts, a set of code charts for visual reference, an encoding methodology and set of standard character encodings, an enumeration of character properties such as upper and lower case, a set of reference data computer files, and a number of related items, such as character properties, rules for normalization, decomposition, collation, rendering, and bidirectional display order (for the correct display of text containing both right-to-left scripts, such as Arabic and Hebrew, and left-to-right scripts). As of 2011, the most recent major revision of Unicode is Unicode 6.0.


The existing character encoding schemes (ASCII etc.) are being replaced by Unicode and its standard Unicode Transformation Format (UTF) schemes, as many of the existing schemes are limited in size and scope and are incompatible with multilingual environments.

Unicode can be implemented by different character encodings. The most commonly used encodings are UTF-8 (which uses one byte for any ASCII character, with the same code value in both UTF-8 and ASCII, and up to four bytes for other characters), the now-obsolete UCS-2 (which uses two bytes for each character but cannot encode every character in the current Unicode standard), and UTF-16 (which extends UCS-2 to handle code points beyond the scope of UCS-2).
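The byte counts described above can be observed with Python's built-in string encoders. The sample characters below are arbitrary choices for illustration, and encoded_lengths is a made-up helper name.

```python
def encoded_lengths(char):
    """Return the number of bytes char needs in UTF-8 and in UTF-16."""
    return len(char.encode("utf-8")), len(char.encode("utf-16-be"))


samples = {
    "A": "A",                         # an ASCII letter
    "e-acute": "\u00e9",              # Latin letter outside ASCII
    "euro sign": "\u20ac",            # still within the range UCS-2 can hold
    "emoji": "\U0001F600",            # beyond the range UCS-2 can hold
}

for name, char in samples.items():
    utf8, utf16 = encoded_lengths(char)
    print(f"{name}: UTF-8 {utf8} byte(s), UTF-16 {utf16} byte(s)")

# An ASCII character has the same single-byte value in UTF-8 and ASCII.
print("A".encode("utf-8") == "A".encode("ascii"))   # True
```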


Programming Languages
A language is a system of communication. It is composed of symbols (a character set) and rules for using those symbols (a grammar).

Any notation for the description of algorithms and data structures may be termed a programming language. In natural languages, one can use the language incorrectly and still have a good chance of being understood. In computer programming languages, by contrast, the syntax must be exactly correct.

Machine Language
This language is directly understood by the computer. No translation is required for a program written in machine language to work. It consists of strings of 0's and 1's. A statement written in machine language consists of two parts:

• OPCODE: the Operation Code, i.e. the command which is to be executed.

• OPERAND: the data which is sent as the parameter with the opcode. The computer executes the opcode with/on the operand as data.

The major advantage of machine language is that it is the native language to the computers. Since no translation is required, the code will run exactly as it is written. The program size is small. The machine language code executes the fastest of all codes written in any programming languages.

Disadvantages of machine language are:
1. It is highly dependent on the machine it is written for. This dependency is such that a program written for one machine will not work on machines with different configurations.
2. It is difficult to program, since the language is strings of 0's and 1's.
3. It is more error prone because the program is not readable.
4. It is difficult to modify.
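To make the opcode/operand split concrete, the Python sketch below decodes an instruction of a purely hypothetical 8-bit machine whose upper four bits are the opcode and lower four bits the operand. Real machines use their own, usually wider, instruction formats.

```python
OPERAND_BITS = 4   # assumption: 4-bit opcode followed by a 4-bit operand


def split_instruction(word):
    """Return (opcode, operand) of one 8-bit machine instruction."""
    opcode = word >> OPERAND_BITS
    operand = word & ((1 << OPERAND_BITS) - 1)
    return opcode, operand


opcode, operand = split_instruction(0b00010101)
print(format(opcode, "04b"), format(operand, "04b"))   # 0001 0101
```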

Assembly language
In this language, letter symbols are substituted for the opcodes of machine language. A language which substitutes letters and symbols for the 0's and 1's of machine language is called an assembly language. Software called an assembler is required to translate code written in assembly language into machine language.

Advantages of assembly language are:
1. It is easy to understand and use as compared to machine language.
2. It is easy to locate and correct errors.
3. It is easy to modify.
4. There is no need to know the addresses, as symbolic references are used.
5. It maintains the efficiency of machine language, since there is a one-to-one correspondence between assembly language statements and machine language instructions.


Limitations of assembly language are:
1. It is machine dependent.
2. Knowledge of hardware is required to write programs.
3. The coding is still at machine level, with primitive operations as in machine language.
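The one-to-one correspondence between assembly statements and machine instructions can be illustrated with a toy assembler in Python. The mnemonics, opcode values and 8-bit instruction layout below are all hypothetical and do not belong to any real processor.

```python
# Hypothetical instruction set: 4-bit opcode followed by a 4-bit operand.
OPCODES = {"LOAD": 0b0001, "ADD": 0b0010, "STORE": 0b0011}


def assemble(source):
    """Translate each assembly line into exactly one 8-bit machine word."""
    words = []
    for line in source.strip().splitlines():
        mnemonic, operand = line.split()
        value = int(operand)
        if not 0 <= value <= 15:
            raise ValueError("operand does not fit in 4 bits")
        words.append(format(OPCODES[mnemonic] << 4 | value, "08b"))
    return words


program = """
LOAD 5
ADD 6
STORE 7
"""
print(assemble(program))   # ['00010101', '00100110', '00110111']
```

Three source lines produce three machine words, which is why assembly code keeps the efficiency of machine language while being easier to read.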

High level languages
These languages are not machine dependent and they use a near-English format (words) to express logic. They use familiar mathematical symbols, such as '+' for addition.

High level languages are processed by a compiler, which translates the source program into a low level language (either assembly language or directly machine language). A compiler checks the syntax (the formal rules of grammar) and the semantics (the rules which give meaning to the sentences) and produces an output.

[Figure: Source Program -> (one-to-many translation) -> Object Program -> (one-to-one translation) -> Executable Program]
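The one-to-many nature of this translation can be glimpsed with Python's standard `dis` module. Note the assumption: CPython compiles to bytecode for a virtual machine, not to hardware machine language, so this is only an analogy for what a compiler does. The helper name low_level_steps is made up.

```python
import dis


def low_level_steps(statement):
    """Compile one high-level statement and list the resulting instructions."""
    code = compile(statement, "<example>", "exec")
    return [instruction.opname for instruction in dis.get_instructions(code)]


steps = low_level_steps("total = price + tax")
print(len(steps), "low-level instructions for one statement:")
for name in steps:
    print("  ", name)
```

The exact instruction names differ between Python versions, but a single source statement always expands into several lower-level steps.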

High level languages lack efficiency and flexibility as automated features are always generic.
