
Teach yourself the Fundamentals of

Data Representation

for AQA GCSE Computer Science (8520)

Students Workbook

By Nichola Lacey

Contents

Introduction ...... 3

How to use this book ...... 3

Who should use this book? ...... 3

What are number bases? ...... 4

Decimal (base 10) ...... 4

Binary (base 2) ...... 5

What can be represented using binary? ...... 7

Hexadecimal (base 16) ...... 8

Why is hexadecimal used? ...... 8

End of chapter recap ...... 11

Converting between number bases ...... 12

Convert from binary to decimal ...... 12

Convert from decimal to binary ...... 15

Number base notation ...... 16

Convert from binary to hexadecimal ...... 17

Convert hexadecimal to binary ...... 18

Convert hexadecimal to decimal ...... 19

Convert from decimal to hexadecimal ...... 20

End of chapter recap ...... 21

Units of information ...... 22

Bits ...... 22

Bytes ...... 22

Amount of storage space required ...... 23

End of chapter recap ...... 23

Binary Arithmetic ...... 24

Binary Shifts ...... 27

End of chapter recap ...... 28

Character encoding ...... 29

7-bit ASCII ...... 29

Unicode ...... 32

End of chapter recap ...... 33

Representing Images ...... 34

Pixels ...... 34

Colour Depth ...... 37

Calculating ...... 41

Converting binary into a bitmap ...... 42

End of chapter recap ...... 45

Representing Sound ...... 46

Creating a digital sound wave ...... 46

Calculate sound file sizes ...... 50

End of chapter recap ...... 51

Data Compression ...... 52

Huffman coding ...... 54

Creating your Huffman code ...... 55

Calculating the used with data compression ...... 61

Run Length Encoding (RLE) ...... 63

End of chapter recap ...... 65

Answers ...... 66

Introduction

Computers store text, images and sound as binary, and this book has been written to give you a practical, hands-on approach to help you learn how this is done and how data is compressed to reduce file size. Instead of chapters of technical jargon and mind-numbing tedium, the theory is broken down into smaller, manageable chunks with practical tasks for you to perform as you go along. This helps you understand the theory and remember it as you apply it to practical problems.

How to use this book

It is recommended that you start at the beginning and work through the chapters in order as each chapter builds on the knowledge you have gained from the previous one.

You are not expected to be a passive passenger on this journey; if you want to know how data is represented in computer systems you will need to do a bit of work to achieve this. It is highly recommended that you perform the tasks as instructed, even if some of them seem a little bizarre. They are all included for a reason and will help you learn the theory behind data representation. If you get stuck, the answers to all the tasks (where there is a definite answer) are given at the back of this book (page 66), but try not to cheat; give the tasks a go first.

Who should use this book?

This book was specifically written to assist students preparing for their AQA GCSE Computer Science examination (8520) and the objectives have been written specifically to match the syllabus, as of February 2018. However, the theory and methods would be beneficial to anybody who wants to know how data is represented in computer systems.

What are number bases?

Objective: Understand the following number bases: decimal (base 10), binary (base 2) and hexadecimal (base 16).

Decimal (base 10)

Since you first learnt to recognise numbers you have been taught to use a base 10 number system; this is known as a decimal (or denary) number base. It has 10 different digits: 0, 1, 2, 3, 4, 5, 6, 7, 8 and 9.

A decimal number system uses 10 digits (0 – 9) to represent the value.

There is no single digit for the number ten and we use two digits (a 1 and a 0) to represent the place value we know as 10. This stands for "1 ten and 0 ones".

To represent any number above 9, we use different values each worth ten times more than the previous column (starting from the right). Take for example the number two thousand, nine hundred and thirty-five (2,935). This can be split into separate columns, each representing a different value.

Thousands   Hundreds   Tens   Ones
2           9          3      5

Each of the columns is worth ten times more than the one to its right. Using the example above, the 3 represents the value 30, the 9 represents the value 900 and the 2 represents the value 2000. Adding those values together we get the following:

  2000
   900
    30
+    5
------
  2935

Task 1: Split the following numbers into their correct columns and then write them in the final column as a decimal number.

Explanation   Thousands   Hundreds   Tens   Ones   Decimal number

One thousand, four hundred and fifty-six

Nineteen ninety-nine

Six thousand and twenty

Binary (base 2)

Computers use electrical pulses and these can either be on or off. As there are only two states, a binary number system is used by most computer systems. This on or off state is represented by a 1 for on and a 0 for off. Whereas a base 10 number system uses 10 digits (0 – 9), a base 2 number system uses 2 digits (0 – 1). The base 10 number system has columns which are worth ten times the amount of the previous column, and a base 2 number system has columns worth twice as much as the previous column.

A binary number system uses 2 digits (0 and 1) to represent the value.

Eight   Four   Two   One
1       0      1     1

Using the example above, the first 1 represents 8, there is nothing in the four-place column, 1 in the two column and 1 in the one column. If we add those column values together we get 11.

   8
   2
+  1
----
  11

Therefore, the binary number 1011 is the same value as 11 in decimal.

Task 2: Place the following numbers into their correct columns and then write them as a binary number by combining them together. Finally, work out the decimal value by adding together the column headings which contain a 1. For example, in the first row the decimal value will be 11 (8 + 2 + 1); therefore the binary value 1011 is the same as 11 in decimal.

Explanation   Eight   Four   Two   One   Binary number   Decimal value

1 x 8, 0 x 4, 1 x 2 and 1 x 1

1 x 8, 1 x 4, 0 x 2 and 0 x 1

1 x 2 and 1 x 1

What can be represented using binary?

Objective: Understand that computers use binary to represent all data and instructions.

Computers use binary to store data. A binary digit, more commonly known as a bit, is the smallest piece of data possible in a computer system; it is either a 1 or a 0.

The circuits in a computer's processor are made up of billions of transistors. A transistor is a microscopic device that opens and closes circuits to allow electrical signals to either flow or not flow through the circuit. The digits 1 and 0 used in binary reflect the on and off states of a transistor.

To give you an idea of the size of these transistors, a human hair is approximately 80,000 to 100,000 nanometres wide. In 2016, researchers at Berkeley Lab created a working transistor with a gate just 1 nanometre long.

Computer programs, as you know them, are written in a “high-level language” which, although it may be full of technical jargon, is more human-friendly than the language used by computers. High-level languages must be converted into machine code, which is made up of binary commands that tell the computer what to do, where to store the data and where they are in the program. Programmers write code in a high-level language and this is converted by a translator into binary instructions that the processor can execute.

All software, programs, images, music, documents, video and any other information that is processed by a computer is stored using these on/off electrical pulses in the computer and can be represented in binary. We will be looking at how this is done later in the book.

Hexadecimal (base 16)

We have learnt that a base 10 number system uses 10 digits and a base 2 number system uses 2 digits, so you will not be surprised that a base 16 number system uses 16 digits. “But wait,” I hear you cry. “There are only 10 possible number digits, what do we use for the other 6?” Good question and I’m glad you are paying attention. They could have chosen anything to be the extra symbols for the additional digits, but to make things easier they decided to use letters of the alphabet as most keyboards have these already. In fact, specifically, they use the first 6 uppercase characters of the alphabet so the digits available for a hexadecimal system are:

A hexadecimal number system uses 16 digits (0 – 9 and A – F) to represent the value.

0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E and F.

Each column in a hexadecimal number is worth 16 times the amount of the previous column.

4096   256   16   1
6      B     2    8

Therefore, the hexadecimal value 6B28 is worth 27,432 in a decimal number system. Luckily for a GCSE in Computer Science, you only need to be able to convert a number up to the decimal value of 255 which is two hexadecimal positions, which makes calculations much easier to work with.
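If you would like to check a column calculation like this on a computer, here is a small Python sketch of the same idea. It is only an illustration and not something you need for the exam; the names HEX_DIGITS and hex_to_decimal are made up for this example.

```python
# The 16 hexadecimal digits in order, so the position of each digit
# in this string is its value (A = 10, B = 11 ... F = 15).
HEX_DIGITS = "0123456789ABCDEF"


def hex_to_decimal(hex_string):
    """Add up (digit value x column value), starting from the right-hand column."""
    total = 0
    column_value = 1  # the right-hand column is worth 1
    for digit in reversed(hex_string.upper()):
        total += HEX_DIGITS.index(digit) * column_value
        column_value *= 16  # each column is worth 16 times the previous one
    return total


print(hex_to_decimal("6B28"))  # 6 x 4096 + 11 x 256 + 2 x 16 + 8 x 1 = 27432
```

Python can also do this for you with int("6B28", 16), which is a handy way to double-check your own working.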

Why is hexadecimal used?

Objective: Explain why hexadecimal is often used in computer science.

The reason people use hexadecimal numbers is as a shorthand notation for the long binary numbers you may have to work with. It shortens them considerably, and with fewer digits to work with you are less likely to make typing errors or note down the value incorrectly.

For instance, symbols in word processors often show their hexadecimal number and you can see this in some applications.

They are also found when referring to colours. Colours are usually made up from three numbers known as the RGB code (which stands for red, green, blue) and these are displayed as hexadecimal numbers.

As we have seen, binary is used by computers and one hexadecimal digit can be used to represent 4 binary digits, which makes converting between binary and hexadecimal neat and easier than using, say, a base 20 number system.

Here is a table showing the decimal values from 0 to 15 in both binary and hexadecimal.

Decimal   Binary   Hexadecimal
0         0000     0
1         0001     1
2         0010     2
3         0011     3
4         0100     4
5         0101     5
6         0110     6
7         0111     7
8         1000     8
9         1001     9
10        1010     A
11        1011     B
12        1100     C
13        1101     D
14        1110     E
15        1111     F

Here is the same number represented in the three number bases:

Decimal       2,890
Binary        1011 0100 1010
Hexadecimal   B4A

In the next chapter, you will be learning how to convert between these different number bases for yourself but for now you only need to understand the differences between the three number bases.

End of chapter recap

Task 3: Using the table on the previous page, convert these binary numbers into hexadecimal to find the hidden words.

Binary number Hidden Word

1011 1110 1110 1111

1011 1110 1101

1101 1110 1010 1111

1010 1101 1101

1010 1100 1110

1011 1110 1010 1101

1111 1110 1110 1101

1011 1010 1101

1100 1010 1011

Converting between number bases

Objective: Be able to convert in both directions between binary and decimal, binary and hexadecimal, and decimal and hexadecimal.

Convert from binary to decimal

The easiest conversion is from binary to decimal. We have already seen that each column is worth double the previous column; you just need to remember three things:

• The columns start from the right (the least significant place).
• The first column is worth 1.
• Each column (moving to the left) is worth double the previous column.

Here is an example of the column place values:

128 64 32 16 8 4 2 1

As you can see it starts from the right with the value 1 and they double in value as they move to the left.

To work out the decimal value for a binary number you simply insert each individual digit into the columns starting from the right. For instance the binary number 1011 would appear in the columns as follows:

128   64   32   16   8   4   2   1
                     1   0   1   1

Now you need to add up the column value for any columns that contain a 1.

   8
   2
+  1
----
  11

Therefore, the binary value of 1011 is equivalent to 11 in decimal.
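The three rules above can also be written as a short Python sketch. This is just one possible way of expressing the column method (the function name binary_to_decimal is invented for this example), so treat it as a way of checking your working rather than as the official method.

```python
def binary_to_decimal(bits):
    """Add up the column headings of every column that contains a 1."""
    total = 0
    column_value = 1  # the first column (on the right) is worth 1
    for bit in reversed(bits):  # the columns start from the right
        if bit == "1":
            total += column_value
        column_value *= 2  # each column to the left is worth double
    return total


print(binary_to_decimal("1011"))  # 8 + 2 + 1 = 11
```

Try the tasks by hand first; you will not have a computer to help you in the exam.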

Task 4: Convert these binary numbers into decimal. First, insert the digits into the correct columns, starting from the right, and then add up the columns containing a 1 to find the decimal equivalent.

Binary number 128 64 32 16 8 4 2 1 Decimal value

110

11001

10101

100111

1011000

1100001

10101100

11111111

More often you will not have a neat table in which to lay out your binary numbers and people tend to simply note down the numbers above each of the binary digits to help them work out the column values.

In the example shown, the binary number 1011010 has the decimal value 90 (64 + 16 + 8 + 2).

Task 5: Convert these binary numbers into decimal.

Binary number Decimal value

101

1101

10111

110010

1111100

1001001

10101110

11011001

Convert from decimal to binary

To convert from decimal to binary you need to perform a little more maths.

Let’s familiarise ourselves with the column headings again.

128 64 32 16 8 4 2 1

Step 1: Decide on the column to start with. This should be the largest column value that is lower than or equal to the value you are converting, so if we wanted to convert 50 to binary we would start with the 32 column. Enter a 1 in that column.

Step 2: Find the remainder (50 – 32 = 18).

Step 3: Repeat steps 1 and 2 with the remainder until there is nothing left over (in this case we would also put a 1 in the 16 and the 2 columns).

Step 4: Fill in the other columns with 0s. Please note: you do not need to add 0s before your first 1 as these are unnecessary. Using the example of 50, our binary number would be 110010 (32 + 16 + 2).
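Here is a rough Python sketch of the steps above, assuming we only need whole numbers from 0 to 255 (8 columns). The names COLUMNS and decimal_to_binary are made up for this illustration.

```python
# The eight column headings, largest first.
COLUMNS = [128, 64, 32, 16, 8, 4, 2, 1]


def decimal_to_binary(value):
    """Convert a whole number from 0 to 255 by taking away column values."""
    bits = ""
    remainder = value
    for column in COLUMNS:
        if column <= remainder:
            bits += "1"          # this column fits, so enter a 1...
            remainder -= column  # ...and work out what is left over
        else:
            bits += "0"          # this column is too big, so enter a 0
    # The 0s before the first 1 are unnecessary (but keep one 0 for zero itself).
    return bits.lstrip("0") or "0"


print(decimal_to_binary(50))  # 110010
```

Working from the largest column downwards does the same job as picking the starting column in step 1: every column that is too big simply gets a 0.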

Task 6: Convert decimal values into binary numbers.

Decimal value 128 64 32 16 8 4 2 1 Binary number

27

33

52

63

85

207


Number base notation

As you may have worked out, the number “10” could mean ten in decimal or it could be two in binary. To make it clear which number base is being used, it is common to include the number base as a subscript after the number. For instance, 100₂ would be a binary number (in this case it is equivalent to four in decimal) as opposed to 100₁₀ which is one hundred in decimal.

Task 7: Convert the following numbers. If the number is currently a binary number convert it to a decimal number and if it is currently a decimal number, convert it to a binary number.

Original number Converted number

1110₂

10₁₀

101₁₀

11001110₂

110₁₀

11000011₂

Convert from binary to hexadecimal

Each hexadecimal digit is one of 16 possible values (0 – 9 and A – F). Neatly, 4 binary digits (bits) can also be used to represent the values 0 to 15, which is 16 possible values.

Each hexadecimal digit, therefore, can be represented by 4 bits. Look back on task 3 (page 11) where you created words out of binary blocks. Each of those binary blocks contains 4 bits.

Step 1: To convert a binary number into hexadecimal you need to split the binary number into blocks of 4 but this needs to start from the right-hand number (the least significant bit). Therefore, 1101101 would become 110 1101.

Step 2: Convert each of these blocks of 4 into a decimal number. If the number is over 9 then use the following: 10 = A, 11 = B, 12 = C, 13 = D, 14 = E and 15 = F. Using our example of 110 1101, the first block (110) is 6 and the second block (1101) is 13, which is D, so together they make 6D.
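The two steps can be sketched in Python as follows. This is only an illustration: the names HEX_DIGITS and binary_to_hex are invented here, and instead of splitting from the right the sketch adds 0s to the front of the number until its length is a multiple of 4, which gives exactly the same blocks.

```python
HEX_DIGITS = "0123456789ABCDEF"


def binary_to_hex(bits):
    """Split a binary number into blocks of 4 and convert each block."""
    # Step 1: add 0s to the front so the blocks of 4 line up from the right.
    padding = (-len(bits)) % 4
    bits = "0" * padding + bits
    hex_string = ""
    for start in range(0, len(bits), 4):
        block = bits[start:start + 4]
        # Step 2: the value of the block (0 to 15) picks out the hex digit.
        hex_string += HEX_DIGITS[int(block, 2)]
    return hex_string


print(binary_to_hex("1101101"))  # blocks 0110 and 1101 give 6D
```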

Task 8: Convert the following binary numbers into hexadecimal

Binary Hexadecimal

1011 0011₂

1101 0111₂

11 0110₂

11101111₂

1110111₂

11111000₂

10111₂

10111010₂

Convert hexadecimal to binary

To convert back from hexadecimal to binary you simply need to reverse the process.

Step 1: Convert each single hexadecimal digit into a decimal number (either the number is the number shown or, if it is a letter, use A = 10, B = 11, C = 12, D = 13, E = 14 and F = 15).

Step 2: Convert each of those decimal numbers into binary using the technique you learnt on page 15. Add 0s to the front of each block where needed so that every block contains 4 bits.

Step 3: Combine those numbers together into a single binary number (remove the spaces).
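As a quick illustration, here is how the reverse process might look in Python (the function name hex_to_binary is made up for this sketch). Notice that every hexadecimal digit is turned into a full block of 4 bits, including any 0s at the front, otherwise the blocks would not join together correctly.

```python
HEX_DIGITS = "0123456789ABCDEF"


def hex_to_binary(hex_string):
    """Turn each hexadecimal digit into a 4-bit block and join the blocks."""
    blocks = []
    for digit in hex_string.upper():
        value = HEX_DIGITS.index(digit)      # Step 1: digit to decimal (A = 10 ... F = 15)
        blocks.append(format(value, "04b"))  # Step 2: decimal to a 4-bit binary block
    return "".join(blocks)                   # Step 3: combine the blocks


print(hex_to_binary("6D"))  # 0110 and 1101 combine to give 01101101
```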

Task 9: Convert the following hexadecimal numbers into binary.

Hexadecimal Binary

82₁₆

C3₁₆

F0₁₆

2B₁₆

4A₁₆

Convert hexadecimal to decimal

There are several ways to do this, all of which involve complicated maths using your 16 times table. Not many of us know our 16 times table (unless you are that delightful woman on Countdown of course) so the best advice I can give is to follow these much simpler steps which, although they may be more long-winded, are less likely to make your brain hurt or lead to a mistake in your calculations:

Step 1: Convert the hexadecimal number to binary (see page 18)

Step 2: Convert the binary number to decimal (see page 12). Make sure you are using the whole binary number and working out the column headings for all of its digits, rather than working with the blocks of 4 individually.
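To see why the whole binary number matters, here is a small Python sketch of the two steps using the hexadecimal number 6D. The function name hex_to_decimal_via_binary is invented for this example, and it leans on Python's built-in int and format functions to do the individual conversions.

```python
def hex_to_decimal_via_binary(hex_string):
    """Return the binary form and the decimal value of a hexadecimal number."""
    # Step 1: hexadecimal to binary, one 4-bit block per hexadecimal digit.
    bits = "".join(format(int(digit, 16), "04b") for digit in hex_string)
    # Step 2: convert the WHOLE binary number to decimal in one go.
    return bits, int(bits, 2)


bits, value = hex_to_decimal_via_binary("6D")
print(bits, value)  # 01101101 109
```

If you converted the two blocks separately you would get 6 and 13, which is not the same as 109 (64 + 32 + 8 + 4 + 1).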

Task 9: Convert the following hexadecimal numbers into decimal by first converting it to binary and then converting the whole binary number to decimal.

Hexadecimal Binary Decimal

1A₁₆

B8₁₆

F9₁₆

4E₁₆

BC₁₆

52₁₆

Convert from decimal to hexadecimal

To reverse the process and convert from decimal to hexadecimal you need to do the following:

Step 1: Convert the decimal number to binary (see page 15)

Step 2: Convert from binary to hexadecimal (see page 17).

Task 10: Convert the following decimal numbers into hexadecimal by first converting it to binary and then converting the binary number to hexadecimal.

Decimal Binary Hexadecimal

12₁₀

49₁₀

78₁₀

95₁₀

116₁₀

255₁₀

End of chapter recap

Task 11: Fill in the blanks on this conversion table.

Decimal Binary Hexadecimal

5₁₀

10001001₂

5B₁₆

10101101₂

27₁₀

E7₁₆

11100011₂

163₁₀

11010001₂

66₁₀

7C₁₆

1011000₂

Units of information

Objective: Know that a bit is the fundamental unit of information and a byte is a group of 8 bits. Know that quantities of bytes can be described using prefixes (kB for kilobyte, etc.).

Bits

A bit is either a 0 or a 1. Bit is an abbreviation of the term Binary Digit (B-it) and is noted with a lowercase b.

A bit is the smallest piece of data stored on a computer.

Bytes

A byte is the name given to a group of 8 bits (from 00000000 to 11111111). A byte can represent 256 different whole numbers and is noted with an uppercase B.

A byte is a group of 8 bits.

The table below shows the different units of information you should be aware of:

Name        Abbreviation   Multiple of…
Bit         b              -
Byte        B              8 bits
Kilobyte    kB             1000 bytes
Megabyte    MB             1000 kilobytes
Gigabyte    GB             1000 megabytes
Terabyte    TB             1000 gigabytes
Petabyte    PB             1000 terabytes
Exabyte     EB             1000 petabytes
Zettabyte   ZB             1000 exabytes
Yottabyte   YB             1000 zettabytes

Please note: The terms petabyte, exabyte, zettabyte and yottabyte are not needed for the AQA GCSE Computer Science examination (8520) and have only been included for your information.
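As each unit in the table is 1000 times bigger than the one before, moving between them is just a matter of repeatedly dividing by 1000. The Python sketch below shows one way this could be done; the function name describe_size and the example sizes are made up for illustration, and it stops at terabytes as that is the largest unit you need for the exam.

```python
def describe_size(number_of_bytes):
    """Describe a number of bytes using the largest sensible unit."""
    units = ["B", "kB", "MB", "GB", "TB"]
    size = float(number_of_bytes)
    index = 0
    # Keep dividing by 1000 and moving to the next unit while we can.
    while size >= 1000 and index < len(units) - 1:
        size /= 1000
        index += 1
    return f"{size:g} {units[index]}"


print(describe_size(90))       # 90 B
print(describe_size(4000))     # 4 kB
print(describe_size(3500000))  # 3.5 MB
```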

Amount of storage space required

Different types of data require different amounts of storage space. Some examples of this follow:

Data                                     Approximate file size
A typical line of text from a book       90 bytes
One page of text from a standard novel   4 kB
MP3 song                                 3.5 MB
A Blu-ray movie                          22 GB

End of chapter recap

Task 12: Complete the crossword

Across                Down
2) 1000 bytes         1) 1000 kilobytes
3) A single 1 or 0    3) 8 bits
4) 1000 gigabytes     5) 1000 megabytes

Binary Arithmetic

Objectives: To be able to add together up to three binary numbers.

Before we look at adding in binary, let’s have a quick recap of the first few binary numbers, using three binary digits:

Binary   Decimal equivalent
000      0
001      1
010      2
011      3
100      4
101      5
110      6
111      7

Now let’s have a look at some very simple binary maths:

Binary arithmetic   Binary answer   Decimal equivalent
0 + 0               0               0 + 0 = 0
0 + 1               1               0 + 1 = 1
1 + 0               1               1 + 0 = 1
1 + 1               10              1 + 1 = 2
1 + 1 + 1           11              1 + 1 + 1 = 3
1 + 1 + 1 + 1       100             1 + 1 + 1 + 1 = 4

When you are adding two binary numbers together, for instance 1010 and 111, lay them out as follows:

  1 0 1 0
+   1 1 1

Description Example

Take the first column on the right and add together the individual 1’s. In this case 0 + 1 = 1

Next take the next column and add together the 1’s. In this case 1 + 1 = 10. Don’t forget we are adding in binary so the answer should also be in binary. Put the 0 in the same column and carry the 1. This is usually shown below the bottom line.

In the next column don’t forget to include the 1 that has been carried forward. In this example the calculation is 0 + 1 + 1 which is 10. Again, put the 0 in the column and carry the 1.

In the final column add together the digits, including the carried digit. In this case 1 + 1 = 10. As there are no other columns instead of putting the 1 that is being carried below the line, move it up to the main answer row.

The answer to the sum 1010 + 111 is 10001. You can check this is correct by converting it to decimal: 1010 (ten) + 111 (seven) = 10001 (seventeen).
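The column-by-column method described above, including the carried 1s, can be sketched in Python like this. The function name add_binary is made up for this illustration and it only adds two numbers at a time.

```python
def add_binary(first, second):
    """Add two binary numbers one column at a time, carrying where needed."""
    # Line the numbers up by adding 0s to the front of the shorter one.
    width = max(len(first), len(second))
    first = first.zfill(width)
    second = second.zfill(width)

    answer = ""
    carry = 0
    # Start with the right-hand column and work towards the left.
    for column in range(width - 1, -1, -1):
        total = int(first[column]) + int(second[column]) + carry
        answer = str(total % 2) + answer  # the digit that stays in this column
        carry = total // 2                # the digit carried to the next column
    if carry:
        answer = "1" + answer  # no columns left, so the carry joins the answer
    return answer


print(add_binary("1010", "111"))  # 10001
```

To add three numbers you could add the first two and then add the third to the result.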

Task 13: Complete the following binary addition

1 1 0 0 1 0 0 1 1 0 1 1 1 0 + 0 1 1 0 + 1 1 1 1 + 1 1 0 1

1 1 0 0 0 1 1 1 0 0 1 1 0 0 0 1 0 1 0 1 1 0 + 0 1 1 0 1 + 1 1 1 0 0 + 0 1 1 1

11110000 + 10011 = 1101011 + 10111 1111 + 111111 + 1101001 = + 11111111 =

Use this space to convert the binary numbers into decimal to check if your answers are correct.

Binary Shifts

Objectives: Be able to apply a binary shift to a binary number. Describe situations where binary shifts can be used.

Let’s take an 8-bit binary number, a byte:

0 0 1 0 1 0 0 0

We could shift those binary digits one place to the right:

0 0 0 1 0 1 0 0

or one place to the left:

0 1 0 1 0 0 0 0

Notice that as you move them left or right extra 0’s are added to fill in the gaps. All the digits that fall off the end are lost and get deleted. But why would you bother shifting binary bits anyway?

Let’s have a look at what happens if we keep moving the digits to the right.

128   64   32   16   8   4   2   1   Decimal
0     0    0    1    0   1   0   0   20
0     0    0    0    1   0   1   0   10
0     0    0    0    0   1   0   1   5
0     0    0    0    0   0   1   0   2
0     0    0    0    0   0   0   1   1

You will see that every time the number is shifted to the right it divides by 2. The only exception is the second to last row which shows 2 rather than 2.5 as it is only working with whole numbers so the 1 bit is deleted once it falls off the end of the byte.

If we shift the numbers to the left it performs a different calculation:

128   64   32   16   8   4   2   1   Decimal
0     0    0    1    0   1   0   0   20
0     0    1    0    1   0   0   0   40
0     1    0    1    0   0   0   0   80
1     0    1    0    0   0   0   0   160

When you shift to the left it doubles the value. This is because we are working with a base 2 number system and each column is worth double the previous column.

Shifting the binary digits to the left doubles the value and shifting to the right halves the value.

End of chapter recap

Task 14: Perform a binary shift on each decimal number by converting the number into binary and writing that in the Original binary number column. Perform the binary shift and write the new binary number in the New binary number column. Finally, convert the new binary number into decimal. Remember, if digits don’t fit in the 8-bit grid they get deleted.

Original decimal number   Original binary number   Instruction      New binary number   New decimal number
24                                                 Double it
34                                                 Halve it
50                                                 Multiply by 4
60                                                 Divide by 4
38                                                 Multiply by 8
76                                                 Divide by 8
9                                                  Multiply by 16
168                                                Divide by 16

As you can see, the binary shift allows you to perform simple multiplication and division by powers of 2 (i.e. multiplying and dividing by 2, 4, 8, 16, 32, 64 and 128). However, as we are working with 8 bits, any shift which moves a 1 off either end will give you a wrong answer.

For the AQA GCSE Computer Science examination (8520) you will not need to understand negative numbers and do not need to investigate how to resolve this problem; however, you do need to be aware of the problem of shifting digits outside the 8 bits.
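Most programming languages have shift operators built in. The Python sketch below uses << (shift left) and >> (shift right) and then keeps only the right-hand 8 bits, which is an assumption made here to copy the 8-bit grid used in this chapter (Python itself would happily keep the extra digits). The function names are made up for this example.

```python
def shift_left(value, places):
    """Shift left, then keep only 8 bits so digits falling off the end are lost."""
    return (value << places) & 0b11111111


def shift_right(value, places):
    """Shift right; digits falling off the right-hand end are lost."""
    return value >> places


print(format(shift_left(20, 1), "08b"), shift_left(20, 1))    # 00101000 40
print(format(shift_right(5, 1), "08b"), shift_right(5, 1))    # 00000010 2
print(format(shift_left(160, 1), "08b"), shift_left(160, 1))  # 01000000 64
```

The last line shows the problem: doubling 160 should give 320, but the left-hand 1 falls off the end of the byte and we are left with 64.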

Character encoding

Objective: Understand what a character is and be able to describe the following character encoding methods: 7-bit ASCII and Unicode.

7-bit ASCII

Look at the following table, known as a character set:

Binary     Dec   Hex   Glyph       Binary     Dec   Hex   Glyph       Binary     Dec   Hex   Glyph
010 0000   32    20    (space)     100 0000   64    40    @           110 0000   96    60    `
010 0001   33    21    !           100 0001   65    41    A           110 0001   97    61    a
010 0010   34    22    "           100 0010   66    42    B           110 0010   98    62    b
010 0011   35    23    #           100 0011   67    43    C           110 0011   99    63    c
010 0100   36    24    $           100 0100   68    44    D           110 0100   100   64    d
010 0101   37    25    %           100 0101   69    45    E           110 0101   101   65    e
010 0110   38    26    &           100 0110   70    46    F           110 0110   102   66    f
010 0111   39    27    '           100 0111   71    47    G           110 0111   103   67    g
010 1000   40    28    (           100 1000   72    48    H           110 1000   104   68    h
010 1001   41    29    )           100 1001   73    49    I           110 1001   105   69    i
010 1010   42    2A    *           100 1010   74    4A    J           110 1010   106   6A    j
010 1011   43    2B    +           100 1011   75    4B    K           110 1011   107   6B    k
010 1100   44    2C    ,           100 1100   76    4C    L           110 1100   108   6C    l
010 1101   45    2D    -           100 1101   77    4D    M           110 1101   109   6D    m
010 1110   46    2E    .           100 1110   78    4E    N           110 1110   110   6E    n
010 1111   47    2F    /           100 1111   79    4F    O           110 1111   111   6F    o
011 0000   48    30    0           101 0000   80    50    P           111 0000   112   70    p
011 0001   49    31    1           101 0001   81    51    Q           111 0001   113   71    q
011 0010   50    32    2           101 0010   82    52    R           111 0010   114   72    r
011 0011   51    33    3           101 0011   83    53    S           111 0011   115   73    s
011 0100   52    34    4           101 0100   84    54    T           111 0100   116   74    t
011 0101   53    35    5           101 0101   85    55    U           111 0101   117   75    u
011 0110   54    36    6           101 0110   86    56    V           111 0110   118   76    v
011 0111   55    37    7           101 0111   87    57    W           111 0111   119   77    w
011 1000   56    38    8           101 1000   88    58    X           111 1000   120   78    x
011 1001   57    39    9           101 1001   89    59    Y           111 1001   121   79    y
011 1010   58    3A    :           101 1010   90    5A    Z           111 1010   122   7A    z
011 1011   59    3B    ;           101 1011   91    5B    [           111 1011   123   7B    {
011 1100   60    3C    <           101 1100   92    5C    \           111 1100   124   7C    |
011 1101   61    3D    =           101 1101   93    5D    ]           111 1101   125   7D    }
011 1110   62    3E    >           101 1110   94    5E    ^           111 1110   126   7E    ~
011 1111   63    3F    ?           101 1111   95    5F    _

This is known as the 7-bit ASCII table as each character is represented by 7 binary bits. The characters are arranged in sequence; for instance, you can see that a = 97, b = 98 and c = 99. This means that if we are told the value of one character we can easily work out the value of subsequent or prior characters.
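Because the letters run in sequence, working out one character from another is simply a matter of counting along the alphabet. Here is a small Python sketch of that idea; the function name code_from_known is made up for this example, and Python's built-in ord function is used at the end only to check the answer.

```python
import string


def code_from_known(known_char, known_code, target_char):
    """Work out a letter's code by counting along from a letter we already know."""
    if known_char.isupper():
        alphabet = string.ascii_uppercase
    else:
        alphabet = string.ascii_lowercase
    # How many places along the alphabet is the target from the known letter?
    offset = alphabet.index(target_char) - alphabet.index(known_char)
    return known_code + offset


print(code_from_known("a", 97, "c"))  # c is 2 places after a, so 97 + 2 = 99
print(ord("c"))                       # Python's own answer: 99
print(format(ord("a"), "07b"))        # a as 7 binary bits: 1100001
```

This only works when both letters are the same type (both upper case or both lower case), as each type has its own run of numbers.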

There are some additional characters known as invisible characters:

Binary     Dec   Hex   Abbr.   Description
000 0000   0     00    [NUL]   Null character
000 0001   1     01    [SOH]   Start of Header
000 0010   2     02    [STX]   Start of Text
000 0011   3     03    [ETX]   End of Text
000 0100   4     04    [EOT]   End of Transmission
000 0101   5     05    [ENQ]   Enquiry
000 0110   6     06    [ACK]   Acknowledgement
000 0111   7     07    [BEL]   Bell
000 1000   8     08    [BS]    Backspace
000 1001   9     09    [HT]    Horizontal Tab
000 1010   10    0A    [LF]    Line feed
000 1011   11    0B    [VT]    Vertical Tab
000 1100   12    0C    [FF]    Form feed
000 1101   13    0D    [CR]    Carriage return
000 1110   14    0E    [SO]    Shift Out
000 1111   15    0F    [SI]    Shift In
001 0000   16    10    [DLE]   Data link escape
001 0001   17    11    [DC1]   Device control 1
001 0010   18    12    [DC2]   Device control 2
001 0011   19    13    [DC3]   Device control 3
001 0100   20    14    [DC4]   Device control 4
001 0101   21    15    [NAK]   Negative acknowledgement
001 0110   22    16    [SYN]   Synchronous idle
001 0111   23    17    [ETB]   End of trans. block
001 1000   24    18    [CAN]   Cancel
001 1001   25    19    [EM]    End of medium
001 1010   26    1A    [SUB]   Substitute
001 1011   27    1B    [ESC]   Escape
001 1100   28    1C    [FS]    File separator
001 1101   29    1D    [GS]    Group separator
001 1110   30    1E    [RS]    Record separator
001 1111   31    1F    [US]    Unit separator
111 1111   127   7F    [DEL]   Delete

In the following message:

Hello world,
Computers are fun!

you may assume there are only 30 characters (including the spaces), but in fact there are a few invisible characters involved too:

[STX] Hello world,[CR] Computers are fun! [ETX]
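Here is a short Python sketch showing how a message, including its invisible characters, could be turned into 7-bit binary. The message "Hi!" and the function name to_seven_bit are made up for this example; the invisible characters are written using their hexadecimal values from the table above.

```python
STX = "\x02"  # Start of Text (hexadecimal 02)
ETX = "\x03"  # End of Text (hexadecimal 03)


def to_seven_bit(message):
    """Convert every character, visible or not, into 7 binary bits."""
    return [format(ord(character), "07b") for character in message]


message = STX + "Hi!" + ETX
print(len(message))                      # 5 characters, although only 3 are visible
print(" ".join(to_seven_bit(message)))   # 0000010 1001000 1101001 0100001 0000011
```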

Task 15: Convert the following messages into binary. Make sure you include the invisible characters; for the first example the invisible characters have been included for you.

Original message Converted into binary

[STX]LOL[ETX]

1 2 3

Computers!

Objectives: Understand that character codes are commonly grouped and run in sequence within encoding tables.

Characters of a similar type (upper case, lower case or numeric digits) run in sequence within a character set.

As we have seen, the characters are in sequence, so you can work out the characters if you know the number of one of them in the sequence. Using the following code work out the decimal numbers for the following:

A = 65

i = 105

Task 16: Work out the decimal values of the following characters WITHOUT looking at the table on page 29.

Character   Decimal value   Character   Decimal value
J                           e
T                           m

Unicode

Objectives: Describe the purpose of Unicode and the advantages of Unicode over ASCII. Know that Unicode uses the same codes as ASCII up to 127.

ASCII is fine if you live in a western country with a standard alphabet, but if you live in a country which uses a different alphabet then ASCII will not contain the characters you need. The latest version of Unicode contains 136,755 characters covering 139 modern and historic languages, as well as lots of symbols which are used in maths and other specialist areas. Here are just a few of the symbols and characters available through Unicode:

Character   Unicode number   Description
ź           017A             Latin small letter z with acute
আ           0986             Bengali letter Aa
Φ           03A6             Greek capital letter Phi
⻚           2EDA             CJK Radical C-simplified leaf
غ           063A             Arabic letter Ghain
փ           0583             Armenian small letter Piwr
ʨ           02A8             Latin small letter Tc digraph with curl
→           2192             Rightwards arrow
∞           221E             Infinity

As you can see the Unicode identification numbers are displayed in hexadecimal. You can find out the Unicode number of symbols and characters in Word, if you select the Insert ribbon and then click on the drop-down arrow next to .

The first section of the Unicode table is exactly the same as the ASCII table, they use the same characters and numbers.
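A quick way to explore this yourself is with Python's built-in `ord()` and `chr()` functions, which convert between a character and its Unicode number. The helper names `unicode_hex` and `in_ascii_range` below are assumptions made for this sketch.

```python
def unicode_hex(ch):
    """Return the Unicode number of a character as 4 hexadecimal digits."""
    return format(ord(ch), "04X")


def in_ascii_range(ch):
    """True if the character sits in the first section shared with ASCII."""
    return ord(ch) <= 127


for ch in ["A", "i", "Φ", "∞"]:
    print(ch, unicode_hex(ch), ord(ch), in_ascii_range(ch))

# chr() goes the other way, from a number back to a character.
print(chr(0x03A6))
```

The letters A and i keep the same numbers they have in ASCII (65 and 105), while Φ and ∞ have numbers far beyond 127.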

End of chapter recap

Task 17: Answer these questions to recap what you have learnt in this chapter.

1. What is the difference between ASCII and Unicode?

2. What advantage does Unicode have over ASCII?

3. If you are told that a capital letter A in Unicode is identified by the number 0041₁₆, what would the capital letter B be identified as?

4. If you are told that ⑩ in Unicode is 2469₁₆, what would 2463₁₆ give you?

Representing Images

Objective: Understand what a pixel is and be able to describe how many pixels relate to an image and the way images are displayed.

Pixels

Pixel stands for “Picture Element” and refers to the way bitmap images are made up of lots of tiny squares of colour which, when they are small, blend together to make one large image.

In the example above, when a small part of the image is expanded you can see the individual pixels in greater detail. In this instance the image was expanded by 400% so you can see the pixels.

You may have noticed this happening if you copy an image from the internet and then expand it. This is known as the image becoming “pixelated”.

A pixelated image is one where the individual pixels are clearly visible.

The pixels are displayed in neat rows and columns and a computer’s screen will have millions of these pixels to make up the display.

Even on a toolbar, which you may assume only has one or two colours, you will notice how it becomes pixelated when you zoom into it, and you will see the variety of shades that are used to make up the image.

Objective: Describe the following for bitmaps: size of pixels, colour depth. Know that the size of a bitmap image in pixels (width x height) is known as the image resolution.

As we have seen, when we zoomed into the images, the larger the pixels are, the grainier the image becomes. The overall size of the image also becomes larger.

Imagine this square is a very simple image.

If every pixel was ½ a centimetre wide and ½ a centimetre tall the overall image would be 1 ½ centimetres wide and 1 ½ centimetres tall and the image would have 9 pixels in it (3 x 3)

Now look at this image:

It is the same image but in this example each pixel has doubled in size and is therefore 1 centimetre wide by 1 centimetre tall, making the image 3 centimetres wide and 3 centimetres tall. It still only has 9 pixels in it; it is only the pixels that have changed size.

The pixels on a computer monitor are very small and the smaller the pixel size the better quality the image will be. The size of the pixels is measured in DPI (dots per inch), which is how many pixels fit along a line one inch long.

Task 18: WITHOUT looking it up on the internet what do you think the average resolution is for the following:

Item                             Resolution (written as DPI)
The average computer monitor
iPhone 8 screen
An image in a glossy magazine

When people talk about the image size they are not meaning the width and height of the image but rather the number of pixels that make the width and height of the image. This is because, as we have seen, the image can be re-sized but the number of pixels will stay the same.

The image size is described in pixels (width x height).

Let’s have another look at the flower image.

This image is 252 x 181 pixels (the width is always shown first and then the height). If we make the image larger or smaller, it still has the same number of pixels that make up the image.

This image still has 252 x 181 pixels.

This image still has 252 x 181 pixels.

Even though the image size may take up more or less space on the page, the number of pixels stays the same. The only thing that has changed is the number of dots per inch (DPI).

The higher the DPI the better quality the image will be. This is known as the image resolution.

Colour Depth

Each pixel can only be a single colour; this means that you cannot have two colours in the same pixel. Let’s look at a very simple image.

This image has only two colours, black and white. If this was going to be stored in a bitmap it would need to be converted into pixels. Let’s keep it simple and use an 8 x 9 grid to do this.

You will notice that the smooth lines have disappeared, and the image is pixelated as it is a very low resolution, but that doesn’t matter because we only want to see how the image will be saved. Each of those pixels is assigned a colour, black or white, and we can use binary to represent them: a 0 for white and a 1 for black.

Now a computer would not be seeing the image, it would only save the image, so let’s remove the image to see the binary.

In fact, the computer would not even see the grid. It would see a series of numbers.

000110000010010001000010101001011000000110100101010110100010010000011000

It is only because we know the image size is 8 x 9 that we could split these up into their separate rows.

00011000
00100100
01000010
10100101
10000001
10100101
01011010
00100100
00011000

All this assumes we only use 1 bit for each pixel to show either a 1 or a 0. However, if we want to use more colours we would need to use more bits for each pixel to allow us a larger number range.
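Splitting the long sequence into rows is easy once the width is known. Here is a minimal Python sketch of that step; the function name `split_rows` is an assumption made for this example.

```python
# The 72 bits saved for the simple 8 x 9 black and white image.
bits = "000110000010010001000010101001011000000110100101010110100010010000011000"


def split_rows(bits, width):
    """Cut one long bit sequence into rows that are `width` pixels wide."""
    return [bits[i:i + width] for i in range(0, len(bits), width)]


rows = split_rows(bits, 8)
for row in rows:
    print(row)
```

If you pass in a different width, the same bits are cut into rows of a different length, which is why the computer must be told the image size.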

For instance if we wanted to save this colourful image we would need two bits per pixel. 00 for white, 01 for black, 10 for yellow and 11 for blue.

This will double the number of bits needed for the entire image to

00000001010000000000011010010000000110101010010001101110101110010110 10101010100101100110100110010001100101100100000001101001000000000001 01000000

The number of colours used in an image can greatly affect the quality.

Here is the same image saved with varying colour depth:

Colour depth                             Bits used per pixel
16,777,216 possible colours per pixel    24
65,536 possible colours per pixel        16
256 possible colours per pixel           8
2 possible colours per pixel             1
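The numbers in this table follow a pattern: every extra bit doubles the number of colours, so the number of possible colours is 2 to the power of the number of bits. Here is a small Python sketch of that calculation (the function name `possible_colours` is just an illustration).

```python
def possible_colours(bits_per_pixel):
    """Each extra bit doubles the number of values a pixel can hold."""
    return 2 ** bits_per_pixel


for depth in (1, 2, 8, 16, 24):
    print(depth, "bits per pixel:", possible_colours(depth), "colours")
```

The 2-bit line matches the four-colour example earlier (white, black, yellow and blue).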

Objective: Describe, using examples, how the number of pixels and colour depth can affect the size of a bitmap image.

Task 19: As you can see from the image below the colour depth greatly alters the file size. Write, in your own words, why you think this may be.

Task 20: As you can see from the image below the image size greatly alters the file size. Write, in your own words, why you think this may be.

Calculating file size

Objective: Calculate bitmap image file sizes based on the number of pixels and colour depth.

To work out the file size you will need to know the following:

W = image width

H= image height

D = colour depth (in bits)

As long as you know these things you can make an approximate calculation of the file size.

W x H x D = File size

Let’s take our simple image from earlier:

The image is 8 pixels wide and 9 pixels tall, and a single bit is used for the colour depth.

8 x 9 x 1 = 72 bits

Let’s try another one…

The width is 252, the height is 181 and the colour depth is 24 bits per pixel.

252 x 181 x 24 = 1,094,688 bits (which is 136,836 bytes or roughly 136 kB)

You should always attempt to shorten the number into the most appropriate units (for instance kB, MB, GB etc.).

Remember to divide the number of bits by 8 first to work out the bytes and then keep dividing by 1000 until you get to the shortest unit notation (kB, MB, GB etc.)

If you look back on previous pages you will notice that the same image is actually 133 kB and not 136 kB. This is because the measurements we are using are rough estimates and there are other considerations to take into account, but for the AQA GCSE in Computer Science these calculations are all you need to be aware of.
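As a rough sketch, the whole calculation can be written in Python. The function names and the unit labels (bytes, kB, MB, GB, TB) are assumptions made for this example; the method is the one described above: multiply width, height and colour depth, divide by 8, then keep dividing by 1000.

```python
def bitmap_size_bits(width, height, colour_depth):
    """Approximate file size in bits: W x H x D."""
    return width * height * colour_depth


def shorten(bits):
    """Divide by 8 to get bytes, then by 1000 until the number is short."""
    size = bits / 8
    units = ["bytes", "kB", "MB", "GB", "TB"]
    index = 0
    while size >= 1000 and index < len(units) - 1:
        size = size / 1000
        index = index + 1
    return size, units[index]


simple = bitmap_size_bits(8, 9, 1)
flower = bitmap_size_bits(252, 181, 24)
print(simple, "bits")
print(flower, "bits")
print(shorten(flower))
```

This gives an estimate only. Real image files also store extra information, which is why the actual file size can differ slightly.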

Task 21: Work out the approximate file size using the data you have been given for the following images.

Width (in pixels)   Height (in pixels)   Colour Depth   Number of bits   Shorthand
1,280               860                  24 bits
2,500               2,000                8 bits
150                 475                  16 bits
15,000              20,000               64 bits

Converting binary into a bitmap

Objective: Convert binary data into a black and white image. Convert a black and white image into binary data.

When a computer is presented with a series of binary digits it must know the dimensions of the image it is to convert and the colour depth so it knows how many bits are representing each pixel. Let’s look at this incredibly simple image first.

010110101

We would not know how to construct it into the pattern we want unless we knew the following:

Image size: 3 x 3
Colour depth: 1 bit

Now we know that, we can input this data into a grid:

0 1 0
1 1 0
1 0 1

Finally, we can convert that grid into the pattern once we know that 0 is white and 1 is black.
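Here is a minimal Python sketch of these steps. It assumes a made-up function name, `draw`, and uses a full stop for a white pixel and a # for a black pixel so the pattern can be shown as plain text.

```python
def draw(bits, width, white=".", black="#"):
    """Turn a 1-bit sequence into a text pattern, one line per row of pixels."""
    rows = [bits[i:i + width] for i in range(0, len(bits), width)]
    lines = []
    for row in rows:
        lines.append("".join(black if bit == "1" else white for bit in row))
    return "\n".join(lines)


# The 3 x 3 example from above: 0 is white and 1 is black.
print(draw("010110101", 3))
```

Without the width of 3, the program would have no way of knowing where each row ends.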

Task 22: Convert the following binary digits into a pattern.

Image size: 4 x 5
Colour depth: 1 bit
Colour code: 0 = white and 1 = black

10110100110101010011

To convert a pattern into a binary number you just need to read across the rows and write down the 0 or 1 depending on the colour of each pixel. When you get to the end of a row simply move to the start of the next row but write the binary digits directly after the last you wrote.

Let’s have a look at this pattern

Start with the first row and the binary sequence would be 010, move down to the next row and the binary sequence is 110, however instead of writing this on a separate row you continue with the same line of digits. Eventually you would get

010110011010
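Reading a pattern row by row can be sketched in Python as follows. The pattern is stored as a list of rows of colour names. The first two rows come from the description above; the last two rows are an assumption, worked backwards from the final sequence, because the picture itself is not reproduced here. The function name `pattern_to_bits` is also made up for this example.

```python
pattern = [
    ["white", "black", "white"],  # 010
    ["black", "black", "white"],  # 110
    ["white", "black", "black"],  # assumed from the final sequence
    ["white", "black", "white"],  # assumed from the final sequence
]


def pattern_to_bits(pattern):
    """Read across each row in turn, writing 1 for black and 0 for white."""
    bits = ""
    for row in pattern:
        for colour in row:
            bits = bits + ("1" if colour == "black" else "0")
    return bits


print(pattern_to_bits(pattern))
```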

Task 23: Convert the pattern into binary digits where 0 is white and 1 is black.

End of chapter recap

Task 24: Answer these questions.

Question Your answer 1. What is a pixel?

2. How are pixels displayed in an image?

3. What is the difference between image size and file size?

4. What is meant by the term “image resolution”?

5. How can the number of pixels and the colour depth affect the file size of a bitmap image?

6. If you have an image 300 x 700 with a colour depth of 24 bits, what is the approximate file size?

7. Draw the following 4 x 5 1 bit image from the binary sequence where 0 is white and 1 is black.

01101001100111111001

Representing Sound

Objectives: Understand that sound is analogue and that it must be converted to a digital form for storage and processing in a computer.

Sound is produced by something vibrating which makes the air particles around it vibrate. These vibrations travel through the air and the delicate hairs inside our ears pick up on these subtle vibrations in the air and pass these messages to our brains that interpret them as sound. These vibrations are known as “analogue” however computers work using “digital” messages.

Here is an analogue sound wave that has been plotted on a graph.

Computers, as we know, store data digitally as binary. Everything in a computer needs to be broken down and stored in a binary format so to save an analogue sound wave on a computer it must be converted into a digital format.

Creating a digital sound wave

Objective: Understand that sound waves are sampled to create the digital version of the sound. Describe the digital representation of sound in terms of: sampling rate, sample resolution.

To store this as a digital sequence a sample must be taken of the sound wave at set intervals. Using our wave from above, we may say this is a sound wave over the period of 1 second, so we can split that second down into smaller parts.

We are going to split it into tenths of a second, so we will be taking 10 samples over that one second period. This frequency would be known as 10 hertz (10Hz) as we are taking 10 samples per second (1Hz = 1 sample per second).

Sample rate = number of samples taken in a second, measured in hertz (Hz)

At every sample point, we read the amplitude of the sound wave at that time and take note of that number. These numbers can be plotted on a graph as follows:

As you can see at the end of every tenth of a second a measurement was taken of where that line is, or the nearest number if it falls between two options. We can convert this into a 4-bit binary number for every tenth of a second.

Sample   Decimal number   Binary number
1        6                0110
2        10               1010
3        4                0100
4        7                0111
5        3                0011
6        9                1001
7        7                0111
8        11               1011
9        8                1000
10       5                0101

If we combine these into one binary number we would get:

0110101001000111001110010111101110000101
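The conversion from sample values to one long binary number can be sketched in Python. The function name `encode_samples` is an assumption made for this example; the sample values are the ten readings from the table above.

```python
samples = [6, 10, 4, 7, 3, 9, 7, 11, 8, 5]


def encode_samples(samples, bits_per_sample):
    """Write every sample as a fixed-width binary number and join them up."""
    largest = 2 ** bits_per_sample - 1
    encoded = ""
    for value in samples:
        if not 0 <= value <= largest:
            raise ValueError("sample does not fit in the sample resolution")
        encoded = encoded + format(value, "0" + str(bits_per_sample) + "b")
    return encoded


sound_bits = encode_samples(samples, 4)
print(sound_bits)
print(len(sound_bits), "bits")
```

Every sample must use the same number of bits, otherwise there would be no way of telling where one sample ends and the next begins.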

Let’s look at the two graphs together.

As you can see, the digital signal has simplified the line. If we wanted to get a more accurate representation of the original analogue signal we would need to do two things.

• Take more frequent samples (alter the sampling rate)
• Allow for more numbers to represent each amplitude (alter the sample resolution)

Sample resolution = the number of bits per sample


This is the same sound clip but this time it has been split into 20 samples a second (20 Hz) and each sample is taken from numbers ranging from 0 to 21 to allow greater accuracy of the soundwave to be recorded. This would increase the number of bits required to store the sound.

Task 25: Fill in this table to record the decimal value of each sample and convert it to the binary equivalent using a 5-bit number.

Sample   Decimal number   Binary number   Sample   Decimal number   Binary number
1                                         11
2                                         12
3                                         13
4                                         14
5                                         15
6                                         16
7                                         17
8                                         18
9                                         19
10                                        20

Calculate sound file sizes

Objective: Calculate the sound file size based on the sampling rate and the sample resolution.

To work out the file size of a sound clip you will need to know the following:

rate = sampling rate

res = sample resolution

secs = number of seconds

As long as you know these things you can make an approximate calculation of the file size.

rate x res x secs = File size

Let’s take the following example: A sound clip lasts 15 seconds, and a sample is taken every tenth of a second (a sample rate of 10 Hz). Each of those samples can be between 0 and 255 so this will require 8 bits per sample.

10 x 8 x 15 = 1200 bits

Let’s convert that into bytes: 1200 ÷ 8 = 150 bytes
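The same calculation can be sketched in Python. The function name `sound_size_bits` is an assumption made for this example; the variable names match the ones used in the formula above.

```python
def sound_size_bits(rate, res, secs):
    """Approximate file size in bits: sampling rate x sample resolution x seconds."""
    return rate * res * secs


clip_bits = sound_size_bits(rate=10, res=8, secs=15)
clip_bytes = clip_bits // 8  # 8 bits in every byte

print(clip_bits, "bits")
print(clip_bytes, "bytes")
```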

In reality, sample rates and sample resolutions are much higher.

Task 26: WITHOUT looking it up, have a guess at what the average sample rate and sample resolution would be to store these sounds digitally.

Description        Sample rate   Sample resolution
HDTV
Cinema Film
MP3 music track

Task 27: Work out the approximate file sizes with the following data:

Length of clip    Sample rate   Sample resolution   Bits   Shortest notation
60 seconds        48,000 Hz     24-bit
2 minutes         44,100 Hz     16-bit
1 hour 20 mins    96,000 Hz     24-bit

End of chapter recap

Task 28: Draw on the graph to show the digital samples that would be taken at the end of every tenth of a second. Fill in the table to show the samples that are taken and finally work out the smallest file size possible for this 1 second sound clip.

Sample   Decimal number   Binary number   Sample   Decimal number   Binary number
1                                         6
2                                         7
3                                         8
4                                         9
5                                         10

What would be the smallest file size for this sound file?

Data Compression

Objective: Understand why data may be compressed and that there are different ways to compress data.

We have looked at how binary is stored along with how images and sound files are interpreted and stored as binary. As we have seen these files can be very large, especially sound files and can take up a lot of memory. This can be slow for the computer to read and will take a long time to transmit across networks as the data needs to be broken down into “packets” of data to be sent to another computer and then reassembled at the other end so it can be understood again.

People are requiring more data to be sent across the networks and although technology is trying to keep up with demand most of us have experienced the buffering that happens when large files are slow to get through.

Compressing a file is when a file is encoded so it uses fewer bits than the original file format

When data is compressed it can be transmitted over a network more quickly than sending a large file in an uncompressed state.

There are two ways of doing this.

• Lossless
• Lossy

Lossless data compression gets rid of unnecessary data to represent data without losing any information. This process is reversible.

For instance, in this very simple image of a flag the colours saved are as follows

white, white, red, red, white, white
red, red, red, red, red, red
red, red, red, red, red, red
white, white, red, red, white, white

This could be stored as:

2 x white, 2 x red, 2 x white
6 x red
6 x red
2 x white, 2 x red, 2 x white

Or even…

2 x white, 2 x red, 2 x white, 12 x red, 2 x white, 2 x red, 2 x white

This code could easily then be reversed to reproduce the image again without any loss of the original data.
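To show that nothing is lost, here is a minimal Python sketch that squashes the flag's colours into runs and then expands them again. The function names `run_lengths` and `expand` are assumptions made for this example, and `groupby` from Python's standard `itertools` module is used to find the runs of repeated colours.

```python
from itertools import groupby

# The 24 colours of the flag, read row by row.
flag = (["white"] * 2 + ["red"] * 2 + ["white"] * 2
        + ["red"] * 12
        + ["white"] * 2 + ["red"] * 2 + ["white"] * 2)


def run_lengths(values):
    """Replace each run of repeated values with a (count, value) pair."""
    return [(len(list(group)), value) for value, group in groupby(values)]


def expand(runs):
    """Reverse the process: repeat each value the stated number of times."""
    return [value for count, value in runs for _ in range(count)]


runs = run_lengths(flag)
print(runs)
print(expand(runs) == flag)  # True means no data was lost
```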

Lossy data compression gets rid of the least essential data. For instance, when viewing an image, research has shown that the human eye can pick up differences in brightness more easily than it can pick up the subtler differences in colour, so some of the colour variants will be dropped, reducing the colour depth of the image without a noticeable difference to the viewer. This is an irreversible process as once they have been lost, those colours cannot be brought back to the image.

Huffman coding

Objective: Explain how data can be compressed using Huffman coding. Be able to interpret/create Huffman trees.

Huffman coding provides a simple, unambiguous code by studying the frequencies that certain characters appear in a message. It is a lossless data compression method used with PKZIP files, JPEG image files and PNG image files.

When you first start to learn Huffman coding it can seem like a bit of a magic trick. You may feel confused until you get to the end when you realise what you have just witnessed and are pleasantly astounded by the cleverness of it. I am going to help you by showing the final result first. It is ruining the magic trick but it will help you know what you are aiming for when performing the Huffman code algorithm.

We are going to try to crack this code using the following table:

Binary number Character 101  100  01  00  11 

Working from left to right through the binary sequence below and using the table above, try to work out the correct sequence of characters. You may be surprised to find out there is only one way of cracking this code.

101010011100011100

The only way of splitting up the above binary number, when working from left to right, using the options from the table above are as follows:

101 01 00 11 100 01 11 00

This gives us the sequence…        

You may notice that the three characters which crop up twice (,  and ) only use 2 binary bits and the others that crop up only once ( and ), use 3 binary bits to represent them. We have used a code that allows the most frequent characters to be represented using the least number of bits, but it still gives us a code that can only be cracked one way. Clever, isn’t it?

This is what Huffman coding allows us to do. However, instead of using the strange characters we have just been using, Huffman coding is more commonly used with ASCII or Unicode symbols or pixel colours on an image etc. The most frequent binary sequences in any data file can be given a new shortened binary number which makes the whole file take up fewer bits and therefore faster to transmit over a network.

You may be thinking, “But how can I possibly create a code that works that way?”. That is what Huffman coding does. It is an algorithm that allows you to create this binary code by following the steps as outlined on the next few pages.

Creating your Huffman code

We are going to have a look at how to compress a simple text file. Let’s think about a short sentence.

“hello world”

Using the example above there are 8 different characters (including the space) but there are 11 characters in the sentence. Using ASCII we would normally use 7-bits to represent each character which would mean we would need to use 77 bits (7 bits per character x 11 characters) for the message. We are going to use Huffman coding which will manage to compress this message down into only 32 bits.

Not all the letters in the above sentence occur with the same frequency, some letters crop up a lot and others only occur once or twice. The idea is to lower the number of bits used to encode the data that occurs most frequently. If we count the number of times each letter occurs and sort them so the least frequent is at the top of the table and the most frequent is at the bottom of the table, we end up with a table like this:

Character   Frequency
h           1
e           1
space       1
w           1
r           1
d           1
o           2
l           3

Step 1: Take the top two items in the table and draw nodes (round cornered rectangles as shown below) containing the frequency of the character and the character in round brackets. Join these together with a new node containing the total of the other two nodes. Add this value into your table in the correct sorted order and remove the top two values.

Your table should now look like this:

Character   Frequency
space       1
w           1
r           1
d           1
o           2
(node)      2
l           3

Step 2: Repeat step 1 until you only have 1 node left. Don’t forget to update your table by deleting the rows you have used and adding in the new node values in their correct sorted position in the table.

Your table should now look like this:

Character   Frequency
o           2
(node)      2
(node)      2
(node)      2
l           3

In this next example we have added the “o” node which has a frequency of two and linked this to the “2” node we created earlier as these are the top two values in the table.

Your table should now look like this:

Character   Frequency
(node)      2
(node)      2
l           3
(node)      4

As you can see below, we are joining together the two “2” nodes we created earlier. Don’t forget to update the table to show this.

Your table should now look like this:

Character   Frequency
l           3
(node)      4
(node)      4

In the example below, we have added the “l” node above the “4” node as the “l” node appears before the 4 node in the table and these are the top two values in the table.

Your table should now look like this:

Character   Frequency
(node)      4
(node)      7

The last thing we need to do is add up the last two nodes and then we can stop repeating steps 1 and 2. This is our final Huffman code tree.

Step 3: For each “branch” of the tree we need to add a 1 or a 0. For the top branch of each pair add a 1 and for the lower branch of each pair add a 0.

Step 4: We are going to use these to create a unique code for each character in our original table. For instance, to find the “h” character we start at the 11 node on the left of the diagram and take the top route (1) to get to the 7, then the lower route (0) to get to the 4, then the lower route again (0) to get to the 2 and finally the top route (1) to get to the “h”. This gives us the pathway 1001, which is the unique bit pattern for the “h” character in this Huffman code tree.
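If you are curious how a computer might carry out steps 1 to 4, here is a minimal Python sketch. It is not the exact procedure drawn in this workbook: it uses Python's standard `heapq` module to keep the table sorted, and when two values tie it may pick them in a different order, so individual characters may end up with different codes. The function name `huffman_codes` is an assumption made for this example. Whichever way the ties are broken, the total number of bits for the message comes out the same.

```python
import heapq
from collections import Counter
from itertools import count


def huffman_codes(message):
    """Build a Huffman tree for the message and return a code for each character."""
    order = count()  # tie-breaker so equal frequencies never compare nodes
    heap = [(freq, next(order), ch) for ch, freq in Counter(message).items()]
    heapq.heapify(heap)

    # Steps 1 and 2: keep joining the two smallest values into a new node.
    while len(heap) > 1:
        freq1, _, first = heapq.heappop(heap)
        freq2, _, second = heapq.heappop(heap)
        heapq.heappush(heap, (freq1 + freq2, next(order), (first, second)))

    # Steps 3 and 4: label one branch of each pair 1 and the other 0,
    # then read off the path to every character.
    codes = {}

    def walk(node, path):
        if isinstance(node, tuple):
            walk(node[0], path + "1")
            walk(node[1], path + "0")
        else:
            codes[node] = path or "0"

    walk(heap[0][2], "")
    return codes


codes = huffman_codes("hello world")
total_bits = sum(len(codes[ch]) for ch in "hello world")
print(codes)
print(total_bits, "bits")
```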

Starting with the left-hand side of the Huffman tree, follow the branches to work out the unique binary number for each of the original characters and store them in a table.

Character Binary code

l 11

o 101

h 1001

e 1000

space 011

w 010

r 001

d 000

Step 5: Using this new binary table you can create a binary sequence.

10011000111110101101010100111000

This only has 32 bits rather than the 77 the message would have taken if each character was saved with 7 bits using ASCII. It may look like this could be misinterpreted but if you move through the binary sequence and check against the table with their new binary codes there is only one way this can be split into the correct characters.

Take the beginning of the sequence. There are no characters in the table that match 1, 10, 100 or 10011 etc. There is only one letter that could possibly be at the beginning and that is 1001. Once the correct letter has been found and decoded the computer will look at the next collection of bits in the sequence and decode them.
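The decoding process described above can be sketched in Python. The codes are copied from the table for “hello world”; the function name `decode` is an assumption made for this example. The program reads one bit at a time and, as soon as the bits collected so far match a code in the table, it writes down that character and starts again.

```python
codes = {
    "l": "11", "o": "101", "h": "1001", "e": "1000",
    " ": "011", "w": "010", "r": "001", "d": "000",
}


def decode(bits, codes):
    """Work from left to right, matching bits against the code table."""
    lookup = {code: ch for ch, code in codes.items()}
    message = ""
    current = ""
    for bit in bits:
        current = current + bit
        if current in lookup:
            message = message + lookup[current]
            current = ""
    if current:
        raise ValueError("left-over bits that do not match any code")
    return message


print(decode("10011000111110101101010100111000", codes))
```

This works because no code in the table is the start of another code, so the first match is always the correct one.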

Task 29: Complete the table for the following Huffman tree

Character Frequency Bitmap Path Number of bits n 1 101 3 c 3 100 j 3 t r

Calculating the bits used with data compression

Objective: Be able to calculate the number of bits required to store a piece of data compressed using Huffman coding. Be able to calculate the number of bits required to store a piece of uncompressed data in ASCII.

To calculate the number of bits used in a piece of data compressed using Huffman coding we need to add a couple of columns to our table we have produced so it now looks as follows.

Character Binary code Frequency Number of bits

l 11 3 2

o 101 2 3

h 1001 1 4

e 1000 1 4

space 011 1 3

w 010 1 3

r 001 1 3

d 000 1 3

To work out the total number of bits multiply the frequency by the number of bits for each character and then add them all together.

(3 x 2) + (2 x 3) + (1 x 4) + (1 x 4) + (1 x 3) + (1 x 3) + (1 x 3) + (1 x 3) = 32 bits or 4 bytes.

To work out the number of bits required for ASCII, remember ASCII is stored using 7 bits so you simply need to multiply the number of characters (including spaces) by 7.

11 x 7 = 77 bits or approximately 10 bytes.

By performing the Huffman coding on this small piece of data we have managed to save 6 bytes of data in just 11 characters. This is quite impressive.
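Here is the same comparison as a short Python sketch. The table is copied from above; the variable names are assumptions made for this example, and `math.ceil` is used to round up to whole bytes because part of a byte still takes up a full byte.

```python
import math

# character: (binary code, frequency), copied from the table above
table = {
    "l": ("11", 3), "o": ("101", 2), "h": ("1001", 1), "e": ("1000", 1),
    "space": ("011", 1), "w": ("010", 1), "r": ("001", 1), "d": ("000", 1),
}

# Frequency x number of bits for each character, all added together.
huffman_bits = sum(len(code) * freq for code, freq in table.values())

# 7 bits for every character in the message when using ASCII.
characters = sum(freq for code, freq in table.values())
ascii_bits = characters * 7

huffman_bytes = math.ceil(huffman_bits / 8)
ascii_bytes = math.ceil(ascii_bits / 8)

print(huffman_bits, "bits =", huffman_bytes, "bytes")
print(ascii_bits, "bits =", ascii_bytes, "bytes")
print("Saved", ascii_bytes - huffman_bytes, "bytes")
```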

Task 30: Here an image has been expanded and pixelated and the colour of each pixel has been named to make it easier for you. Draw a Huffman tree for the image (you may want to scribble a table on a separate piece of paper to help you keep track of your table). Finally work out the file size as if it was originally saved with a 16-bit colour depth and then find out the file size once Huffman coding has been used to compress it.

Run Length Encoding (RLE)

Objective: Explain how data can be compressed using run length encoding (RLE). Represent data in RLE frequency/data pairs.

Run length encoding (RLE) is another form of file compression that is used.

Let’s look at a very simple 1-bit colour depth bitmap image.

The first row would be stored as 001100, the next row as 011110 and so on. If we combine all those rows into one long string, we would get:

001100011110110011110011011110001100

Instead of repeating so many bits we could include the frequency first and then the character. To help us we will first split the bit sequence so that each time we change from a 1 to a 0 and vice versa we add a break.

00 11 000 1111 0 11 00 1111 00 11 0 1111 000 11 00

Now we have split it into the different bit blocks we can state how many characters are in each block

2x0 2x1 3x0 4x1 1x0 2x1 2x0 4x1 2x0 2x1 1x0 4x1 3x0 2x1 2x0

We can combine these all together to make a new code by removing the “x” symbol and replacing it with a space.

2 0 2 1 3 0 4 1 1 0 2 1 2 0 4 1 2 0 2 1 1 0 4 1 3 0 2 1 2 0

From this it would then be possible to reproduce the original bit sequence and then create the image correctly, assuming you know the width and depth of the image you need to create.
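The splitting and counting can be sketched in Python. The function names `rle_encode` and `rle_decode` are assumptions made for this example, and `groupby` from the standard `itertools` module does the job of adding a break each time the bits change from a 1 to a 0 or back again.

```python
from itertools import groupby

bits = "001100011110110011110011011110001100"


def rle_encode(bits):
    """Return (frequency, bit) pairs, one for each run of identical bits."""
    return [(len(list(group)), bit) for bit, group in groupby(bits)]


def rle_decode(pairs):
    """Rebuild the original sequence by repeating each bit."""
    return "".join(bit * frequency for frequency, bit in pairs)


pairs = rle_encode(bits)
print(" ".join(str(frequency) + " " + bit for frequency, bit in pairs))
print(rle_decode(pairs) == bits)  # True means the process is reversible
```

Adding up all the frequencies should always give the number of bits in the original sequence, which is a useful way to check an answer.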

Task 31: Use RLE to show the frequency and value for the following bitmap sequences in each row.

Image sequence     Run Length encoding
1 1 1 0 0 1 1 1
1 1 1 1 0 1 0 0
0 0 1 1 0 0 0 0
1 1 1 1 1 0 1 1

Task 32: Uncompress the RLE to shade in the correct boxes to recreate the bitmap image for each row

Run Length encoding     Image sequence
3 1 5 0
1 1 3 0 3 1 1 0
2 0 5 1 1 0
3 0 2 1 2 0 1 1

End of chapter recap

Task 33: Answer the following questions in your own words

Question Your Answer 1. Why are files compressed?

2. Explain the difference between lossless and lossy data compression.

3. Explain how a text file can be compressed using Huffman coding

4. Explain how data can be compressed using run length encoding (RLE).

Answers

Task 1 Decimal Explanation Thousands Hundreds Tens Ones number One thousand, four 1 4 5 6 1456 hundred and fifty-six Nineteen ninety-nine 1 9 9 9 1999 Six thousand and 6 0 2 0 6020 twenty Task 2 Binary Decimal Explanation Eight Four Two One number value 1 x 8, 0 x 4, 1 x 2 and 1 x 1 0 1 1 1011 11 1 1 x 8, 1 x 4, 0 x 2 and 0 x 1 1 0 0 1100 12 1 1 x 2 and 1 x 1 0 0 1 1 0011 3

Task 3 Binary number Hidden Word 1011 1110 1110 1111 BEEF 1011 1110 1101 BED 1101 1110 1010 1111 DEAF 1010 1101 1101 ADD 1010 1100 1110 ACE 1011 1110 1010 1101 BEAD 1111 1110 1110 1101 FEED 1011 1010 1101 BAD 1100 1010 1011 CAB Binary Decimal Task 4 156 64 32 16 8 4 2 1 number value 110 1 1 0 6 11001 1 1 0 0 1 25 10101 1 0 1 0 1 21 100111 1 0 0 1 1 1 39 1011000 1 0 1 1 0 0 0 88 1100001 1 1 0 0 0 0 1 97 10101100 1 0 1 0 1 1 0 0 172 11111111 1 1 1 1 1 1 1 1 255

Task 5

Binary number  Decimal value
101            5
1101           13
10111          23
110010         50
1111100        124
1001001        73
10101110       174
11011001       217

Task 6

Decimal value  128  64   32   16   8    4    2    1    Binary number
27                            1    1    0    1    1    11011
33                       1    0    0    0    0    1    100001
52                       1    1    0    1    0    0    110100
63                       1    1    1    1    1    1    111111
85                  1    0    1    0    1    0    1    1010101
207            1    1    0    0    1    1    1    1    11001111

Task 7

Original number  Converted number

1110₂            14₁₀
10₁₀             1010₂
101₁₀            1100101₂
11001110₂        206₁₀
110₁₀            1101110₂
11000011₂        195₁₀

Task 8

Binary       Hexadecimal
1011 0011₂   B3₁₆
1101 0111₂   D7₁₆
11 0110₂     36₁₆
11101111₂    EF₁₆
1110111₂     77₁₆
11111000₂    F8₁₆
10111₂       17₁₆
10111010₂    BA₁₆

Task 9

Hexadecimal  Binary      Decimal
1A₁₆         11010₂      26₁₀
B8₁₆         10111000₂   184₁₀
F9₁₆         11111001₂   249₁₀
4E₁₆         1001110₂    78₁₀
BC₁₆         10111100₂   188₁₀
52₁₆         1010010₂    82₁₀
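If you want to check conversion answers like those in Tasks 7 to 9 for yourself, Python's built-in `int()` and `format()` functions can convert between the three number bases. The helper names below (`from_base`, `to_binary`, `to_hex`) are invented for this sketch and are not part of the workbook.

```python
def from_base(text, base):
    """Read a string of digits written in the given base (2, 10 or 16)."""
    return int(text, base)


def to_binary(value):
    """Write a whole number in binary, without leading zeros."""
    return format(value, "b")


def to_hex(value):
    """Write a whole number in hexadecimal using capital letters."""
    return format(value, "X")


# Re-create the rows of the Task 9 answer table.
for hex_text in ["1A", "B8", "F9", "4E", "BC", "52"]:
    value = from_base(hex_text, 16)
    print(hex_text, to_binary(value), value)
```

The first line printed is `1A 11010 26`, matching the first row of the Task 9 answers.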

Task 10

Decimal  Binary      Hexadecimal
12₁₀     1100₂       C₁₆
49₁₀     110001₂     31₁₆
78₁₀     1001110₂    4E₁₆
95₁₀     1011111₂    5F₁₆
116₁₀    1110100₂    74₁₆
255₁₀    11111111₂   FF₁₆

Task 11

Decimal  Binary      Hexadecimal
5₁₀      101₂        5₁₆
137₁₀    10001001₂   89₁₆
91₁₀     1011011₂    5B₁₆
173₁₀    10101101₂   AD₁₆
27₁₀     11011₂      1B₁₆
231₁₀    11100111₂   E7₁₆
227₁₀    11100011₂   E3₁₆
163₁₀    10100011₂   A3₁₆
209₁₀    11010001₂   D1₁₆
66₁₀     1000010₂    42₁₆
124₁₀    1111100₂    7C₁₆
88₁₀     1011000₂    58₁₆

Task 12

Task 13

  1 1 0 0
+ 0 1 1 0
---------
1 0 0 1 0
  1            (carries)

  1 0 0 1
+ 1 1 1 1
---------
1 1 0 0 0
  1 1 1        (carries)

  1 0 1 1 1 0
+     1 1 0 1
-------------
  1 1 1 0 1 1
    1 1        (carries)

1 1 0 0 0 1 1 1 0 0 0 1 1 0 0 1 0 1 0 + 0 1 1 0 1 1 1 0 + 1 1 1 0 0 + 0 1 1 1 1 0 0 1 0 1 1 0 1 0 1 1 1 1 1 0 0 1 1 1 1 1 0 1

11110000 + 10011 = 1101011 + 10111 + 1111 + 111111 + 11111111 1101001 = = 1 1 1 1 0 0 0 0 + 1 0 0 1 1 1 1 0 1 0 1 1 1 1 1 1 1 0 0 0 0 0 0 1 1 1 0 1 1 1 1 1 1 1 1 1 + 1 1 0 1 0 0 1 + 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 0 1 0 1 1 1 0 0 1 1 0 1 1 1 1 1 1 1 1 0 1 0 1 0 1 1 1

Task 14

Original  Original binary number  Instruction  New binary number  New
24        0 0 0 1 1 0 0 0         *2           0 0 1 1 0 0 0 0    48
34        0 0 1 0 0 0 1 0         /2           0 0 0 1 0 0 0 1    17
50        0 0 1 1 0 0 1 0         *4           1 1 0 0 1 0 0 0    200
60        0 0 1 1 1 1 0 0         /4           0 0 0 0 1 1 1 1    15
38        0 0 1 0 0 1 1 0         *8           0 0 1 1 0 0 0 0    48
76        0 1 0 0 1 1 0 0         /8           0 0 0 0 1 0 0 1    9
9         0 0 0 0 1 0 0 1         *16          1 0 0 1 0 0 0 0    144
168       1 0 1 0 1 0 0 0         /16          0 0 0 0 1 0 1 0    10

Task 15

Original message: [STX]LOL[ETX]
Converted into binary: 000 0010 100 1100 100 1111 100 1100 000 0011

Original message: 1, 2 and 3, each on its own line
Converted into binary: 000 0010 011 0001 000 1101 011 0010 000 1101 011 0011 000 0011

Original message: Computers!
Converted into binary: 000 0010 100 0011 110 1111 110 1101 111 0000 111 0101 111 0100 110 0101 111 0010 111 0011 010 0001 000 0011

Task 16

Character  Decimal value  Character  Decimal value
J          74             e          101
T          84             m          109
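Answers like those for Tasks 15 and 16 can be checked with Python's built-in `ord()` function, which gives the decimal code of a character. The sketch below is only an illustration: the function name `to_7bit_ascii` is made up, and it assumes the message starts with the STX control code (2) and ends with ETX (3), as in the Task 15 answers.

```python
STX = chr(2)   # "start of text" control character
ETX = chr(3)   # "end of text" control character


def to_7bit_ascii(message):
    """Return the 7-bit binary code for every character in the message."""
    codes = []
    for character in message:
        code = ord(character)
        if code > 127:
            raise ValueError("not a 7-bit ASCII character: " + repr(character))
        # "07b" means: binary, padded with zeros to 7 digits.
        codes.append(format(code, "07b"))
    return " ".join(codes)


print(to_7bit_ascii(STX + "LOL" + ETX))
# 0000010 1001100 1001111 1001100 0000011

print(ord("J"), ord("e"), ord("T"), ord("m"))
# 74 101 84 109
```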

Task 17

1. 7-bit ASCII allows for 128 different characters (codes 0 to 127), each of which can be represented with a single byte of data. Unicode allows for far more characters.
2. It allows more characters to be used, so it can incorporate alphabets from around the world as well as symbols used in maths and other specialist areas.

3. 0041₁₆
4. ④

Task 18

Item                            Resolution (written as DPI)
The average computer monitor    72 – 96
iPhone 8 screen                 326
An image in a glossy magazine   300

Task 19

In order to use a greater variety of colours for each pixel, more bits would be needed to store these larger numbers. Therefore, as more bits are required for each pixel, the file size would increase.

Task 20

The image size is determined by the number of pixels for the width and height of the image. If more pixels are used to give a greater clarity to the image, this would increase the file size of the image.

Task 21

Width   Height  Colour Depth  Number of bits   Shorthand
1,280   860     24 bits       26,419,200       3.3 Mb
2,500   2,000   8 bits        40,000,000       5 Mb
150     475     16 bits       1,140,000        142.5 Kb
15,000  20,000  64 bits       19,200,000,000   2.4 Gb

Task 22

Task 23 0110100110010110

Task 24

1. A single picture element, shown as a small square which can contain a single colour. These are combined to make one large image.
2. In rows and columns.
3. The image size refers to how many pixels make up the width and height of an image, but the file size is a calculation to find out how much storage is needed to save the image (width x height x colour depth gives the number of bits).
4. This is the number of dots per inch (DPI). In other words, how many pixels fit along one inch of an image. The higher the DPI, the higher quality the image will be.
5. To save more colours, more bits are needed for each pixel to store the larger colour values. If more bits are needed for each pixel and lots of pixels are required, then the file size will be larger than if there is a small number of pixels or not many different colours need to be saved.
6. 630 Kb
7.

Task 25

Sample  Decimal number  Binary number  Sample  Decimal number  Binary number
1       11              01011          11      9               01001
2       9               01001          12      17              10001
3       11              01011          13      8               01000
4       18              10010          14      10              01010
5       9               01001          15      3               00011
6       8               01000          16      18              10010
7       13              01101          17      17              10001
8       11              01011          18      14              01110
9       3               00011          19      12              01100
10      4               00100          20      7               00111

Task 26

Description      Sample rate  Sample resolution
HDTV             48,000 Hz    24-bit
Cinema Film      96,000 Hz    24-bit
MP3 music track  44,100 Hz    16-bit

Task 27

Length of clip  Sample rate  Sample resolution  Bits            Shortest notation
60 seconds      48,000 Hz    24-bit             69,120,000      8.6 Mb
2 minutes       44,100 Hz    16-bit             84,672,000      10.6 Mb
1 hour 20 mins  96,000 Hz    24-bit             11,059,200,000  1.4 Gb
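The number of bits in the Task 27 answers comes from multiplying the length of the clip in seconds by the sample rate and the sample resolution. A minimal sketch of that calculation is shown below; the function name `sound_file_size_bits` is invented for this example, and one byte is taken to be 8 bits.

```python
def sound_file_size_bits(seconds, sample_rate_hz, sample_resolution_bits):
    """File size in bits = length x sample rate x sample resolution."""
    return seconds * sample_rate_hz * sample_resolution_bits


# (description, length in seconds, sample rate in Hz, sample resolution in bits)
clips = [
    ("60 seconds", 60, 48_000, 24),
    ("2 minutes", 2 * 60, 44_100, 16),
    ("1 hour 20 mins", 80 * 60, 96_000, 24),
]

for description, seconds, rate, resolution in clips:
    bits = sound_file_size_bits(seconds, rate, resolution)
    size_in_bytes = bits // 8
    print(description, bits, "bits =", size_in_bytes, "bytes")
```

Dividing the number of bytes by 1,000 gives kilobytes, by 1,000,000 gives megabytes, and by 1,000,000,000 gives gigabytes.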

Task 28

Sample  Decimal number  Binary number  Sample  Decimal number  Binary number
1       5               0101           6       9               1001
2       2               0010           7       3               0011
3       10              1010           8       11              1011
4       13              1101           9       1               0001
5       5               0101           10      8               1000

10 x 8 x 1 = 80 bits which is 10 bytes

Task 29

Character  Frequency  Bitmap Path  Number of bits
n          1          101          3
c          3          100          3
j          3          11           2
t          5          01           2
r          5          00           2

Task 30

Colour  Frequency  Bit Pattern  Bits used
Black   3          1011         4
Blue    9          1010         4
Yellow  18         100          3
White   19         01           2
Purple  22         00           2
Red     27         11           2

7 (width) x 14 (height) x 16 (colour depth) = 1568 bits (without Huffman code), which is 196 bytes

(3 x 4) + (9 x 4) + (18 x 3) + (19 x 2) + (22 x 2) + (27 x 2) = 238 bits (using Huffman Code)
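The two Task 30 calculations can be checked with a short Python sketch. The dictionary name `huffman_table` is made up for this example; the frequencies and bit patterns are the ones in the Task 30 answer table, and the colour depth of 16 bits is taken from the calculation above.

```python
# colour: (frequency, bit pattern) copied from the Task 30 answer table
huffman_table = {
    "Black": (3, "1011"),
    "Blue": (9, "1010"),
    "Yellow": (18, "100"),
    "White": (19, "01"),
    "Purple": (22, "00"),
    "Red": (27, "11"),
}

colour_depth = 16

# Every pixel is counted once, so the frequencies add up to width x height.
pixels = sum(frequency for frequency, _ in huffman_table.values())

bits_without_huffman = pixels * colour_depth
bits_with_huffman = sum(
    frequency * len(pattern) for frequency, pattern in huffman_table.values()
)

print(pixels)                                     # 98
print(bits_without_huffman)                       # 1568
print(bits_with_huffman)                          # 238
print(bits_without_huffman - bits_with_huffman)   # 1330 bits saved
```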

Task 31

Image sequence     Run Length encoding
1 1 1 0 0 1 1 1    3 1 2 0 3 1
1 1 1 1 0 1 0 0    4 1 1 0 1 1 2 0
0 0 1 1 0 0 0 0    2 0 2 1 4 0
1 1 1 1 1 0 1 1    5 1 1 0 2 1

Task 32

Run Length encoding    Image sequence
3 1 5 0                1 1 1 0 0 0 0 0
1 1 3 0 3 1 1 0        1 0 0 0 1 1 1 0
2 0 5 1 1 0            0 0 1 1 1 1 1 0
3 0 2 1 2 0 1 1        0 0 0 1 1 0 0 1

Task 33

1. Files are compressed so they take up less space and can be transferred across networks faster.
2. Lossless data compression means the data can be recreated exactly as the original file. Lossy data compression means some of the data is permanently removed or changed, so the file cannot be recreated exactly the same as the original.
3. Step 1: Find out the frequency of each character and put them into a table with the least frequent at the top.
   Step 2: Take the top two entries and draw nodes including the frequency and the character.
   Step 3: Draw another node to join them showing the total of the two nodes.
   Step 4: Add the node total to the sorted table in the correct place.
   Step 5: Keep repeating steps 2 to 4 until all the characters and nodes have been joined.
   Step 6: Add a 1 to the top branch of each pair and 0 to the bottom branch of each pair.
   Step 7: Follow the branches of the tree to discover the unique binary code for each character.
   Step 8: Work out the new binary sequence using the new codes to represent each character.
4. Each run of consecutive identical bit values is stored as a single bit value together with a count of how many times it is repeated.
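The Huffman coding steps in answer 3 can be sketched in Python using the standard `heapq` module to keep the table sorted. This is a sketch under assumptions, not the only way to do it: the function name `huffman_codes` is invented, ties between equal frequencies are broken by the order the entries were added, and the first (less frequent) entry of each pair is treated as the "top" branch and given a 1.

```python
import heapq
from itertools import count


def huffman_codes(frequencies):
    """Build a Huffman code from a {character: frequency} dictionary."""
    order = count()  # tie-breaker so equal frequencies keep their order
    # Step 1: a table that always gives back the least frequent entry first.
    table = [(freq, next(order), char) for char, freq in frequencies.items()]
    heapq.heapify(table)

    if len(table) == 1:
        # A single character still needs a one-bit code.
        return {table[0][2]: "0"}

    # Steps 2 to 5: keep joining the two least frequent entries.
    while len(table) > 1:
        freq1, _, top = heapq.heappop(table)
        freq2, _, bottom = heapq.heappop(table)
        heapq.heappush(table, (freq1 + freq2, next(order), (top, bottom)))

    # Steps 6 and 7: follow the branches, 1 for top and 0 for bottom.
    codes = {}

    def follow(node, path):
        if isinstance(node, tuple):
            follow(node[0], path + "1")
            follow(node[1], path + "0")
        else:
            codes[node] = path

    follow(table[0][2], "")
    return codes


# Frequencies taken from the Task 29 answer table.
codes = huffman_codes({"n": 1, "c": 3, "j": 3, "t": 5, "r": 5})
print(codes)
```

With these frequencies the sketch produces codes of 3 bits for n and c and 2 bits for j, t and r, the same lengths as the Task 29 answers. A different tie-breaking rule could swap which characters receive which pattern without changing the total number of bits.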
