Computational Complexity

Abhishek De, Chennai Mathematical Institute

What is complexity?

Before that, we need to know the difference between:

• Problem
• Problem Instance
• Algorithm

PROBLEM: A problem is an unsolved question proposed to us. It contains an abstract description, probably of a real-life situation.

Examples:

• Given today's humidity, will it rain today?
• What is the minimum time required to reach from my house to ISI during rush-hour?
• Is 2004 a leap year?
• What is the average score of Gambhir in IPL-7, given the list of his scores?

Types of problems

• Decision Problems: return a Boolean value (True or False).
• Functional Problems: return a single output which could be of any type; tougher to study than decision problems.
• Search Problems: search for a structure 'y' in an object 'x', and also produce a witness if the search is successful. (Sometimes this is treated as a separate problem.)
• Optimization Problems: given a set of parameters, provide the best possible solution, hopefully without exhausting all cases.

PROBLEM INSTANCE: The instance is a particular input to the problem, and the solution is the output corresponding to the given input. For example, consider the problem of primality testing: the instance is a number (say, 15) and the solution is "no". Thus the input string for a problem is referred to as a problem instance, and should not be confused with the problem itself, because a particular instance cannot in general shed light on the general approach to solving the problem.
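To make the distinction concrete in code, here is a minimal sketch of a primality-testing algorithm (simple trial division; the name is_prime is ours, for illustration only). The function is the general method; a particular call such as is_prime(15) is one instance, whose solution is "no".

#include <iostream>

// Decides primality by trial division, testing divisors up to sqrt(n).
bool is_prime(long long n) {
    if (n < 2) return false;
    for (long long d = 2; d * d <= n; ++d)
        if (n % d == 0) return false;   // found a non-trivial divisor
    return true;
}

int main() {
    std::cout << std::boolalpha << is_prime(15) << "\n";   // false: 15 = 3 * 5
}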

Note that a problem can be viewed as an infinite collection of instances together with a solution for every instance, i.e. the problem is solved if every (instance, solution) pair is known.

ALGORITHM: An algorithm is an effective step-by-step method, expressed as a finite list of well-defined instructions, to solve a problem. Loosely speaking, it is the general approach to solving the problem which works for any instance of the problem.

For example: Euclid's Algorithm to calculate the g.c.d. of two numbers is one of the oldest algorithms still in common use.
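Here is a minimal sketch of it in C++ (the iterative form, using the fact that gcd(a, b) = gcd(b, a mod b) and gcd(a, 0) = a):

#include <iostream>

// Euclid's algorithm for the greatest common divisor.
long long gcd(long long a, long long b) {
    while (b != 0) {
        long long r = a % b;   // remainder step: gcd(a, b) = gcd(b, a mod b)
        a = b;
        b = r;
    }
    return a;                  // gcd(a, 0) = a
}

int main() {
    std::cout << gcd(48, 36) << "\n";   // prints 12
}

Note that it works for any instance (any pair of numbers), which is exactly what makes it an algorithm rather than a single solution.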

Types of Algorithms

• Deterministic Algorithms: the same set of inputs renders the same set of outputs; the probability of success is 1.
• Randomized Algorithms: use randomness to guide their behaviour; the same set of inputs may give different sets of outputs; the probability of success is not trivially 1 (a sketch follows).
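As an illustration of the randomized kind (our own example, not from the notes above), here is a sketch of the Fermat primality test: it uses randomly chosen bases, so different runs may behave differently, and a composite number can occasionally fool it, which is why its success probability is not trivially 1.

#include <cstdlib>
#include <ctime>
#include <iostream>

// (base^exp) mod m by repeated squaring; assumes m < 2^31 so that the
// intermediate product base * base fits in a long long.
long long mod_pow(long long base, long long exp, long long m) {
    long long result = 1 % m;
    base %= m;
    while (exp > 0) {
        if (exp & 1) result = result * base % m;
        base = base * base % m;
        exp >>= 1;
    }
    return result;
}

// Fermat test: if n is prime, then a^(n-1) = 1 (mod n) for every base a.
// Some composites pass for some bases, so the answer "probably prime"
// can be wrong with small probability.
bool probably_prime(long long n, int trials = 10) {
    if (n < 4) return n == 2 || n == 3;
    for (int i = 0; i < trials; ++i) {
        long long a = 2 + std::rand() % (n - 3);        // random base in [2, n-2]
        if (mod_pow(a, n - 1, n) != 1) return false;    // definitely composite
    }
    return true;                                        // probably prime
}

int main() {
    std::srand((unsigned)std::time(nullptr));           // vary the random bases per run
    std::cout << probably_prime(1000003) << "\n";       // 1000003 happens to be prime
}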

Finally, what is complexity?

Suppose you have two algorithms A and B to solve a problem. You want to use the better one. What is the meaning of a "better" algorithm?

We say A is better than B if the complexity of A is less than the complexity of B.

Definition (Complexity): The computational complexity of an algorithm is the number of steps required to complete the task, as a function of the size of the input parameter; it quantifies the amount of resources needed, such as time and storage.

Input parameter

The input parameter is the parameter in terms of which the complexity is measured; the complexity is a function of the input parameter alone.

Suppose you are sorting an array. The number of steps required will depend solely on the number of elements in the array, so the input parameter is the size of the array.

Again, for number-theoretic algorithms, the input parameter is a measure of how big the number is: the number of bits occupied by the number (roughly log2 of its value).

Complexity Models

• Temporal – the time required to complete the task.
• Spatial – the space/memory required to complete the task.
• Other complexity measures are also used, such as the amount of communication, the number of gates in a circuit, and the number of processors.

Thus one of the roles of complexity theory is to determine the practical limits on what computers can and cannot do. For example, it is easy to estimate using complexity that even with the latest technology it is impossible to perform 2^80 steps of computation.
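A back-of-the-envelope calculation makes this vivid (assuming, purely for illustration, a machine performing 10^9 basic operations per second): 2^80 is about 1.2 x 10^24, so 2^80 steps would take about 1.2 x 10^15 seconds, which is roughly 38 million years.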

(Here we will only discuss time complexity.)

Program running time – Why?

When is the running time (waiting time for user) noticeable/important?

• web search
• database search
• real-time systems with time constraints
• gene matching (DNA searches)

Performance is NOT Complexity

• Since we have now come into the realm of time complexity, there is an a priori notion that the faster a program is, the less complexity it has. Well, not always.
• Running time depends on a lot of things like memory, disk space, the machine, the compiler, etc. This is the performance of the algorithm.
• Complexity should be independent of CPU time and implementation.
• Thus complexity affects performance, but not the other way around.

The Big-Oh notation

• We need a way to compare the theoretical running times of two algorithms, just as we compare the growth of functions.
• Let f(x) and g(x) be the running times of two algorithms, where x is the input parameter.
• f(x) = O(g(x)) if there exist constants c and k such that f(x) <= c*g(x) for all x >= k.

What does this mean?

• It means that asymptotically f is less than or equal to g, i.e. as x tends to infinity, f(x) does not grow faster than g(x).
• One thing to notice here is that we don't care about small values of x: f(x) may exceed g(x) there and still be asymptotically smaller.

GROWTH RATE: a function of the input parameter which describes the rate of growth of an algorithm or any function.
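As a worked example, take f(x) = 3x + 5 and g(x) = x. Then f(x) = O(g(x)) with c = 4 and k = 5, because 3x + 5 <= 4x exactly when x >= 5. For x < 5 the inequality fails, but Big-Oh ignores those small values.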

Some Formulae to be kept handy:

• O(cn) = O(n) for any constant c
• O(mn) = O(m)O(n)
• O(m+n) = O(max(m,n)) = max(O(m), O(n))

For example, 5n^2 + 3n + 7 = O(n^2): the constants are dropped and the largest term dominates the sum.

Let's get our hands dirty… How do we actually calculate the time complexity of a given program in terms of O(.)?

A program is like a sequence of statements:

statement 1
statement 2
…
statement k

Now the complexity of the program is the sum of the complexities of each statement. For that we need to understand first what the basic operations are.

Basic Operations

Basic operations are a set of traditional computations which are regarded to be constant-time, i.e. O(1). Typically, they are easy to implement in hardware. Examples:

• one-bit addition (you've already seen adders), multiplication, and similar arithmetic operations
• assignments and conditions [x = 0 and x == 0]
• reads and writes of primitive variables

• Now say you have a loop:

int X = 0;
for(int i = 1; i <= n; i++){
    X++;
}

Let us deconstruct the situation. There are two assignments [X=0 and i=1], n+1 condition checks and 2n incrementations [n times for i and n times for X]. So,

complexity = 2 O(1) + (n+1) O(1) + 2n O(1) = O(n)

• Similarly, for a nested loop (2 levels), we have a complexity of O(n^2), as sketched below.
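A sketch of the two-level case (the variable names mirror the loop above):

// The body runs n * n times, so the complexity is O(n^2).
long long nested(int n) {
    long long X = 0;
    for (int i = 1; i <= n; i++)        // outer loop: n passes
        for (int j = 1; j <= n; j++)    // inner loop: n passes per outer pass
            X++;                        // executed n * n times in total
    return X;
}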

But does this really work? Isn't O(n) = O(n^2) by definition? We need a bound on both sides. So we introduce the symbol Θ: if f(x) = Θ(g(x)), then f(x) = O(g(x)) and g(x) = O(f(x)). So it is an envelope on both sides.

However, probably due to historic reasons, O(.) and Θ(.) are used interchangeably. We will thus stick to O(.), with the catch that O(n) ≠ O(n^2).

If O(.) denotes a number of steps, how can there be something like O(log n)? Consider the following program:

while(n > 0){
    n = n/2;
}

Let the running time be T(n). Then we have

T(n) = T(n/2) + O(1)

Take n = 2^k. So,

T(2^k)     = T(2^(k-1)) + O(1)
T(2^(k-1)) = T(2^(k-2)) + O(1)
…            (k equations)
T(2)       = T(1) + O(1)

Adding them up, T(2^k) = O(k). So T(n) = O(log n) (log in base 2).
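Binary search is another standard example with the same recurrence: each comparison halves the search range, so T(n) = T(n/2) + O(1) = O(log n). A sketch, assuming a sorted array:

// Returns the index of key in the sorted array a[0..n-1], or -1 if absent.
int binary_search(const int a[], int n, int key) {
    int lo = 0, hi = n - 1;
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;    // midpoint, written to avoid overflow
        if (a[mid] == key) return mid;
        if (a[mid] < key) lo = mid + 1;  // discard the lower half
        else hi = mid - 1;               // discard the upper half
    }
    return -1;
}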

Master Theorem

Not all recursions can be solved by such algebraic manipulation. There is a shortcut:

If the time-complexity of the function is of the form

T(n) = a T(n/b) + O(n^d)

then the Big-Oh of the time-complexity is:

• If d > log_b a, then T(n) = O(n^d)
• If d = log_b a, then T(n) = O(n^d log n)
• If d < log_b a, then T(n) = O(n^(log_b a))

This is known as the Master Theorem.
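For example, merge sort satisfies T(n) = 2T(n/2) + O(n), so a = 2, b = 2, d = 1; here log_b a = 1 = d, giving T(n) = O(n log n). The halving loop above has a = 1, b = 2, d = 0, so log_b a = 0 = d and T(n) = O(log n), matching the direct calculation.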

Complexity Classes

• O(1) – Constant Time
• O(log n) – Logarithmic Time
• O(n) – Linear Time
• O(n^c) – Polynomial Time, for any constant c
• ------
• O(c^n) – Exponential Time, for any constant c

Compare running-time growth rates: a quick reference table helps you see whether your algorithm will be feasible.
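A small table of approximate values:

n        log2 n    n log2 n    n^2       2^n
10       ~3.3      ~33         100       1024
100      ~6.6      ~664        10^4      ~1.3 x 10^30
1000     ~10       ~10^4       10^6      ~1.1 x 10^301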

• Worst-case Running Time: the behavior of the algorithm with respect to the worst possible case of the input instance. The worst-case running time of an algorithm is an upper bound on the running time for any input. Knowing it gives us a guarantee that the algorithm will never take any longer.

• Average-case Running Time: the expected behavior when the input is randomly drawn from a given distribution. The average-case running time of an algorithm is an estimate of the running time for an "average" input. Computation of the average-case running time entails knowing all possible input sequences, the probability distribution of occurrence of these sequences, and the running times for the individual sequences. Often it is assumed that all inputs of a given size are equally likely.

• Amortized Running Time: here the time required to perform a sequence of (related) operations is averaged over all the operations performed. Amortized analysis can be used to show that the average cost of an operation is small, if one averages over a sequence of operations, even though a single operation might be expensive. Amortized analysis guarantees the average performance of each operation in the worst case. (A sketch follows after this list.)

• Best-case Running Time: the term best-case performance is used in computer science to describe an algorithm's behavior under optimal conditions.
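A classic illustration of amortized analysis (a sketch of a standard example, not taken from these notes) is appending to a dynamic array that doubles its capacity when full: a single append may trigger an O(size) copy, yet n appends cost O(n) in total, so each append is amortized O(1).

// A minimal dynamic array of ints that doubles its capacity when full.
// Total copying over n pushes is at most 1 + 2 + 4 + ... + n < 2n,
// so each push is amortized O(1) even though one push can cost O(size).
struct DynArray {
    int *data = nullptr;
    int size = 0, capacity = 0;

    void push(int x) {
        if (size == capacity) {                 // full: grow geometrically
            capacity = capacity ? 2 * capacity : 1;
            int *bigger = new int[capacity];
            for (int i = 0; i < size; ++i)      // O(size) copy, but rare
                bigger[i] = data[i];
            delete[] data;
            data = bigger;
        }
        data[size++] = x;                       // the common O(1) case
    }
    ~DynArray() { delete[] data; }
};

Growing by a constant amount instead of doubling would make the total copying cost quadratic; the geometric growth is what makes the average cost per operation constant.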

Role of constants

Constants are usually ignored in Big-Oh notation. Thus, mathematically, O(n) = O(100000000n). But what is the role of constants, and can we really ignore them? It is important to notice that constants are independent of the input parameter n. So what do they depend on?

They depend on:

• memory access speed
• CPU/processor speed
• the number of processors (if multi-threading or multiple/parallel processes are used)
• compiler optimization (~20%)

Thus constants matter if the input size is small. For example, 100000000n exceeds n^2 for all n below 10^8, so for such inputs the algorithm with the larger Big-Oh can actually be faster.

Space-Time tradeoff

A space-time (or time-memory) tradeoff is a situation where memory use can be reduced at the cost of slower program execution (and, conversely, computation time can be reduced at the cost of increased memory use). As the relative costs of CPU cycles, RAM space, and hard drive space change (hard drive space has for some time been getting cheaper at a much faster rate than other components of computers), the appropriate choices for space-time tradeoffs have changed radically. Often, by exploiting a space-time tradeoff, a program can be made to run much faster; a sketch follows.
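A classic illustration (a sketch of a standard example, not from the text above) is computing Fibonacci numbers: the naive recursion uses almost no extra memory but takes exponential time, while storing previously computed values (memoization) spends O(n) memory to bring the time down to O(n).

#include <iostream>
#include <vector>

// Naive recursion: little extra memory, but exponential time, because the
// same subproblems are recomputed over and over.
long long fib_slow(int n) {
    return n < 2 ? n : fib_slow(n - 1) + fib_slow(n - 2);
}

// Memoized version: O(n) extra memory buys O(n) time.
long long fib_fast(int n, std::vector<long long> &memo) {
    if (n < 2) return n;
    if (memo[n] != -1) return memo[n];                 // answer already stored
    return memo[n] = fib_fast(n - 1, memo) + fib_fast(n - 2, memo);
}

int main() {
    int n = 50;
    std::vector<long long> memo(n + 1, -1);            // -1 marks "not computed yet"
    std::cout << fib_fast(n, memo) << "\n";            // instant; fib_slow(50) would crawl
}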