R Programming

Module Overview
This module comprises R programming basics and the application of several statistical techniques using R. It aims to provide exposure to Statistical Analysis, Hypothesis Testing, Regression and Correlation using the R programming language.

Learning Objectives
The objective of this module is to have students exercise the fundamentals of statistical analysis in the R environment. They will be able to analyse data for the purpose of exploration using Descriptive and Inferential Statistics. Students will understand Probability and Sampling Distributions and learn the creative application of Linear Regression in a multivariate context for predictive purposes.

Learning Outcomes
After the successful completion of this module, students will be able to:
• Install, code and use the R programming language in the RStudio IDE to perform basic tasks on Vectors, Matrices and Data Frames.
• Describe key terminologies, concepts and techniques employed in Statistical Analysis.
• Define, calculate and implement Probability and Probability Distributions to solve a wide variety of problems.
• Conduct and interpret a variety of Hypothesis Tests to aid Decision Making.
• Understand, analyse and interpret Correlation and Regression to examine the underlying relationships between different variables.

Unit I Introduction to R Programming
R and RStudio, Logical Arguments, Missing Values, Characters, Factors and Numeric, Help in R, Vector to Matrix, Matrix Access, Data Frames, Data Frame Access, Basic Data Manipulation Techniques, Usage of various apply functions: apply, lapply, sapply and tapply, Outliers treatment.

Unit II Descriptive Statistics
Types of Data, Nominal, Ordinal, Scale and Ratio, Measures of Central Tendency, Mean, Mode and Median, Bar Chart, Pie Chart and Box Plot, Measures of Variability, Range, Inter-Quartile Range, Standard Deviation, Skewness and Kurtosis, Histogram, Stem and Leaf Diagram, Standard Error of Mean and Confidence Intervals.
Unit III Probability, Probability & Sampling Distribution
Experiment, Sample Space and Events, Classical Probability, General Rules of Addition, Conditional Probability, General Rules for Multiplication, Independent Events, Bayes’ Theorem, Discrete Probability Distributions: Binomial, Poisson, Continuous Probability Distributions: Normal Distribution & t-distribution, Sampling Distribution and Central Limit Theorem.

Unit IV Statistical Inference and Hypothesis Testing
Population and Sample, Null and Alternate Hypothesis, Level of Significance, Type I and Type II Errors, One Sample t Test, Confidence Intervals, One Sample Proportion Test, Paired Sample t Test, Independent Samples t Test, Two Sample Proportion Tests, One Way Analysis of Variance and Chi Square Test.

Unit V Correlation and Regression
Analysis of Relationship, Positive and Negative Correlation, Perfect Correlation, Correlation Matrix, Scatter Plots, Simple Linear Regression, R Square, Adjusted R Square, Testing of Slope, Standard Error of Estimate, Overall Model Fitness, Assumptions of Linear Regression, Multiple Regression, Coefficients of Partial Determination, Durbin Watson Statistics, Variance Inflation Factor.

References
1. Black, Ken, 2013, Business Statistics, New Delhi: Wiley.
2. Lee, Cheng et al., 2013, Statistics for Business and Financial Economics, New York: Heidelberg Dordrecht.
3. Anderson, David R., Thomas A. Williams and Dennis J. Sweeney, 2012, Statistics for Business and Economics, New Delhi: South Western.
4. Waller, Derek, 2008, Statistics for Business, London: BH Publications.
5. Levin, Richard I. and David S. Rubin, 1994, Statistics for Management, New Delhi: Prentice Hall.

Python Programming

Module Overview
The Python Programming module is intended for students who wish to learn the Python programming language. It is essential for progressing through the rest of this programme.
The module comprises programming basics of the Python language such as Data Types, Operators, Functions, Classes and Exception Handling.

Learning Objectives
This module will help students gain the knowledge of Python programming needed to prepare them for advanced modules such as ML. Python is user-friendly and widely used in industry for designing and scripting applications in Emerging Technologies.

Learning Outcomes
Upon successful completion of this module, students should be able to:
• Understand why Python is a useful scripting language.
• Use lists, tuples, and dictionaries in Python programs.
• Write loops and decision statements in Python.
• Write functions and pass arguments in Python.
• Design object-oriented programs with Python classes.
• Use exception handling in Python applications for error handling.

Unit I Introduction
History of Python, Need for Python Programming, Applications, Basics of Python Programming, Using the REPL (Shell), Running Python Scripts, Variables, Assignment, Keywords, Input-Output, Indentation.

Unit II Types, Operators and Expressions
Types: Integers, Strings, Booleans; Operators: Arithmetic Operators, Comparison (Relational) Operators, Assignment Operators, Logical Operators, Bitwise Operators, Membership Operators, Identity Operators; Expressions.

Unit III Data Structures and Control Flow
Lists, Operations, Slicing, Methods, Tuples, Sets, Dictionaries, Sequences, Comprehensions, Conditional Blocks using if, else and elif, For Loop, For Loops over Ranges, Strings, Lists and Dictionaries, While Loop, Loop Manipulation using pass, continue, break and else, Conditional and Loop Blocks.
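As a rough illustration of the Unit III topics (slicing, comprehensions, conditional blocks, and loop control with continue, break and else), a minimal Python sketch is given here. The dictionary `marks`, the functions `grade` and `first_multiple`, and the grade thresholds are hypothetical names and values chosen only for this example; they are not part of the module content.

```python
# Illustrative sketch only: all names and values here are hypothetical.
marks = {"asha": 72, "ravi": 48, "meena": 91}  # a dictionary of name -> mark

# List operations and slicing: sort the marks, then take the two highest.
scores = sorted(marks.values())
top_two = scores[-2:]

# List comprehension with a condition: names of students scoring 50 or more.
passed = [name for name, mark in marks.items() if mark >= 50]


def grade(mark):
    """Conditional block using if, elif and else (thresholds are assumed)."""
    if mark >= 80:
        return "A"
    elif mark >= 50:
        return "B"
    else:
        return "F"


def first_multiple(limit, divisor):
    """Return the first n in range(1, limit) divisible by divisor, else None."""
    for n in range(1, limit):
        if n % divisor != 0:
            continue      # skip to the next value of n
        found = n
        break             # stop at the first match
    else:
        found = None      # runs only when the loop finished without break
    return found


if __name__ == "__main__":
    print(top_two, passed)
    print({name: grade(mark) for name, mark in marks.items()})
    print(first_multiple(10, 4), first_multiple(3, 4))
```

The for/else form is used in `first_multiple` because it separates the "found" and "not found" cases without a flag variable; an equivalent version with an early `return` inside the loop would work just as well.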
Unit IV Functions, Modules and Packages
Defining Functions, Calling Functions, Passing Arguments, Keyword Arguments, Default Arguments, Variable-length Arguments, Anonymous Functions, Functions Returning Values, Scope of Variables in a Function: Global and Local Variables. Creating Modules, Namespaces, Introduction to PIP, Installing Packages via PIP, Using Python Packages.

Unit V Object Oriented Programming & Exception Handling
Classes, Self Variable, Methods, Constructor Method, Inheritance, Overriding Methods, Data Hiding, Difference between an Error and an Exception, Handling Exceptions, Try Except Block, Raising Exceptions, and User Defined Exceptions.

References
1. R. Nageswara Rao, 2018, Core Python Programming, Dreamtech.
2. John Hearty, 2016, Advanced Machine Learning with Python, Packt.
3. Jake VanderPlas, 2016, Python Data Science Handbook: Essential Tools for Working with Data, O'Reilly.
4. Mark Lutz, 2010, Programming Python, O'Reilly.
5. Tim Hall and J-P Stacey, 2009, Python 3 for Absolute Beginners, Apress.

Structured Query Language

Module Overview
In this course, students will learn the basics of SQL/NoSQL and Relational Databases. They will learn about the Relational Model and its concepts and constraints. Students will get exposure to key concepts of the SQL language and DBMS such as Normalization and Transaction Processing, alongside an exposure to NoSQL programming.

Learning Outcomes
This module will help students gain knowledge of Relational Database Management Systems, Data Models, SQL query processing and Normalization, along with an introduction to NoSQL database systems using MongoDB.

Learning Objectives
• To understand the basic concepts and the applications of Database Systems.
• To master the basics of SQL and construct queries using SQL.
• To become familiar with the basic issues of Transaction Processing and Concurrency Control.
• To become familiar with NoSQL programming.
• To explain the architecture, define objects, and load and query data within NoSQL databases.

Unit I Introduction to Database Management Systems
Introduction: Database System Applications, Purpose of Database Systems, Views of Data, Data Abstraction, Instances and Schemas, Data Models, Database Languages, DDL, DML, Database Architecture, Database Users and Administrators, Database Design, ER Diagrams, Entities, Attributes and Entity Sets, Relationships and Relationship Sets, Integrity Constraints, Views.

Unit II SQL Operators and Relational Theorems
Relational Algebra and Calculus, Selection and Projection, Set Operations, Renaming, Joins, Division, Relational Calculus, Tuple Relational Calculus, Domain Relational Calculus, Forms of Basic SQL Query, Nested Queries, Comparison Operators, Aggregate Operators, NULL Values, Logical Connectives: AND, OR and NOT, Outer Joins, Triggers.

Unit III Normalization
Problems Caused by Redundancy, Decompositions, Functional Dependencies, Normal Forms, First, Second and Third Normal Forms, BCNF, Properties of Decompositions, Lossless Join Decomposition, Dependency Preserving Decomposition, Multi-Valued Dependencies, Fourth Normal Form, Join Dependencies, Fifth Normal Form.

Unit IV Transactions
Transaction Management, Transaction Concept, Transaction State, Implementation of Atomicity and Durability, Concurrent Executions, Serializability, Recoverability, Implementation of Isolation, Testing for Serializability, Concurrency Control, Lock Based Protocols, Timestamp Based Protocols, Validation Based Protocols, Recovery, Failure Classification, Storage Structure, Atomicity, Log Based Recovery, Remote Backup Systems.

Unit V NoSQL
Overview of NoSQL, Types of NoSQL Databases, NoSQL Storage Architecture, CRUD Operations in MongoDB, Querying, Modifying and Managing NoSQL Databases, Indexing and Ordering, Migrating from RDBMS to NoSQL, NoSQL in Cloud,