Data Science Catalog

Data Science Education

 

           

 Nov 2020 1  www.elearningcurve.com ▪ [email protected] ▪ 1.630.242.1659

TABLE OF CONTENTS

 OVERVIEW 2

 CURRICULA AT-A-GLANCE 3  CERTIFICATION PROGRAMS 4  ENTERPRISE SOLUTIONS 5-6  COURSE DESCRIPTIONS 7•20

 OUR INSTRUCTORS 21  OUR CUSTOMERS 22  CONTACT US 23  PRICE LIST 24  ABOUT ELEARNINGCURVE 24

I have come across quite a few on-line and classroom training course over the years. I have no hesitation in saying that, these courses are one of the best in the industry. -- Sumanda Basu, CIMP Ex - Data Governance, Data Quality, USA

2 www.elearningcurve.com ▪ [email protected] ▪ 1.630.242.1659

Full course descriptions begin on page 7.

Data Science Fundamentals Diagnostic Analytics Using Statistical Process Control This course will cover the key concepts and practices needed for a successful data science program. This 4-hour course provides an introduction to the concepts, techniques and applications of SPC within the AI Fundamentals context of information management.

Prescriptive Analytics Using Simulation Models This course will present the basics of AI from history to modern AI with the illustrative applications of endless possibilities. This 4-hour online training course provides an introduction to prescriptive analytics using simulation models applied to areas that are relevant to business analysts, operations Framing and Planning Date Science Projects planners, decision makers, functional managers and BI that Drive Business Impact team members.

This 3-hour online course addresses how to scope, plan, Data Mining in R and choose a project approach for analytics project success and clearly identify the problem and opportunities to be This 3.5 online training course will show you how to use R analyzed. basics, work with data frames, data reshaping, basic statistics, graphing, linear models, non-linear models, Data Understanding and Preparation for clustering, and model diagnostics. Data Science Introduction to NoSQL This 3-hour online course addresses how to translate the problem statement into data sources, explore the data for This 3.5-hour online course addresses the emerging class relationships and recognize patterns, identify the starting of NoSQL technologies for managing operational big data. inputs for the model, preparing data, and validating it for the This includes key-value, column stores, document stores model fitting process. and graph databases.

Data Mining Concepts and Techniques Streaming Data: Concepts, Applications, and Technologies This 3-hour online course will give insight into the data mining process, explain models and algorithms, and give an This 3-hour course covers the concepts, applications, and understanding of how to match the right data mining models business and technical drivers for streaming data adoption, to the right problems. including an in-depth discussion of Apache Kafka.

Hadoop Fundamentals Analytical Modeling, Evaluation, and Deployment Best Practices This 5-hour online course provides an introduction to Hadoop and its inner workings and how the ecosystem was This 3-hour course focuses on how to match the business created to answer several questions for a world driven by problem to candidate algorithms, produce comparable data and e-commerce. models, choose the best performing model, and once in

production, what to do to address ongoing value. Putting the Science in Data Science Data Strategy for the Age of Big Data This 3-hour online course provides an overview of the scientific method within the context of solving business This 3-hour course covers the core principles of building a problems with the goal of introducing the key concepts, big data strategy to generate the business value and deep tools, and skills for practice. insights that an organization needs to thrive in a competitive business environment.

3 www.elearningcurve.com ▪ [email protected] ▪ 1.630.242.1659

CIMP: Demonstrate Mastery. Achieve Success.

4 www.elearningcurve.com ▪ [email protected] ▪ 1.630.242.1659

          

5 www.elearningcurve.com ▪ [email protected] ▪ 1.630.242.1659

6 www.elearningcurve.com ▪ [email protected] ▪ 1.630.242.1659

Data Science Fundamentals Course Outline

Data science has matured into a cross functional discipline. In simple terms, its main purpose is to extract meaningful o Basic Concepts information from a variety of data sources. This definition is o Value Chain Analysis very general and must be explored in more detail to o understand the building blocks needed for success. Related Thinking Styles workgroups must understand each other and work together to o Research Methods make meaningful impact. o

Effective data science is a critical enabler for companies to o Data Science Concepts become “data-driven” and to “compete on analytics”. To give o Aspects of Science in Data Science shape to data science as a discipline, this course introduces o Value Framework core principles and concepts to provide a solid foundation of o Module Summary understanding. Data science is described in terms of its, purpose, capabilities, techniques, approaches and skills. It’s dependencies on other disciplines and how it enables value creation within the broader “data-driven” ecosystem is also o Pursuit of Value provided. o Data Driven Organizations o Success Factors This course introduces data science and sets the stage for understanding how process, data, skills, culture, methodology and technical building blocks collectively drive results. o Big Data – The Open Catalyst o Data Resources  Key concepts needed for successful data science o Data Management  How data science relates to other related disciplines o Discovery and Exploration  Practical data science process lifecycle steps o Model Building  Common data science tools, techniques and o Model Execution and Analysis modeling categories o  Recommended data science approaches, methods Interpretation and Storytelling and processes  The data science process  Critical success factors for data science o Problem Framing  Why organizational culture and data literacy are o Research Methods challenges that must be managed o Modeling Techniques

o Model Deployment  Business managers and executives  Technology managers and executives  Data science and data engineering team members  Business analysts, statisticians and modelers  Process managers and decision makers  Business measurement and performance analysts  IT analysts and developers  Data management analysts  Technology and business architects  Analytics, business intelligence, data science and data engineering program leaders  Anyone with an interest in understanding the capabilities, opportunities and challenges offered by data science

7 www.elearningcurve.com ▪ [email protected] ▪ 1.630.242.1659

Framing and Planning Data Science o What is a Charter? Projects that Drive Business Impact o Benefits of a Charter o Project Approach and Outcomes o Success Measures o Assumptions Data Science projects often fail due to unclear scope, lack o Resources of project planning, and lack of clear alignment to business o Constraints objectives. This 3-hour online course addresses how to o Milestone Schedule scope, plan, and choose a project approach for analytics o Budget project success and clearly identify the problem and o Project Stakeholders opportunities to be analyzed. Framing and planning drives all of the other phases of data science projects. Based on o Data Science Key Roles the CRISP-DM analytics lifecycle this course describes the purpose, activities, and deliverables for the first phase of that lifecycle. o Data Science Methodologies o Cross Industry Standard Practices o Team Data Science Process  Clearly define a problem statement or question of o Agile SCRUM Process interest o Data Science Methodologies review  Define an analytic project including scope and o Data Science Analytical Technology methodology approach o Data Science Infrastructure Technology  Create a project plan to manage the analytics project  Establish stakeholder management and expectations o Project Type and Maturity o Problem Statement to Project Type o ABC Airlines  Data scientists, data analysts, and business o Other Project Considerations analysts who need to frame analytics problems o Scope & Key Deliverables and choose the most effective ways to solve those problems  Aspiring data scientists and data analysts  Business and technical managers who need to o Data Science Project Plans understand the nature of analytics and data o Project Kickoff science work o Data Science Project Plan Image  Data engineers and analytics developers who o Project Plan Work Packages work with data scientists o Data Science Project Plan Review

Course Outline o Data Science System Architecture Data Science System Architecture o Data Science Data Definitions o Data Science Summary Report o Data Science Model Report Part 1-2 o Data Science Exit Report: Overview o Data Science o Data Science Exit Report: Benefits o Problem or Opportunity o Data Science Exit Report: Learnings o Thinking Styles o The Stage Process o Influence Modeling o Kernel Seeking Modeling o Causal Modeling o Characteristics of a Good Problem Statement o Define the Problem or Opportunity Example

8 www.elearningcurve.com ▪ [email protected] ▪ 1.630.242.1659

Data Understanding & Preparation for o Data Modeling and Data Science o Data Pipelines and Data Stores Data Science o Work with the End in Mind

o Data Understanding One challenge in the data science lifecycle is understanding the problem or opportunity, the next challenge is acquiring, o Exploratory Data Analysis understanding, and preparing data for the modeling phase. o Data Understanding – Data Profiling This step in the data science process is estimated to take up o Sampling Size to 50% of the time allotted for a data science project. o Data Profiling – EDA Methods o Sample Quality This course addresses how to translate the problem o Statistics Basics: Attributes statement into data sources, explore the data for o Summary Statistics relationships and recognize patterns, identify the starting o Distribution inputs for the model, preparing data, and validating it for the o Data Relationships model fitting process. o Results of Data Profiling o Findings – Important Variables o Outcomes and Interpretations  Review the data science project methodology o EDA Checklist  Understand data source identification  Evaluate data findings to determine and validate modeling techniques  Review feature selection techniques o Data Preparation  Understand data preparation techniques o Feature Selection  Planning for data pipelines o Data Quality Report  Understand data visualization techniques for data o Feature Scaling and Standardization understanding and data preparation o Subset Selection o Transform for Data Modeling o Data Ready State  Review the data science project methodology o Modeling – Create and Train a Model  Understand data source identification o Cross Validation  Evaluate data findings to determine and validate

modeling techniques  Review feature selection techniques  Understand data preparation techniques o What are Data Pipelines?  Planning for data pipelines o Why are Data Pipelines Important?  Understand data visualization techniques for data o Data Pipelines and Data Science understanding and data preparation

Course Outline o Data Description o Data Evaluation o Data Quality Report o Data Cleansing o Data Quality Scorecards and Dashboards o Feature Ranking o The Data Science Process o Supervised Learning o Unsupervised Learning o Define the Problem or Opportunity o Data Quality and Machine Learning o Define the Data Sources o Accurate Data o Consistent and Complete Data o Algorithms and Data Quality o Bins and Ranges o Data Understanding o High Cardinality o Data Sources and the Problem Statement o Reduce Cardinality o Data Source Inventory o Dealing with Outliers o Preparing for Exploratory Data Analysis o Missing Data o Time of Event 9 www.elearningcurve.com ▪ [email protected] ▪ 1.630.242.1659

Data Mining Concepts and Techniques o Data Mining Framework o Data Mining Approaches o Data Mining Techniques Data mining originated primarily from researchers o Data Mining Process running into challenges posed by new data sets. Data mining is not a new area, but has re-emerged as data science because of new data sources such as Big Data. o Exploratory Data Analysis This course focuses on defining both data mining and o Data Profiling: Uncovering Structure data science and provides a review of the concepts, o Data Profiling: Types of Profiling processes, and techniques used in each area. o Descriptive Statistics o Results of Data Profiling Data Relationships This 3-hour online course will give you insight into the o Findings – Important Variables data mining process, explain models and algorithms, o Visualization Techniques and give an understanding of how to match the right o Outcomes and Interpretations data mining models to the right problems. o Sampling Size o Sample Quality o Big Data Considerations  The definitions of data mining and data science o Feature Selection  The role of statistics in data mining o EDA Checklist  Machine learning concepts  To differentiate between supervised and unsupervised learning o Build the Model  The data mining process o Anatomy of a Model  How to conduct exploratory data analysis o What is a Classification Problem  To identify data mining models and algorithms o Classification  How to match the problem with the model o Ensemble Methods  Model validation techniques o Clustering  How to deploy data mining models o Clustering Uses o Association−Market Basket o Association Rules  Analysts looking to gain foundational data o Association Uses mining knowledge o Application of Data Mining Models  Analysts looking to understand data mining o Model Selection models  Analysts looking to apply the right data mining models to the right problem o The Validation Process  Attendees should have a basic understanding of o Fitting a Model undergraduate statistics, data types, databases, o Bias/Variance Tradeoff and data management concepts o Regression – Mean Squared Error o Linear Regression – Confidence and Course Outline Prediction Intervals o Logistic Regression – Significance Test o Classification Accuracy o Classification Accuracy – Other Measures o Prediction Error Methods o Hold-Out Cross Validation o Module Overview o K-Fold Cross Validation Method o What is Data Mining? o Statistics in Data Mining o Machine Learning o Overview o Supervised Learning o Deploying Data Mining Models o Unsupervised Learning o Course Summary Parts 1 & 2 o References

10 www.elearningcurve.com ▪ [email protected] ▪ 1.630.242.1659

Hadoop Fundamentals o Yahoo o Hadoop Creation History o Hadoop Today o Facebook o LinkedIn The world of data has transformed into an economy that can provide several insights to thrive in the world of business. The need of the hour is to ingest and acquire data as fast as possible, and more important than the acquisition, is the o Perspective – Food For Thought ability to fail fast and move at agile speeds to provide better o Why Hadoop? data insights with analytics. How can we do this without a o Human Behavior – New Insights database? The answer is an introduction to the world of o Twitter Example Hadoop. o Forces Shaping Business o Conundrum Implementing big data platforms for data exploration, o The New Data Fabric discovery, and analytics within a Business Intelligence (BI) program provides capabilities to leverage existing BI o Big Data programs and add new insights and methods that relate to, o CIO Continuum and process data for, the enterprise. o Architect’s Thinking o User Needs This 5-hour online training course introduces Hadoop and its o State of Data inner workings and how the ecosystem was created to o What is Apache Hadoop answer several questions for the world driven by data and e- o Hadoop Design Goals commerce. We have tailored this course to be focused on o Stack areas that are relevant to business analysts, decision o Core Components makers, functional managers and BI team members. The o Storage: Hadoop Distributed File basic concepts are introduced and the course is optimized to provide an overview of the breadth of potential opportunities System (HDFS) for Hadoop within diverse organizations.

o Compute: MapReduce and Yet Another  Limitations of databases Resource Navigator (YARN)  Search and Google ecosystem growth o Operating System: YARN  Apache Nutch o Services Management: Zookeeper  Hadoop ecosystem o Hadoop 3.0: More Enterprise Features  Hadoop internals  Hadoop 1,2 and 3

o  Architects, developers HBASE: Columnar Database  Business analytics team members o PIG: Dataflow  Executives, decision Support Teams o TEZ: Accelerator o In-memory: Spark o Data Ingestion: AVRO Course Outline

o Apache Hive o Impala o Speed in Compute o Apache Drill o Internet o Security in Hadoop o American Online (AOL) – Way to Connect to the o Workflow Internet o Hadoop Technical Architecture o Netscape – Popular Dot-Com Portal o Google – The First Search Engine o Search Process o Nutch

11 www.elearningcurve.com ▪ [email protected] ▪ 1.630.242.1659

Putting the Science in Data Science Course Outline

Research methods are critical to the process of addressing business challenges and finding solutions. Research methods and experimental design can be used across multiple disciplines to help you answer questions and inform decision o Defining Research making. In order to do so, it is essential that you ask the right o Key Features of the Scientific Method questions, identify and use the tools that will gather rich data sources, analyze the results using statistical and coding strategies, and apply visualization techniques to illustrate and represent the findings. o What Do We Mean by “Experiment”? o Designing an Experiment: A “How To” Drawing on business analytics, this course will use a scientific Approach approach to introduce the concepts, tools and skills that are o Now What? critical to designing and executing experiments to solve o Analyze Your Data business problems. The process of using research methods o Make Conclusions to ask questions, design experiments to test a hypothesis, o Share Findings identify data collection methods and techniques, and analyze the results to find business solutions will lead to more informed decision making. Using the scientific method, this process involves gathering data, determining what to do with o Collecting Data it, and deciding how to visualize and illustrate what you have o Surveys learned. o Conducting Observations o Interviews This course provides an overview of the scientific method o within the context of solving business problems with the goal Open Data of introducing the key concepts, tools, and skills for practice. o Data Collection Tips It also introduces the critical human aspects, including team o Readiness Checklists composition and the soft skills that will help you communicate the findings and publish your results. To apply the concepts learned, examples will be introduced throughout and a case o Getting Started study will be used to summarize the course. o Work Backwards o How to Give Life to Your Numbers o Know Your Audience  The key features of the scientific method o Use Appropriate Language  How to design an experiment o Making the Most of Data Visualization  Criteria for selecting data collection methods o Charts to Visualize Data  Strategies to analyze experimental results o  How to launch and execute an experiment, including Writing Your Report key factors to consider o Digital Storytelling  Examining your results  Approaches to visualize finding and communicate results o Setting the Scene  To apply the scientific method within a business o Identify the Gap context o Ask the Question o Multi-Phase Approach  Data Science team members o Evaluating Impact: Experimental Design  Business Intelligence professionals o Collect Data: Data Collection Tools  Data Analytics practitioners o Key Data To Measure  Business Analysts o Recruiting Participants  Process improvement professionals o Experimental Design Overview  Functional business managers o Qualitative Research Design  Business transformation leaders o Analyze Data: Data Analysis Approach  Data management professionals o Results–What Did We Find?  Data governance team members o Visualizing the Results  Operational and strategic planners

12 www.elearningcurve.com ▪ [email protected] ▪ 1.630.242.1659

Diagnostic Analytics Using Statistical Course Outline Process Control

o Basic Definitions The field of Diagnostic Analytics includes the capabilities to o Understanding Variation detect abnormal conditions and to estimate root causes to o SPC and Quality Management those conditions. This course is focused on the “detection” aspect of diagnostic analytics and introduces Statistical Process Control (SPC) as a suitable approach for defect detection. o Basic Statistics Root cause analysis of the identified defects is beyond the o Control Chart Fundamentals scope of this course. o Types of Control Charts o Control Chart Design Considerations SPC includes a set of analytical techniques that measure and detect abnormal changes in the behavior of a managed process. SPC helps managers respond to unexpected changes in critical variables and take corrective action as o Application Areas of SPC necessary to maintain the desired levels of product quality o Role of SPC in Process and process performance over time. Management o Operations Improvement Example This online training course provides an introduction to the o Real Time Process Monitoring concepts, techniques and applications of SPC within the Example context of information management practices and o Master Data Interface Monitoring processes. The theory of SPC is introduced and the design Example of control charts is discussed as a basis for describing how a o Data Quality Monitoring Example diverse range of data and process quality management o Business Performance Monitoring challenges can be addressed. Example

 Methods for detecting defects/abnormal conditions  Define some common process building blocks o Improving Control Chart  The concepts and theory behind “statistical control” Performance  How statistical methods can be used to measure o Analyzing Process Capability and estimate process variation  Identify and categorize major causes of process variation  How process variation is related to product quality  The principles of control charts used to detect and generate process alarms  Present the basic concepts of quality management initiatives and practices and how it relates to the scope of Statistical Process Control  Describe how to apply solutions to address process, data and related quality management challenges  Provide the context necessary to implement effective solutions

 Data quality analysts  Data Integration specialists  Process improvement analysts  Business analysts  Data warehousing team members  BI program managers and team members  Functional business managers  Anyone who wants to learn how statistical concepts can be applied to improve the quality of data and information and its various management processes

www.elearningcurve.com ▪ [email protected] ▪ 1.630.242.1659

Prescriptive Analytics Using Course Outline Simulation Models

o Basic Concepts Prescriptive analytics enables managers to explore o Capabilities of Simulation different scenarios and evaluate new business o Business intelligence Framework opportunities by playing the “what-if” game. It enables the o Simulation Framework evaluation and comparison of different options as part of the decision making process. This leads to a deeper understanding about how to define and achieve business and operating goals. o Context and Opportunities o Application Areas Implementing prescriptive analytics using simulation o Systems Models methods within a Business Intelligence (BI) program o Model Components provides additional capabilities to existing BI programs. o System Simulation Answers to advanced business questions starting with "why" and "what if" can now be answered. Maintaining the models in a calibrated and reliable manner over time requires rigorous data management practices based on o Overview principles of integration and quality. o Continuous Physical Models o Business Process Models This 4-hour online training course provides an introduction o to prescriptive analytics using simulation models applied to Stock and Flow Models areas that are relevant to business analysts, operations planners, decision makers, functional managers and BI team members. The basic concepts are introduced and a o Monte Carlo Models framework is provided that positions simulation analytics o Discrete Event Models within a broader BI Program. Categories of models are o Empirical Models described that provides an overview of the breadth of potential opportunities for prescriptive analytics within diverse organizations. o Opportunities and Techniques o Data Management Considerations  Basic capabilities of simulation o Simulation and the BI Program  Categories of models and modeling techniques o Case Study  Domains of applicability  How to build and implement simulation models  Data management requirements for simulation  How business problems can be defined and solved  The role of experimental design  How insights can be generated  How to explore and discover possible routes to successful outcomes  How business intelligence, analytics, and simulation are related disciplines

 BI program leaders  BI architects and project managers  Business analytics team members  Business managers and decision makers  Functional analysts  Operations managers  Process improvement specialists

14 www.elearningcurve.com ▪ [email protected] ▪ 1.630.242.1659

o DFS Block Placement Example Introduction to NoSQL o File System Summary

o NoSQL Inspirations In this informative class, learn about the emerging class of o NoSQL History NoSQL technologies for managing operational big data. This o Google MapReducer Paper includes key-value, column stores, document stores and o Google Bigtable Paper graph databases. Learn about the ideal workloads for NoSQL o Memcached in enterprises and where NoSQL adds value to an enterprise o Schemaless information strategy. Learn how to get the projects started or o Keeping it Simple dropping the “not in production” label. o CAP Theorem Part 1 & 2 This “code-lite” session addresses the NoSQL community as o Automatic Sharding well as the key user community, providing guidance on how o NoSQL Node Specification NoSQL technologies work and how to penetrate the enterprise. This practical session will help you add a significant class of technologies into consideration to ensure o Data Integration information remains an unparalleled corporate asset. o Data Visualization o Infrastructure Strategy, Including Cloud o Traditional Data Modeling  Big data basics o Data Modeling for NoSQL  Enablers for NoSQL o NoSQL is for Applications, Not DW or ERP  NoSQL data models: key-value, document, graph  NoSQL usage patterns o NoSQL Schemaless Data Modeling  NoSQL database architectures o Force Fitting Unstructured Data NoSQL  Graph database modeling and architecture Modeling from RDBMS o Security Concerns o Easing Into Change o What Will Motivate IT to Adopt NoSQL?  Big data basics

 Enablers for NoSQL  NoSQL data models: key-value, document, graph  NoSQL usage patterns o Data Types and NoSQL Data Models  NoSQL database architectures o Key Value Stores  Graph database modeling and architecture o Document Stores o Column Stores Course Outline o Operational Big Data Platform Solution o Multiple NoSQL Solutions

o Module Overview o The Graph Database Revolution o No More One Size Fits All o Relationship Data o The No Reference Architecture o Graph Algorithms o The Relational Database Data Page o Use Cases o What Does Big Data Mean? o Graph Modeling o Google Search Trends o Property Graph Databases o Why the Sudden Explosion of Interest o Semantic Graph Databases o What Happens in an Internet Minute? o Graph Engines o Sensors Data o o Customer Demands Drive Technology o New Data Types o Benefits of JSON o Overview o Why NoSQL for Big Data? o Questions for your NoSQL Prospect Vendor o ACID o Future of NoSQL o Hadoop, MapReduce and Big Data o Big Data and NoSQL Sales Projection o Why NoSQL Not Hadoop for Operations o The NoSQL Challenge o MapReduce Part 1 & 2 o Getting Started o Scale Up vs. Scale Out o What Technology to Select 15 www.elearningcurve.com ▪ [email protected] ▪ 1.630.242.1659

Data Mining in R Course Outline

With increasing interest in big data, the topic and skills of data mining get new attention, including strong o Overview interest in the value that can be derived from large data o What is R sets. Data mining is the process of selecting, exploring, o What is RStudio and modeling large amounts of data to uncover o Why RStudio? previously unknown information for business benefit. o Navigating RStudio

o R Environments R is an open source software environment for statistical computing and graphics and is very popular with data scientists. R is being used for data analysis, extracting and transforming data, fitting models, o Overview drawing inferences, making predictions, plotting, and o R Math reporting results. This online training course will show o R Data Types you how to use R basics, work with data frames, data o Working with Data Structures reshaping, basic statistics, graphing, linear models, o Loading Data non-linear models, clustering, and model diagnostics. o Writing Data o Summary

 R basics such as basic math, data types, vectors, and calling functions o Overview  Advanced data structures such as data frames, o Exploratory Data Analysis lists, and matrices o Base Graphics in R  R base graphics o Linear Regression  R basic statistics, correlation, and covariance o Logistic Regression  Linear models such as decision trees and o Summary random forests  To apply clustering using K-means  Model diagnostics o Overview o R Math  Data analysis and business analytics o R Data Types professionals o Working with Data Structures  Anyone interested to learn data mining o Loading Data techniques to find insights in data, and who has o Writing Data some statistical and programming experience o Summary

16 www.elearningcurve.com ▪ [email protected] ▪ 1.630.242.1659

Artificial Intelligence Fundamentals

o What is AI? o Artificial Intelligence in Real Life Artificial Intelligence (AI) is a field that is continually and o AI Popularity actively growing and changing, expanding human capability o What Drives AI Popularity? beyond imagination. AI has captured the attention of o AI is Enjoying Significant Hype and scientists, engineers and business people across industries Investment worldwide exploring ways to find insight and create value o Definition of AI from smart applications and product. o AI Purpose and Goals

This course presents the basics of AI from history to modern o Short History of AI AI with the illustrative applications of endless possibilities. o The Foundations of AI We will examine how AI already impacts every aspect of our daily lives and explore emerging AI based technologies with examples of applications and implications as well as o What is Intelligence Composed Of? opportunities. To understand some of the deeper concepts, o What AI Should be Able to Do? such as natural language processing, face recognition and o Types of AI Systems: Parts 1 & 2 autonomous driving, we will explore several basic AI o concepts: four major types of AI as well as machine learning, Act Rationally/Act Like a Human logic and planning, probabilistic technology, deep learning, o Systems that Act Like Humans and neural networks. o Systems that Think and Act Rationally o Systems that Act Rationally

o Demonstrate fundamental understanding of artificial intelligence (AI) and its foundations. o Search Algorithms Breadth-First Search o Apply basic principles of AI to problem that require o Depth-First Strategy inference, perception and learning. o Bidirectional Strategy o Demonstrate awareness and a fundamental o Uniform Cost Search understanding of various applications of AI techniques in intelligent agents, expert systems, o Comparison of Strategies artificial neural networks, autonomous vehicle and o Informed or Heuristic Search other machine learning models o Local Search Algorithms o Demonstrate an ability to share in discussions of AI, its current scope and limitations, and societal impact o Planning o Formal Definition of Planning o Business managers and executives o Basic Planning Algorithms o Technology managers and executives o Formalizing Planning Problems o Data science and data engineering team members o Forward Chaining Planning o Process improvement professionals o IT analysts and developers o Backward Chaining o Data management analysts o Expert Systems o Technology and business architects o Analytics, business intelligence, data science and data engineering program leaders o Learning o Anyone with an interest in understanding the o Deep Learning capabilities, opportunities and challenges offered by o Natural Language Processing artificial intelligence o Robotics o Self-Driving Vehicles Course Outline o Self-Driving Cars Impact o AI Issues o Ethical Issues in AI

o Introduction o Ubiquitous AI is the “New Normal” o Great Benefits of AI Across Industries o AI In Everyday Life 17 www.elearningcurve.com ▪ [email protected] ▪ 1.630.242.1659

Streaming Data: Concepts, Course Outline

Applications, and Technologies

o Batch Vs. Real Time o The Speed of Business The analytics opportunities with IoT and application data o The Speed of Data Parts 1-3 streams are abundant, but the value of streaming o Fast Data Drives the Consumer World technology is not limited to native data streams. In today’s o Fast Data Drives the Business World fast paced business world, the need for fast data is o pervasive and tacit acceptance of high-latency data is Fast Data Drives the Analytic World rapidly diminishing. Streaming as an alternative to batch o The Business Case for Fast Data

ETL is a practical way to meet the demand for fast data.

Change Data Capture (CDC) is a category of technology o Data Pipeline Processing that captures data about changes made to a database – o Data Stream Processing inserts, updates, and deletes – and makes that data o IoT and Edge Computing available to downstream processing such as data pipelines that flow to data warehouses and data lakes. CDC can be combined with streaming to accelerate data o Data Management – A New Streaming Use flow and reduce data latency. Case o What is CDC? You’ll need to know the actions and responsibilities of data producers and of data consumers, as well as the capabilities for cluster management, data connections, and APIs. Integrating Kafka or other streaming technologies into your data ecosystem is an important o Modern Enterprise Dara Requirements consideration. o Essential Characteristics of Streaming Technology o Stream Processing Technologies  The business and technical drivers for streaming o Kafka Stream Processing Basics data adoption o Replacing ETL with CDC and Streaming  Data pipeline processing patterns and the o Streaming First Architecture advanced patterns that are possible with o Moving to Streaming First Architecture streaming

 Use case patterns and a variety of use cases for streaming data  Five kinds of Change Data Capture (CDC) and the o Introduction to Apache Kafka strengths and weaknesses of each o Kafka Predecessors  The concept and applications of streaming first architecture  Kafka architecture and essential components o Apache Kafka Architecture  Kafka data and process flow o Essential Components Part 1-2  The roles and functions of Kafka broker, data o Data and Process Flow Part 1-2 producers, and data consumers o Kafka Record – Key Components  Cluster management, data connections, and APIs o Kafka Broker Overview with Kafka  Integrating streaming into the data ecosystem o Cluster Management with Apache

 Data and analytics leaders and managers  Data and analytics architects o Data Streaming Use Cases  Data scientists o Streaming Integration with Data Ecosystems  Data engineers o Kafka Alternatives from Cloud Service  Data governance professionals who need to Providers understand the opportunities and implications of o Amazon Kinesis streaming data o Azure Event Hub  Anyone with a desire to know how streaming is o Google Cloud PUB/SUB changing the data management landscape Full outline: https://ecm.elearningcurve.com/Online_Data_Science 18 _Course_p/sc-08-a.htm www.elearningcurve.com ▪ [email protected] ▪ 1.630.242.1659

Analytical Modeling, Evaluation, and o Evaluation Deployment Best Practices o Bias/Variance Tradeoff o Train and Test Sets o Assessment of Results o Hold-out Cross Validation o K-fold Cross Validation Method o Regression – Mean Squared Error o Linear Regression Confidence and Prediction Intervals o Logistic Regression – Significance Test o Classification Accuracy o Classification Accuracy – Other Measures o Prediction Error Methods o ROC Curve o Evaluation – Customer Acceptance consideration.

o Deployment

o Deployment – Working Software  How to match the problem to the analytical model  How to choose a relevant algorithm o Data Pipelines  How to evaluate models for the best option o Data Pipelines – Part of Deployment: Part 1 –  How to prepare for model deployment 2  How to monitor models o Deployment Operationalization  How to support model operations o Monitoring Models – Dashboard o Life of the Model

 Business analysts  Data analysts o Model Operations: Part 1 – 2  Data scientists o Data Ingestion  Project leads  Business subject matter experts that support data o Data Storage science projects o Data Integration and Synthesis o Data Visualization o Model Accuracy Course Outline o Model Retraining o Model Retiring o Machine Learning in Action

 The Data Science Process o Machine Learning Metrics  Types of Data Science Projects o Metrics for Supervised Learning  Modeling o Classification Model Metrics  Project Type and Maturity o Normalized Discount Cumulative Gain (NDCG)  Data Science Starting Point o Discount Cumulative Gain (DCG)  Other Modeling Considerations o Normalized Discount Cumulative Gain (NDCG) – Example o Root Mean Squared Error o Quantities of Error  Data Science Framework  Approaches  Techniques  Algorithms  Anomaly Detection

19 www.elearningcurve.com ▪ [email protected] ▪ 1.630.242.1659

Data Strategy for the Age of Big Data Course Outline

o Overview of Data Strategy o Big Data o What is Data Strategy o Data Strategy and Data Governance Framework o Components of Data Strategy o Connecting Business and Technology o Business Case Study o Business Model Innovation o What is Innovation? o Types of Innovation o Learn About Business Model

o Business Model o Learn About Business Model o Business Model Canvas o Case Study: Uber  A framework to define and design big data o Exercise 1 strategy  Concepts for alignment of business strategy and o Business Strategy big data strategy o Summary  Concepts and components of the Business Model Canvas  How to apply Value Chain Analysis o Two Big Ideas in Strategy  How big data creates opportunities for business o Executing Business Strategies Value Chain transformation Analysis o Value-Chain Analysis: Michael Porter o Value-Chain – Michael Porter  Chief Data Officers o Resource-based Views of the Firm  Chief Analytics Officers  VP, Director, Managers of Data and Analytics o The Balance Scorecard  Data Architects o Project Evaluation Criteria o Measurable vs. Unmeasurable  Data Scientists o Exercise 2 o Summary

o Data Strategy Part 1 o Characteristics of Data Strategy o Data Strategy Framework o Road Blocks o Strategy and RoadMap o Success Program and Optimizing Outcomes o Module Summary

20 www.elearningcurve.com ▪ [email protected] ▪ 1.630.242.1659

OUR INSTRUCTORS

Natasha Balac Jennifer Leo Natasha Balac currently directs the Dr. Jennifer Leo is the Director of The Steadward Interdisciplinary Center for Data Science (ICData) Centre for Personal & Physical Achievement at at Calit2/Qualcomm Institute, and lectures in the the University of Alberta, in Edmonton, Alberta, area of big data and data science. She has led the Canada. With over 15 years of experience Predictive Analytics Center of Excellence and conducting research and evaluation in collaborated on numerous government and community based settings, Jennifer brings research projects in the areas of analytics and insight into what it means to conduct research visualization. beyond academia.

Deanne Larson Mark Peco Dr. Larson is an active practitioner and academic Mark Peco is an experienced consultant, focusing on business intelligence and data educator, practitioner and manager in the fields warehousing with over 20 years of experience. of Business Intelligence and Process She completed her doctorate in information Improvement. He provides vision and leadership technology leadership. She PMP and CBIP to projects operating and creating solutions at certifications. the intersection of Business and Technology.

William McKnight Asha Saxena William is president of McKnight Consulting Asha Saxena is a strategic, innovative leader Group, which includes service lines of Master with a proven track record of building successful Data Management, IT assessment, Big Data, tech businesses for the last 25 years. With a Columnar Databases, Data Warehousing, and strong academic background, creative problem- Business Intelligence. He functions as Strategist, solving skills, and an effective management Lead Enterprise Information Architect, and style. She has been instrumental in building Program Manager for sites worldwide. business models for success.

Kevin Petrie Dave Wells Kevin is Senior Director of Product Marketing at Dave Wells is an educator, and industry analyst Attunity. He has written the book Streaming dedicated to building meaningful connections Change Data Capture: A Foundation for Modern throughout the path from data to business value. Data Architectures and is a contributing writer for He fills many roles including instructor and Eckerson Group. A frequent public speaker and Education Director for eLearningCurve, accomplished writer, Kevin has nearly a decade of Research Consultant at Eckerson Group, and experience in data management and analytics. faculty member at TDWI.

Krish Krishnan Krishnan is an international authority on unstructured data, social analytics and big data, text mining, and text analytics. An innovator and solution expert, he is recognized for his work in data warehouse architectures and is an acknowledged expert in performance tuning of complex database and data warehouse platforms.

21 www.elearningcurve.com ▪ [email protected] ▪ 1.630.242.1659

•

WHAT OUR CUSTOMERS ARE SAYING…

The courses are well laid out, build on each other, and are rich in practical content and advice.

-- Steve Lutter, CIMP Data Quality, DM and Metadata, IM Foundations, Business Intelligence, Data Governance, MDM, United States

It is evident that a thorough and considerable effort has gone into the preparation of this program.

-- Alfredo Parga O’Sullivan, CIMP Ex Data Quality, Ireland

The ability to take the courses at my own pace and at a time suitable for me was of great help.

-- Geeta Jegamathi, CIMP Data Quality, India

22 www.elearningcurve.com ▪ [email protected] ▪ 1.630.242.1659

Director, Enterprise Solutions

Director, Marketing

Director, Education

Director, Technology

Customer Support

RESELLERS

DENMARK

SOUTH AFRICA & SUB-SAHARAN AFRICA

23 www.elearningcurve.com ▪ [email protected] ▪ 1.630.242.1659

DATA SCIENCE COURSE PRICING

Packages

CIMP Data Science Package $1,795.00

CIMP Ex Data Science Package $2,695.00 CIMP All Course Access License $4,995.00 CIMP All Course Access License w/ Exam Package $5,990.00

Individual Courses

Artificial Intelligence Fundamentals $485.00 Analytical Modeling, Evaluation, and Deployment Best $305.00 Practices Data Science Fundamentals $480.00 Framing & Planning Data Science Projects $305.00 Data Understanding and Preparation for Data Science $320.00 Data Mining Concepts & Techniques $295.00 Hadoop Fundamentals $515.00 Putting the Science in Data Science $305.00 Diagnostic Analytics Using Statistical Process Control $385.00 Prescriptive Analytics Using Simulation Models $415.00 Data Mining in R $345.00 Introduction to NoSQL $360.00 Streaming Data Concepts, Applications, Technologies $305.00 Data Strategy for the Age of Big Data $300.00

Exams

Exam for each course…...... $80.00

Enterprise Discounts We offer discounts to Enterprise customers who purchase in bulk. Please contact us for more information.

About eLearningCurve eLearningCurve offers comprehensive online education programs in various disciplines of information management. With eLearningCurve, you can take the courses you need when you need them from any place at any time. Study at your own pace, listen to the material many times, and test your knowledge through online exams to ensure maximum information comprehension and retention. eLearningCurve also offers two robust certification programs: CIMP & CDS. Certified Information Management Professional (CIMP) builds upon education to certify knowledge and understanding of information management. Certified Data Steward (CDS) is a role-based certification designed for the fast growing data stewardship profession.

Finally, eLearningCurve’s Enterprise Program is a flexible, scalable, cost- effective solution for teams and enterprises.

24 www.elearningcurve.com ▪ [email protected] ▪ 1.630.242.1659