Below Is a Simple Dataset About Whether to Go Skiing Or Not

Assignment 7

Overview

Below is a small dataset concerning whether to go skiing or not. The decision to go skiing depends on the attributes snow, weather, season, and physical condition, as shown in the table below.

snow      weather   season   physical condition   go skiing
sticky    foggy     low      rested               no
fresh     sunny     low      injured              no
fresh     sunny     low      rested               yes
fresh     sunny     high     rested               yes
fresh     sunny     mid      rested               yes
frosted   windy     high     tired                no
sticky    sunny     low      rested               yes
frosted   foggy     mid      rested               no
fresh     windy     low      rested               yes
fresh     windy     low      rested               yes
fresh     foggy     low      rested               yes
fresh     foggy     low      rested               yes
sticky    sunny     mid      rested               yes
frosted   foggy     low      injured              no

Questions

1. Apply Naive Bayes as the probabilistic mining algorithm on the dataset above and create a table with counts and probabilities. The following calculations, based on the smaller dataset below, are provided as an example:

snow      weather   go skiing
fresh     foggy     no
sticky    windy     no
sticky    sunny     yes
fresh     windy     yes
fresh     foggy     yes
frosted   sunny     no

This small data table leads to the following tables with counts and probabilities.

Counts:

snow      yes   no      weather   yes   no      go skiing   yes   no
fresh     2     1       sunny     1     1                   3     3
sticky    1     1       foggy     1     1
frosted   0     1       windy     1     1

Probabilities:

snow      yes   no      weather   yes   no      go skiing   yes   no
fresh     2/3   1/3     sunny     1/3   1/3                 3/6   3/6
sticky    1/3   1/3     foggy     1/3   1/3
frosted   0/3   1/3     windy     1/3   1/3
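The counts and probabilities for the small example can be reproduced with a short script. This is a minimal sketch, not part of the assignment: the names `rows`, `conditional`, and `prior` are illustrative assumptions, and exact fractions are used so the results can be compared directly with the values in the text.

```python
from collections import Counter
from fractions import Fraction

# Small example dataset from the text: (snow, weather, go skiing)
rows = [
    ("fresh", "foggy", "no"),
    ("sticky", "windy", "no"),
    ("sticky", "sunny", "yes"),
    ("fresh", "windy", "yes"),
    ("fresh", "foggy", "yes"),
    ("frosted", "sunny", "no"),
]

# Count how often each class label occurs, and how often each
# attribute value occurs together with each class label.
class_counts = Counter(label for _, _, label in rows)
snow_counts = Counter((snow, label) for snow, _, label in rows)
weather_counts = Counter((weather, label) for _, weather, label in rows)


def conditional(counts, value, label):
    """P(attribute = value | go skiing = label) as an exact fraction."""
    return Fraction(counts[(value, label)], class_counts[label])


def prior(label):
    """P(go skiing = label) as an exact fraction."""
    return Fraction(class_counts[label], len(rows))


print(conditional(snow_counts, "fresh", "yes"))     # 2/3
print(conditional(weather_counts, "sunny", "no"))   # 1/3
print(prior("yes"))                                 # 1/2, i.e. 3/6
```

Note that `Fraction` reduces automatically, so 3/6 is shown as 1/2 and 0/3 as 0.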

If we want to classify the new instance "snow=fresh and weather=sunny", we calculate the likelihoods of "go skiing=yes" and "go skiing=no" in the following way:

likelihood of yes = 2/3 * 1/3 * 3/6 = 6/54 = 1/9
likelihood of no = 1/3 * 1/3 * 3/6 = 3/54 = 1/18

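(We assume that all attributes are equally important and independent of one another given the class. That is why Naive Bayes is called naive.)

Normalizing the two likelihoods so that they sum to 1 gives the probabilities:

probability of yes = (1/9)/((1/9)+(1/18)) = 2/3
probability of no = (1/18)/((1/18)+(1/9)) = 1/3

Therefore, based on the information given in this small example, the probability of not going skiing is about 33%, and the instance is classified as "go skiing=yes".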
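The calculation above can be checked with a short script. This is a minimal sketch under the same assumptions as the worked example (no smoothing, all attributes categorical); the function name `naive_bayes_posterior` and the tuple layout of `ROWS` are illustrative choices, not part of the assignment.

```python
from collections import Counter
from fractions import Fraction

# Small example dataset from the text: (snow, weather, go skiing)
ROWS = [
    ("fresh", "foggy", "no"),
    ("sticky", "windy", "no"),
    ("sticky", "sunny", "yes"),
    ("fresh", "windy", "yes"),
    ("fresh", "foggy", "yes"),
    ("frosted", "sunny", "no"),
]


def naive_bayes_posterior(rows, instance):
    """Return (likelihoods, probabilities) per class label for one instance.

    `instance` lists the attribute values in the same order as the
    attribute columns of `rows`; the last column of each row is the label.
    """
    class_counts = Counter(row[-1] for row in rows)
    likelihoods = {}
    for label, n_label in class_counts.items():
        # Start with the class prior, e.g. 3/6 for "yes".
        score = Fraction(n_label, len(rows))
        # Multiply by P(attribute value | label) for each attribute.
        for i, value in enumerate(instance):
            matches = sum(1 for row in rows if row[i] == value and row[-1] == label)
            score *= Fraction(matches, n_label)
        likelihoods[label] = score

    total = sum(likelihoods.values())
    if total == 0:
        # Every class has a zero count for some attribute value.
        raise ValueError("all likelihoods are zero; cannot normalize")
    probabilities = {label: score / total for label, score in likelihoods.items()}
    return likelihoods, probabilities


likelihoods, probabilities = naive_bayes_posterior(ROWS, ("fresh", "sunny"))
print(likelihoods["yes"], likelihoods["no"])      # 1/9 1/18
print(probabilities["yes"], probabilities["no"])  # 2/3 1/3
```

Because the sketch uses raw counts, an attribute value that never occurs with a class (such as snow=frosted with "yes") drives that class's likelihood to zero, exactly as in the hand calculation.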

2. Explain in your own words the terms Naive Bayes Classifier and Bayesian Belief Network.

3. Draw the Bayesian Belief Network that represents the conditional independence assumptions of the Naive Bayes Classifier for the skiing problem. Hint: The Naive Bayes conditional independence assumption is P(a1, a2, …, an|vi) = P(a1|vi) * P(a2|vi) * … * P(an|vi), where vi is a class label and a1, a2, …, an are dataset attributes.
