Invariant Objects

Andrew Kusiak

Intelligent Systems Laboratory
2139 Seamans Center
The University of Iowa
Iowa City, Iowa 52242-1527
Tel: 319-335-5934 Fax: 319-335-5669
[email protected]
http://www.icaen.uiowa.edu/~ankusiak

Data Set

No.  F1  F2  F3  F4      F5  D
1    0   1   2   7.271   0   P
2    1   0   1   14.21   2   Z
3    1   0   3   21.023  1   N
4    0   1   0   12.217  2   Z
5    0   0   3   15.031  1   N
6    1   1   2   11.342  0   P

The University of Iowa Intelligent Systems Laboratory

Classification Quality

• Classification quality (CQ) of a feature set is the ratio of the number of objects in the lower approximation to the total number of objects in the data set.

No.  F1  F2  F3  F4      F5  D
1    0   1   2   7.271   0   P
2    1   0   1   14.21   2   Z
3    1   0   3   21.023  1   N
4    0   1   0   12.217  2   Z
5    0   0   3   15.031  1   N
6    1   1   2   11.342  0   P

CQ(F1) = 0, CQ(F2) = 0, CQ(F3) = 1, CQ(F4) = 1, CQ(F5) = 1
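The CQ values above can be reproduced with a short sketch. It assumes that, for a single feature, an object belongs to the lower approximation when every object sharing its feature value has the same decision D. The names `DATA`, `FEATURES`, and `classification_quality` are illustrative and do not come from the slides.

```python
from collections import defaultdict

# Data set from the slides: (F1, F2, F3, F4, F5, D) for objects 1..6.
FEATURES = ["F1", "F2", "F3", "F4", "F5"]
DATA = [
    (0, 1, 2, 7.271, 0, "P"),
    (1, 0, 1, 14.21, 2, "Z"),
    (1, 0, 3, 21.023, 1, "N"),
    (0, 1, 0, 12.217, 2, "Z"),
    (0, 0, 3, 15.031, 1, "N"),
    (1, 1, 2, 11.342, 0, "P"),
]


def classification_quality(rows, feature_indices):
    """Share of objects whose feature values determine the decision uniquely.

    Assumption: an object is in the lower approximation when all objects
    with the same values on the chosen features have the same decision.
    """
    decisions = defaultdict(set)  # feature values -> decisions seen
    counts = defaultdict(int)     # feature values -> number of objects
    for row in rows:
        key = tuple(row[i] for i in feature_indices)
        decisions[key].add(row[-1])
        counts[key] += 1
    lower = sum(n for key, n in counts.items() if len(decisions[key]) == 1)
    return lower / len(rows)


cq = {name: classification_quality(DATA, [i]) for i, name in enumerate(FEATURES)}
print(cq)  # {'F1': 0.0, 'F2': 0.0, 'F3': 1.0, 'F4': 1.0, 'F5': 1.0}
```

Passing several indices, for example `[1, 2]`, evaluates a feature set rather than a single feature.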


Decision Rules

Rule 1. (F3 = 3) => (D = N); [2, 2, 100.00%, 100.00%][2, 0, 0][{3, 5}]
Rule 2. (F3 in {1, 0}) => (D = Z); [2, 2, 100.00%, 100.00%][0, 2, 0][{2, 4}]
Rule 3. (F3 = 2) => (D = P); [2, 2, 100.00%, 100.00%][0, 0, 2][{1, 6}]

Cross-validation

     N  Z  P  None
N    2  0  0  0
Z    1  0  1  0
P    0  0  2  0

         Correct (%)  Incorrect (%)  None (%)
N        100.00       0.00           0.00
Z        0.00         100.00         0.00
P        100.00       0.00           0.00
Average  66.67        33.33          0.00
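As a check on the three decision rules, the sketch below matches each rule against the F3 column of the data set and reports which objects it covers and how often the predicted decision is right. Reading the last bracket of each rule as the set of covered objects is an assumption, and `rule_stats` is an illustrative helper, not part of any tool named in the slides.

```python
# F3 value and decision D for objects 1..6, copied from the data set.
F3 = {1: 2, 2: 1, 3: 3, 4: 0, 5: 3, 6: 2}
D = {1: "P", 2: "Z", 3: "N", 4: "Z", 5: "N", 6: "P"}

# Each rule: (admissible F3 values, predicted decision).
RULES = {
    "Rule 1": ({3}, "N"),
    "Rule 2": ({1, 0}, "Z"),
    "Rule 3": ({2}, "P"),
}


def rule_stats(values, decision):
    """Return the objects matched by a rule and its accuracy in percent."""
    matched = sorted(obj for obj, v in F3.items() if v in values)
    correct = [obj for obj in matched if D[obj] == decision]
    accuracy = 100.0 * len(correct) / len(matched) if matched else 0.0
    return matched, accuracy


stats = {name: rule_stats(*rule) for name, rule in RULES.items()}
for name, (matched, accuracy) in stats.items():
    print(name, matched, f"{accuracy:.2f}%")
```

Under these assumptions the covered objects come out as {3, 5}, {2, 4}, and {1, 6}, each with 100% accuracy, which agrees with the rule listings.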


Data Set Variations (1)

• DS1 = "As-is" data set
• DS2 = DS1 with feature F5 removed

Data Set Variations (2)

• DS3 = DS1 with the feature sequence F2_F3 and the feature F4 discretized into three intervals: int 1 = (-inf, 11.780), int 2 = [11.780, 14.621), and int 3 = [14.621, +inf)
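The DS3 transformation can be sketched as follows. The cut points are taken from the slide; the zero-based interval codes (0, 1, 2) and the helper names `discretize` and `merge` are assumptions chosen so that the output matches the values shown in the DS3 table.

```python
from bisect import bisect_right

CUTS = [11.780, 14.621]  # interval boundaries from the slide


def discretize(value, cuts=CUTS):
    """Map a continuous F4 value to an interval code 0, 1, or 2.

    bisect_right places a value equal to a cut point in the upper interval,
    which matches the half-open intervals [11.780, 14.621) and [14.621, +inf).
    """
    return bisect_right(cuts, value)


def merge(f2, f3):
    """Build the F2_F3 feature sequence value, e.g. 1 and 2 -> '1_2'."""
    return f"{f2}_{f3}"


# Columns of DS1 for objects 1..6.
F2 = [1, 0, 0, 1, 0, 1]
F3 = [2, 1, 3, 0, 3, 2]
F4 = [7.271, 14.21, 21.023, 12.217, 15.031, 11.342]

f4_codes = [discretize(v) for v in F4]
f2_f3 = [merge(a, b) for a, b in zip(F2, F3)]
print(f4_codes)  # [0, 1, 2, 1, 2, 0]
print(f2_f3)     # ['1_2', '0_1', '0_3', '1_0', '0_3', '1_2']
```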


Data Set Variations (3)

• DS4 = DS3 with features F2, F3, and F5 removed.
• DS5 = DS4 with feature F5 removed.

Data Set DS3

No.  F1  F2_F3  F4  F5  D
1    0   1_2    0   0   P
2    1   0_1    1   2   Z
3    1   0_3    2   1   N
4    0   1_0    1   2   Z
5    0   0_3    2   1   N
6    1   1_2    0   0   P

CQ(F1) = 0, CQ(F2_F3) = 1, CQ(F4) = 1, CQ(F5) = 1


Decision Rules

• Rule 1. (F2_3 = 0_3) => (D = N); [2, 2, 100.00%, 100.00%][2, 0, 0][{3, 5}]
• Rule 2. (F2_3 in {0_1, 1_0}) => (D = Z); [2, 2, 100.00%, 100.00%][0, 2, 0][{2, 4}]
• Rule 3. (F2_3 = 1_2) => (D = P); [2, 2, 100.00%, 100.00%][0, 0, 2][{1, 6}]

Cross-Validation

     N  Z  P  None
N    2  0  0  0
Z    1  0  0  1
P    0  0  2  0

         Correct (%)  Incorrect (%)  None (%)
N        100.00       0.00           0.00
Z        0.00         50.00          50.00
P        100.00       0.00           0.00
Average  66.67        16.67          16.67
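The percentage table follows arithmetically from the cross-validation counts. The sketch below assumes that rows are actual classes, that the "None" column counts objects no rule classified, and that the "Average" row is the unweighted mean of the per-class percentages. The function name `rates` is illustrative.

```python
CLASSES = ["N", "Z", "P"]

# Rows: actual class; columns: predicted N, Z, P, None (assumed: no rule fired).
CONFUSION = {
    "N": [2, 0, 0, 0],
    "Z": [1, 0, 0, 1],
    "P": [0, 0, 2, 0],
}


def rates(confusion, classes):
    """Return per-class (correct, incorrect, none) percentages and their mean."""
    table = {}
    for idx, cls in enumerate(classes):
        row = confusion[cls]
        total = sum(row)
        correct = row[idx]
        none = row[-1]
        incorrect = total - correct - none
        table[cls] = tuple(
            round(100.0 * n / total, 2) for n in (correct, incorrect, none)
        )
    average = tuple(
        round(sum(table[c][k] for c in classes) / len(classes), 2)
        for k in range(3)
    )
    return table, average


table, average = rates(CONFUSION, CLASSES)
print(table)    # {'N': (100.0, 0.0, 0.0), 'Z': (0.0, 50.0, 50.0), 'P': (100.0, 0.0, 0.0)}
print(average)  # (66.67, 16.67, 16.67)
```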


Modified Data Set

No.  F1  F2_F3  F4  F5  D
1    0   1_2    0   0   P
2    1   0_1    1   2   Z
3    1   0_3    2   1   N
4    0   1_0    1   2   Z
5    0   0_3    2   1   N
6    1   1_2    0   0   P

Rule set

• Rule 1. (F2_3 = 0_3) => (D = N); [2, 2, 100.00%, 100.00%][2, 0, 0][3, 5]
• Rule 2. (F5 = 2) => (D = Z); [2, 2, 100.00%, 100.00%][0, 2, 0][2, 4]
• Rule 3. (F2_3 = 1_2) => (D = P); [1, 1, 100.00%, 100.00%][0, 0, 1][1]


Cross-Validation Results

No.  DS1  DS2  DS3  DS4  DS5  D
1    +    +    +    +    +    N
2    -    -    ?    -    ?    Z
3    +    +    +    +    +    P
4    -    -    -    ?    -    Z
5    +    +    +    +    +    P
6    +    +    +    +    +    N

Invariant Objects

No.  F1  F2_F3  Cont F4  Int F4  F5  D
3    1   0_3    21.023   3       1   P
5    0   0_3    15.031   3       1   P


Optimization Approach

n = the number of features (used to compute distances d_ij between rules i and j)
m = the number of rules
q = the number of decision classes
k = index of a decision class, k = 1, …, q
S_k = set of rules of class k
c_i = support of rule i
d_ij = distance between rules i and j
x_i = 1 if rule i is selected, otherwise x_i = 0
y_ij = 1 if rules i and j are selected, otherwise y_ij = 0

The Model

(1)  $\max \; \frac{1}{2} \sum_{i} \sum_{j} d_{ij} y_{ij} + \sum_{i} c_i x_i$
(2)  $\sum_{i \in S_k} x_i = 1$  for k = 1, …, q
(3)  $y_{ij} \le x_i$  for i, j = 1, …, m
(4)  $x_i \in \{0, 1\}$  for i = 1, …, m
(5)  $y_{ij} \in \{0, 1\}$  for i, j = 1, …, m
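For a small rule set the model can be explored by enumeration instead of an integer programming solver. The sketch below is a brute-force illustration, not the solution method used in the slides: it picks exactly one rule per class as in constraint (2), treats y_ij as 1 exactly when both rules i and j are selected, and evaluates objective (1). The class labels, supports, and distance matrix are hypothetical toy values, since the slides do not define the distance measure.

```python
from itertools import product


def select_rules(classes, support, dist):
    """Enumerate one rule per class and maximize objective (1).

    classes[i] is the decision class of rule i, support[i] is c_i, and
    dist[i][j] is d_ij (assumed symmetric with a zero diagonal).
    """
    groups = {}
    for i, k in enumerate(classes):
        groups.setdefault(k, []).append(i)  # S_k: rules of class k

    best_choice, best_value = None, None
    for choice in product(*groups.values()):  # one rule from each S_k
        # y_ij = 1 only for pairs of selected rules; the factor 1/2
        # compensates for counting each pair in both orders.
        pair_term = 0.5 * sum(dist[i][j] for i in choice for j in choice)
        value = pair_term + sum(support[i] for i in choice)
        if best_value is None or value > best_value:
            best_choice, best_value = choice, value
    return best_choice, best_value


# Hypothetical toy instance: four rules, two decision classes.
classes = ["N", "N", "P", "P"]
support = [11, 7, 21, 10]
dist = [
    [0, 1, 2, 5],
    [1, 0, 4, 3],
    [2, 4, 0, 1],
    [5, 3, 1, 0],
]

result = select_rules(classes, support, dist)
print(result)  # ((0, 2), 34.0)
```

Enumeration grows with the product of the class sizes, so it only suits small instances; a larger rule set would call for an integer programming solver.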


Example

Rule - feature matrix

Features (columns): F23 F30 F31 F32 F33 F34 F37 F38 F41 F42 F43 F44

Rule  Conditions (in column order; feature assignment not shown)  Decision  Support
R1    {3,4,5,7} {1,3,4,7} {1,5,7}                                 N         11
R2    {0,5} {0,5,6} {0,6,7} {1,2,4,6} {0,2,4}                     N         7
R3    {0,3,5,6} {0,2,3,5} {0,4}                                   Z         9
R4    {0,1,3,5} {0,2,4,6}                                         Z         14
R5    {0,2,5} {0,6}                                               Z         6
R6    6 {1,5,7}                                                   Z         2
R7    5                                                           Z         2
R8    {1,3,4,6} {2,4,7} {1,3,4,6}                                 P         21
R9    {1,6}                                                       P         9
R10   {3,5} {2,4}                                                 P         10
R11   {2,6}                                                       P         11
R12   4 {0,2,5} {1,4,6}                                           P         1
R13   3 {1,4} {0,7}                                               P         6
R14   1                                                           P         1

Solution

Rules R1 and R10 have been selected.


References

A. Kusiak, Selection of Invariant Objects with a Data Mining Approach, IEEE Transactions on Electronics Packaging Manufacturing, Vol. 28, No. 2, 2005, pp. 187-196.

A. Kusiak, A Data Mining Approach for Generation of Control Signatures, ASME Transactions: Journal of Manufacturing Science and Engineering, Vol. 124, No. 4, 2002, pp. 923-926.
