Advancing Cyber Security with a Semantic Path Merger Packet Classification Algorithm

ADVANCING CYBER SECURITY WITH A SEMANTIC PATH MERGER PACKET CLASSIFICATION ALGORITHM A Thesis Presented to The Academic Faculty by J. Lane Thames In Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy in the School of Electrical and Computer Engineering Georgia Institute of Technology December 2012 Copyright c 2012 by J. Lane Thames ADVANCING CYBER SECURITY WITH A SEMANTIC PATH MERGER PACKET CLASSIFICATION ALGORITHM Approved by: Dr. Randal Abler, Advisor Dr. Raghupathy Sivakumar School of Electrical and Computer School of Electrical and Computer Engineering Engineering Georgia Institute of Technology Georgia Institute of Technology Dr. Henry Owen Dr. Dirk Schaefer School of Electrical and Computer School of Mechanical Engineering Engineering Georgia Institute of Technology Georgia Institute of Technology Dr. George Riley Date Approved: September 27, 2012 School of Electrical and Computer Engineering Georgia Institute of Technology To Linda, my wife and best friend, forever plus infinity iii ACKNOWLEDGEMENTS First and foremost, I give special thanks to my advisor, Dr. Randal (Randy) Abler, who has been a great friend and mentor. I began working with Dr. Abler while I was an undergraduate student at Georgia Tech. He introduced me to the world of computer networking, information technology, and cyber security, and I am indefinitely grateful for the support, guidance, and knowledge he has given to me throughout the years. I would like to thank Dr. George Riley, Dr. Henry Owen, Dr. Raghupathy Sivakumar, and Dr. Dirk Schaefer for taking time from their busy schedules to serve as my thesis committee. Their feedback, support, and guidance played an invaluable role in my disser- tation research. I am truly honored to have studied under the guidance of my advisor and committee. I am grateful to all of my friends, classmates, and colleagues from Georgia Tech for all of their support and encouragement. I am thankful for each and every one of the wonderful Georgia Tech faculty and staff who have helped me during my time as a student. I would like to extend a special thanks to Ms. Marilou Mycko, Mrs. Pat Potter, and Dr. Daniela Staiculescu for all of their administrative support during my time as a graduate student. I would like to express my sincere gratitude to Dr. David Frost for being such a great friend and mentor to me. Special thanks goes to Gail Wells. Gail allowed me to work with her as an undergraduate teaching assistant. It was during my time as a teaching assistant that I discovered a burning desire to teach, and it was this desire that influenced my decision to pursue a graduate degree. Several years ago, Dr. Schaefer and my colleagues in the Systems Realization Laboratory invited me to do collaborative research with them. Our research together over the last few years has been very rewarding, and, for this, I am deeply grateful. Finally and with immeasurable gratitude, I want to thank my wife, Linda, whose love and encouragement made this achievement possible. iv TABLE OF CONTENTS DEDICATION....................................... iii ACKNOWLEDGEMENTS ................................ iv LIST OF TABLES .................................... viii LIST OF FIGURES.................................... x NOMENCLATURE .................................... xiii SUMMARY......................................... xvii I INTRODUCTION.................................. 1 II CONCEPTS..................................... 6 2.1 Terminology . .6 2.2 Firewalls . .7 2.3 Intrusion Detection . 11 2.4 Intrusion Prevention . 14 2.5 Packet Classification . 15 2.6 Summary . 18 III BACKGROUND................................... 19 3.1 Distributed security architectures and defense systems . 19 3.2 Software-Based Packet Classification Algorithms . 25 3.2.1 One Dimensional Packet Classification Algorithms . 26 3.2.2 Multidimensional Packet Classification Algorithms . 36 3.3 Hardware-based Packet Classification . 41 3.4 Cyberattack Detection Algorithms with Intelligent Systems . 46 3.5 Summary . 50 IV CYBERATTACK DEFENSE WITH THE DISTRIBUTED FIREWALL AND ACTIVE RESPONSE ARCHITECTURE..................... 51 4.1 Motivation . 51 4.2 Functional Characteristics of Distributed Network Defense Mechanisms . 54 4.2.1 The Firewall Collaboration Framework . 54 4.3 The Distributed Firewall and Active Response Architecture . 58 v 4.3.1 High-level Description . 58 4.3.2 Architectural Description . 62 4.3.3 Firewalls, Blacklisting, and the Packet Classification Problem . 68 4.4 Summary . 71 V A NOVEL THEORY OF SEMANTIC ASSOCIATION SYSTEMS . 72 5.1 A Compositional Computing Model of Semantic Association Systems . 73 5.2 Formal Model of the Semantic Association System . 75 5.3 Hypergraph Representation of the Semantic Association Network . 80 5.4 Characteristics and other Transformations of the Semantic Association System . 84 5.5 Summary . 87 VI HARDWARE-BASED MULTIDIMENSIONAL PACKET CLASSIFICATION WITH THE SEMANTIC PATH MERGER ALGORITHM ............... 90 6.1 The SPM multidimensional packet classification algorithm . 93 6.2 Simulation Methodology . 109 6.2.1 The Cacti Integrated Memory Simulator . 109 6.2.2 Simulation performance metrics and parameters . 110 6.3 Simulation Results . 111 6.4 Summary . 116 VII ADVANCED CYBERATTACK DETECTION WITH COMPUTATIONAL IN- TELLIGENCE SYSTEMS ............................. 118 7.1 Datasets and Performance Metrics for Evaluating Cyberattack Detection Systems . 119 7.1.1 Datasets . 119 7.1.2 Performance Metrics . 121 7.2 Cyberattack Detection with Hybrid Intelligent Systems . 122 7.2.1 The HBSOM Classification Algorithm . 123 7.2.2 Simulation Results . 127 7.3 Cyberattack Detection with Ensembles of Computational Intelligence Sys- tems....................................... 128 7.3.1 The NNO Classification Algorithm . 129 7.3.2 Simulation Results . 130 vi 7.4 Summary . 136 VIII CONCLUSION.................................... 138 APPENDIX A CACTI 6.5 CONFIGURATION EXAMPLE . 141 APPENDIX B CACTI 6.5 SIMULATION OUTPUT . 145 APPENDIX C CACTI 6.5 CONFIGURATION EXAMPLE WITH MCPAT . 148 APPENDIX D CACTI 6.5 SIMULATION OUTPUT WITH MCPAT . 152 APPENDIX E RAW CACTI SIMULATION DATA . 155 REFERENCES....................................... 159 VITA ............................................ 172 vii LIST OF TABLES 1 A real-world firewall filter-set. 10 2 An example of points, prefixes, and ranges used by packet filters. 10 3 An example of SNORT detection rules (filters). 13 4 An example of a simple 2-dimensional packet classifier. 17 5 A one-dimensional routing table of prefixes. 27 6 An example of prefix expansion. 30 7 Resolving prefix collisions with prefix capture. 31 8 Generic model for the data structure of a trie node. 31 9 Data represented by the trie of Figure 20. 45 10 The ideas of data, information, semantics, knowledge, and perspectives. 74 11 The meta-language of the semantic data structure. 75 12 Conceptual typing production rules of the semantic space. 77 13 The path matrix for the CPHG from Figure 30. 82 14 Theoretical space-time bounds for packet classification algorithms. 90 15 Line rates, packets per second, and classification rates. 91 16 Two dimensional packet filters. 94 17 Correspondences between the path merger processes and the semantic association network. 100 18 Parameters for a 3-dimensional packet classifier illustrating the construction of the SPM hypergrid. 102 19 A set of packet filters with parameters given by Table 18. 104 20 Partitioned path matrix corresponding to the colored paths from the semantic association network of Figure 41. 106 21 Metrics used to evaluate the performance of the SPM hardware algorithm. 111 22 Perspectives of the positive and negative classes for mixed detection binary classification. 122 23 Basis parameters of the performance metrics defined in Table 24. 122 24 Definitions and nomenclature of the performance metrics used to evaluate the proposed classification algorithms. 123 25 Classification results produced by simulations of the HBSOM and NBLN algorithms. 128 viii 26 EPO, TPO, and EDP data for w = (16; 32). 156 27 EPO, TPO, and EDP data for w = (64; 128). 157 28 Dynamic, static, and total power consumption data in units of Watts. 158 ix LIST OF FIGURES 1 The inside-outside topology-dependent firewall perspective. .8 2 A simple network system with a firewall and its filter-set. .9 3 The behavioral overlap of legitimate users and intruders. 12 4 A taxonomy of intrusion response systems [141]. 15 5 The geometric perspective of packet classification. 17 6 Primary functional components of the autonomous distributed firewall architecture. 20 7 The DIADEM firewall architecture. 21 8 The firewall network system for worm defense. 23 9 The cooperative intrusion traceback and response architecture. 24 10 Three-layer analysis within EMERALD. 25 11 A unibit trie for the prefixes from Table 5. 27 12 A multibit trie with stride of two for the prefixes from Table 5. 29 13 Pushing information fields to the leaf nodes in a trie. 32 14 Illustration of the path compression concept. 33 15 Level and path compression techniques of the LC-trie algorithm. 34 16 Creating disjoint ranges from the filter-set of Table 5. 35 17 Assigning the best matching prefixes to the basic intervals from Table 5. 36 18 A two dimensional source/destination prefix trie. 38 19 Block diagram of a content addressable memory and its main functional components. 41 20 Generic trie-to-hardware mapping techniques. 45 21 The FCF functional components. 56 22 The DFAR firewall-based blacklist enforcement model. 59 23 A trusted domain of administration. 60 24

Advancing Cyber Security with a Semantic Path Merger Packet Classification Algorithm

15-122: Principles of Imperative Computation, Fall 2010 Assignment 6: Trees and Secret Codes

Algorithms Documentation Release 1.0.0

Detecting Self-Conflicts for Business Action Rules

A Directory Index for Ext2

Affinity Hybrid Tree

Kernel Methods for Tree Structured Data

Effective Spatial Data Partitioning for Scalable Query Processing

15-122: Principles of Imperative Computation, Fall 2010 Assignment 6: Trees and Secret Codes

Huffman Trees

Addition of Ext4 Extent and Ext3 Htree DIR Read-Only Support in Netbsd

TABLEFS: Enhancing Metadata Efficiency in the Local File System

Algebras for Tree Algorithms