ISSN 2319 – 1953
International Journal of Scientific Research in Computer Science Applications and Management Studies

Data Security in Local Network Using Distributed

N. Kohila (1), R. Gowthami (2)
(1) Assistant Professor, Department of Computer Science and Computer Applications
(2) M.Phil Full-Time Research Scholar, Department of Computer Science
(1, 2) Vivekananda College of Arts and Sciences for Women, Namakkal, Tamil Nadu, India

Abstract: The firewall is one of the central technologies allowing high-level access control to organization networks. Packet matching in firewalls involves matching on many fields from the TCP and IP packet header. At least five fields (protocol number, source and destination IP addresses, and ports) are involved in deciding which rule applies to a given packet. With available bandwidth increasing rapidly, very efficient matching algorithms need to be deployed in modern firewalls: since firewalls must filter all the traffic crossing the network perimeter, they should be able to sustain a very high throughput, or risk becoming a bottleneck. Because matching a packet against ranges in several header fields is a geometric problem, algorithms from computational geometry can be applied. In this paper we consider a classical algorithm that we adapted to the firewall domain. We call the resulting algorithm "Geometric Efficient Matching" (GEM). The GEM algorithm enjoys a logarithmic matching time performance. However, the algorithm's theoretical worst-case space complexity is $O(n^4)$ for a rule-base with $n$ rules. Because of this perceived high space complexity, GEM-like algorithms were rejected as impractical by earlier works. Contrary to this conclusion, this paper shows that GEM is actually an excellent choice. Based on statistics from real firewall rule-bases, we created a Perimeter rules model that generates random, but non-uniform, rule-bases. We evaluated GEM via extensive simulation using the Perimeter rules model.

Keywords: Firewall, Protection, GEM, Virus, Bottleneck, TCP, Protocol, Intranets

I. INTRODUCTION
The term firewall originally referred to a wall intended to confine a fire or potential fire within a building. Later uses refer to similar structures, such as the metal sheet separating the engine compartment of a vehicle or aircraft from the passenger compartment.

Firewall technology emerged in the late 1980s, when the Internet was a fairly new technology in terms of its global use and connectivity. The predecessors to firewalls for network security were the routers used in the late 1980s. Several security incidents from that period are commonly cited:
- Clifford Stoll's discovery of German spies tampering with his system [3].
- Bill Cheswick's "Evening with Berferd" (1992), in which he set up a simple electronic "jail" to observe an attacker.
- In 1988, an employee at the NASA Ames Research Center in California sent a memo by email to his colleagues [4] that read, "We are currently under attack from an Internet VIRUS! It has hit Berkeley, UC San Diego, Lawrence Livermore, Stanford, and NASA Ames."
- The Morris Worm spread itself through multiple vulnerabilities in the machines of the time. Although it was not malicious in intent, the Morris Worm was the first large-scale attack on Internet security; the online community was neither expecting an attack nor prepared to deal with one.

The firewall is one of the central technologies allowing high-level access control to organization networks. Packet matching in firewalls involves matching on many fields from the TCP and IP packet header. At least five fields (protocol number, source and destination IP addresses, and ports) are involved in deciding which rule applies to a given packet. With available bandwidth increasing rapidly, very efficient matching algorithms need to be deployed in modern firewalls to ensure that the firewall does not become a bottleneck.

Modern firewalls all use "first match" semantics: the firewall rules are numbered from 1 to $n$, and the firewall applies the policy (e.g., pass or drop) associated with the first rule that matches a given packet.

Firewall packet matching is reminiscent of the well-studied router packet matching problem. However, there are several crucial differences which make the problems quite different.

A firewall is a system or group of systems (router, proxy, or gateway) that implements a set of security rules to enforce access control between two networks, protecting the "inside" network from the "outside" network. It may be a hardware device or a program running on a secure host computer. In either case, it must have at least two network interfaces, one for the network it is intended to protect, and one for the network it is exposed to. A firewall sits at the junction point or gateway between the two networks, usually a private network and a public network such as the Internet.

Distributed firewalls are host-resident security software applications that protect the enterprise network's servers and end-user machines against unwanted intrusion. They offer the advantage of filtering traffic from both the Internet and the internal network. This enables them to prevent hacking attacks that originate from either the Internet or the internal network. This is important because the most costly and destructive attacks still originate from within the organization. They are

IJSRCSAMS Volume 3, Issue 6 (November 2014) www.ijsrcsams.com

like personal firewalls except that they offer several important advantages such as central management, logging, and, in some cases, access-control granularity. These features are necessary to implement corporate security policies in larger enterprises. Policies can be defined and pushed out on an enterprise-wide basis.

The firewall and router matching problems differ in three ways. First, unlike firewalls, routers use "longest prefix match" semantics. Next, the firewall matching problem is 4- or 5-dimensional, whereas router matching is usually 1- or 2-dimensional: a router typically matches only on IP addresses, and does not look deeper, into the TCP or UDP packet headers. Finally, major firewall vendors support rules that utilize IP address ranges, in addition to subnets or CIDR blocks: this is the case for Check Point and Juniper; the main exception is Cisco, which only supports individual IP addresses or subnets. Therefore, firewalls require their own special algorithms.

A firewall is a system designed to prevent unauthorized access to or from a private network. Firewalls can be implemented in hardware, in software, or in a combination of both. Firewalls are frequently used to prevent unauthorized Internet users from accessing private networks connected to the Internet, especially intranets. All messages entering or leaving the intranet pass through the firewall, which examines each message and blocks those that do not meet the specified security criteria.

II. TYPES
There are different types of firewalls, depending on where the communication is taking place, where the communication is intercepted, and the state that is being traced.

III. NETWORK LAYER OR PACKET FILTERS
Network layer firewalls, also called packet filters, operate at a relatively low level of the TCP/IP protocol stack, not allowing packets to pass through the firewall unless they match the established rule set. The firewall administrator may define the rules, or default rules may apply. The term "packet filter" originated in the context of BSD operating systems.

Network layer firewalls generally fall into two sub-categories, stateful and stateless. Stateful firewalls maintain context about active sessions, and use that "state information" to speed packet processing. Any existing network connection can be described by several properties, including source and destination IP address, UDP or TCP ports, and the current stage of the connection's lifetime (including session initiation, handshaking, data transfer, or connection completion). If a packet does not match an existing connection, it will be evaluated according to the ruleset for new connections. If a packet matches an existing connection based on comparison with the firewall's state table, it will be allowed to pass without further processing.

Stateless firewalls require less memory, and can be faster for simple filters that require less time to filter than to look up a session. They may also be necessary for filtering stateless network protocols that have no concept of a session. However, they cannot make more complex decisions based on what stage communications between hosts have reached.

Newer firewalls can filter traffic based on many packet attributes, such as source IP address, source port, destination IP address or port, and destination service such as WWW or FTP. They can also filter based on protocols, TTL values, the netblock of the originator (the source), and many other attributes.

Commonly used packet filters include IPFilter (various), ipfw (FreeBSD/Mac OS X), NPF (NetBSD), and PF (OpenBSD, and some other BSDs).

Figure 1. Screened Host Firewall

A. Application-layer
Application-layer firewalls work on the application level of the TCP/IP stack (e.g., all browser traffic, or all telnet or ftp traffic), and may intercept all packets traveling to or from an application. They block other packets (usually dropping them without acknowledgment to the sender).

By inspecting all packets for improper content, firewalls can restrict or prevent outright the spread of networked computer worms and trojans. The additional inspection criteria can add extra latency to the forwarding of packets to their destination. Application firewalls function by determining whether a process should accept any given connection. They accomplish this by hooking into socket calls to filter the connections between the application layer and the lower layers of the OSI model. Application firewalls that hook into socket calls are also referred to as socket filters. Application firewalls work much like a packet filter, but they apply filtering rules (allow/block) on a per-process basis instead of filtering connections on a per-port basis. Generally, prompts are used to define rules for processes that have not yet received a connection. It is rare to find application firewalls that are not combined or used in conjunction with a packet filter. [15]

Also, application firewalls further filter connections by examining the process ID of data packets against a ruleset for the local process involved in the data transmission. The extent of the filtering that occurs is defined by the provided ruleset. Given the variety of software that exists, application firewalls only have more complex rulesets for the standard services,
such as sharing services. These per-process rulesets have limited efficacy in filtering every possible association that may occur with other processes. Also, these per-process rulesets cannot defend against modification of the process via exploitation, such as memory corruption exploits. Because of these limitations, application firewalls are beginning to be supplanted by a new generation of application firewalls that rely on mandatory access control (MAC), also referred to as sandboxing.

B. Proxies
A proxy server (running either on dedicated hardware or as software on a general-purpose machine) may act as a firewall by responding to input packets (connection requests, for example) in the manner of an application, while blocking other packets. A proxy server is a gateway from one network to another for a specific network application, in the sense that it functions as a proxy on behalf of the network user. [1]

Proxies make tampering with an internal system from the external network more difficult, and misuse of one internal system would not necessarily cause a security breach exploitable from outside the firewall (as long as the application proxy remains intact and properly configured). Conversely, intruders may hijack a publicly reachable system and use it as a proxy for their own purposes; the proxy then masquerades as that system to other internal machines. While use of internal address spaces enhances security, crackers may still employ methods such as IP spoofing to attempt to pass packets to a target network.

C. Network address translation
Firewalls often have network address translation (NAT) functionality, and the hosts protected behind a firewall commonly have addresses in the "private address range", as defined in RFC 1918. Firewalls often have such functionality to hide the true address of protected hosts. Originally, the NAT function was developed to address the limited number of IPv4 routable addresses that could be used or assigned to companies or individuals, as well as to reduce both the amount and therefore the cost of obtaining enough public addresses for every computer in an organization. Hiding the addresses of protected devices has become an increasingly important defense against network reconnaissance.

C. Understanding Packet-Filtering Firewalls
Packet-filtering firewalls validate packets based on protocol, source and/or destination IP addresses, source and/or destination port numbers, time range, Differentiated Services Code Point (DSCP), type of service (ToS), and various other parameters within the IP header. Packet filtering is generally accomplished using Access Control Lists (ACLs) on routers or switches and is normally very fast, especially when performed in an Application Specific Integrated Circuit (ASIC). As traffic enters or exits an interface, ACLs are used to match selected criteria and either permit or deny individual packets.

D. Advantages
The primary advantage of packet-filtering firewalls is that they are located in just about every device on the network. Routers, switches, wireless access points, Virtual Private Network (VPN) concentrators, and so on may all have the capability of being a packet-filtering firewall.

Routers, from the very smallest home office devices to the largest service-provider devices, inherently have the capability to control the flow of packets through the use of ACLs. Switches may use Routed Access Control Lists (RACLs), which provide the capability to control traffic flow on a "routed" (Layer 3) interface; Port Access Control Lists (PACLs), which are assigned to a "switched" (Layer 2) interface; and VLAN Access Control Lists (VACLs), which have the capability to control "switched" and/or "routed" packets on a VLAN.

Other networking devices may also have the power to enforce traffic flow through the use of ACLs. Consult the appropriate device documentation for details. Packet-filtering firewalls are most likely a part of your existing network. These devices may not be the most feature rich, but when you need to quickly implement a security policy to mitigate an
attack, protect against infected devices, and so on, this may be the quickest solution to deploy.

E. Caveats
The challenge with packet-filtering firewalls is that ACLs are static, and packet filtering has no visibility into the data portion of the IP packet.

Tip - Packet-filtering firewalls do not have visibility into the payload.

Because packet-filtering firewalls match only individual packets, an individual with malicious intent, also known as a "hacker," "cracker," or "script kiddie," can easily circumvent your security (at least on this device) by crafting packets, misrepresenting traffic using well-known port numbers, or tunneling traffic unnoticed within traffic allowed by the ACL rules. Developers of peer-to-peer sharing applications quickly learned that using TCP port 80 (www) would allow them unobstructed access through the firewall.

Note - The terms used to describe someone with malicious intent may not be the same in all circles.
- A cracker refers to someone who "cracks" or breaks into a network or computer, but can also describe someone who "cracks" or circumvents software protection methods, such as keys. Generally it is not a term of endearment.
- A hacker describes someone skilled in programming who has an in-depth understanding of computers and/or operating systems. This individual can use his or her knowledge for good (white-hat hacker) or evil (black-hat hacker).
- A script kiddie is someone who uses the code, methods, or programs created by a hacker for malicious intent.

Figure 2 shows an example of a packet-filtering firewall: a router using a traditional ACL, in this case access-list 100. Because the ACL is matching traffic destined for port 80, any flows destined to port 80, no matter what kind, will be allowed to pass through the router.

Figure 2. Packet-Filtering Firewall

Given the issues with packet filters and the fact that they are easy to circumvent, you may be tempted to dismiss them entirely. This would be a huge mistake! Taking a holistic approach and using multiple devices to provide defense in depth is a much better approach. An excellent use of packet filtering is on the border of your network, preventing spoofed traffic and private IP addresses (RFC 1918) from entering or exiting your network. In-depth ACL configuration is beyond the scope of this paper, but a good reference is RFC 2827.

Hardware and Software Firewalls
Firewalls can be either hardware or software, but the ideal firewall configuration will consist of both. In addition to limiting access to your computer and network, a firewall is also useful for allowing remote access to a private network through secure authentication certificates and logins.

Hardware firewalls can be purchased as a stand-alone product but are also typically found in broadband routers, and should be considered an important part of your system and network set-up. Most hardware firewalls will have a minimum of four network ports to connect other computers, but for larger networks, business networking firewall solutions are available.

Software firewalls are installed on your computer (like any software) and can be customized, allowing you some control over their function and protection features. A software firewall will protect your computer from outside attempts to control or gain access to your computer.
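The border-filtering use case and the port 80 caveat described above can be illustrated with a minimal sketch. It is an assumption-laden model rather than the configuration of any real device: the `Packet` type, the `border_filter` function, and the single "permit TCP port 80" rule are hypothetical names chosen for illustration. The filter looks only at header fields, so it drops inbound packets that claim an RFC 1918 source address, but it cannot distinguish web traffic from any other payload sent to port 80.

```python
import ipaddress
from dataclasses import dataclass

# The three private address blocks defined in RFC 1918.
RFC1918_NETWORKS = [
    ipaddress.ip_network(block)
    for block in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


@dataclass(frozen=True)
class Packet:
    """Hypothetical packet model: header fields plus an opaque payload."""
    src: str
    dst: str
    protocol: str
    dst_port: int
    payload: bytes = b""


def is_rfc1918(address: str) -> bool:
    """Return True if the address lies in one of the RFC 1918 blocks."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in RFC1918_NETWORKS)


def border_filter(packet: Packet) -> str:
    """Hypothetical inbound border ACL; it never inspects packet.payload."""
    # A private source address arriving from the Internet must be spoofed.
    if is_rfc1918(packet.src):
        return "deny"
    # Header-only rule in the spirit of the port 80 ACL described above.
    if packet.protocol == "tcp" and packet.dst_port == 80:
        return "permit"
    # Assumption: anything not explicitly permitted is denied.
    return "deny"
```

Because the decision depends only on the header, a peer-to-peer flow sent to port 80 receives the same "permit" verdict as an HTTP request, which is exactly the weakness the caveats describe.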

IV. EXISTING SYSTEM
Existing algorithms implement the "longest prefix match" semantics, using several different approaches. The IPL algorithm, which is based on earlier results, divides the search space into elementary intervals by different prefixes for each dimension, and finds the best (longest) match for each such interval.

Firewall statefulness is commonly implemented by two separate search mechanisms: (i) a slow algorithm that implements the "first match" semantics and compares a packet to all the rules, and (ii) a fast state lookup mechanism that checks whether a packet belongs to an existing open flow. In many firewalls, the slow algorithm is a naive linear search of the rule-base, while the state lookup mechanism uses a hash-table or a search-tree.
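The two-mechanism design described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of any particular firewall: the `Rule`, `first_match`, and `StatefulFilter` names are hypothetical, addresses are plain integers, rules match on three fields only, unmatched packets are dropped by default, and the state table records only the forward direction of a flow.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

Range = Tuple[int, int]                    # inclusive (low, high)
FlowKey = Tuple[int, int, int, int, str]   # src, dst, sport, dport, protocol


@dataclass(frozen=True)
class Rule:
    number: int
    src: Range
    dst: Range
    dport: Range
    action: str                            # "pass" or "drop"


def first_match(rules: List[Rule], src: int, dst: int, dport: int) -> Optional[Rule]:
    """Slow path: naive linear scan in rule order; the first matching rule wins."""
    for rule in rules:
        if (rule.src[0] <= src <= rule.src[1]
                and rule.dst[0] <= dst <= rule.dst[1]
                and rule.dport[0] <= dport <= rule.dport[1]):
            return rule
    return None


@dataclass
class StatefulFilter:
    rules: List[Rule]
    state: Dict[FlowKey, str] = field(default_factory=dict)  # open flows (hash table)
    slow_lookups: int = 0                  # how often the linear scan ran

    def __post_init__(self) -> None:
        # First-match semantics depend on scanning rules in numeric order.
        self.rules = sorted(self.rules, key=lambda rule: rule.number)

    def handle(self, src: int, dst: int, sport: int, dport: int,
               protocol: str = "tcp") -> str:
        key = (src, dst, sport, dport, protocol)
        if key in self.state:              # fast path: packet of an open flow
            return self.state[key]
        self.slow_lookups += 1
        rule = first_match(self.rules, src, dst, dport)
        action = rule.action if rule else "drop"   # assumption: default deny
        if action == "pass":
            self.state[key] = action       # remember the newly opened flow
        return action
```

Only the first packet of a flow pays for the linear scan; later packets of the same flow are answered from the hash table. The slow path is the part a faster matching algorithm would replace.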

4.1 Disadvantages of the Existing System
- Packets are not secured while they are being sent.
- No firewall is used beforehand.
- Matching is highly time-consuming.

V. PROPOSED SYSTEM
In the field of computational geometry, an algorithm has been proposed which solves the point location problem for $n$ non-overlapping $d$-dimensional hyper-rectangles, with a linear space requirement and $O((\log n)^{d-1})$ search time. In our case, we have overlapping $d$-dimensional hyper-rectangles, since firewall rules can, and often do, overlap each other: making rules overlap is the method firewall administrators use to implement intersection and difference operations on sets of IP addresses or port numbers. These overlapping hyper-rectangles can be decomposed into non-overlapping hyper-rectangles. However, a moment's reflection shows that the number of resulting non-overlapping hyper-rectangles is $\Theta(n^d)$, thus the worst-case complexity for firewall rules is no better than that of GEM.

Advantages of the Proposed System:
- A packet filter firewall supports high speed, especially in simple network configurations, because it connects external hosts and internal users directly.
- Packet filters make decisions on the basis of each individual packet, not on the basis of the traffic context.
- A packet filter is used to implement and enforce a security policy for communication between networks.

The Algorithm

5.1 Geometric Efficient Matching Algorithm
The firewall packet matching problem finds the first rule that matches a given packet on one or more fields from its header. Every rule consists of a set of ranges $[l_i, r_i]$ for $i = 1, \ldots, d$, where each range corresponds to the $i$-th field in a packet header. The field values satisfy $0 \le l_i, r_i \le U_i$, where $U_i = 2^{32} - 1$ for IP addresses, $U_i = 65535$ for port numbers, and $U_i = 255$ for ICMP message type or code. Table 1 lists the header fields we use (the port fields can double as the message type and code for ICMP packets). For notational convenience later on, we assign each of these fields a number.

5.2 The Sub-Division of Space
In one dimension, each rule defines one range, which divides space into at most 3 parts. It is easy to see that $n$ possibly overlapping rules define a subdivision of one-dimensional space into at most $(2n - 1)$ simple ranges. To each simple range we can assign the number of the winner rule. This is the first rule which covers the simple range.

In $d$ dimensions, we pick one of the axes and project all the rules onto that axis, which gives us a reduction to the previous one-dimensional case, with a subdivision of the one dimension into at most $(2n - 1)$ simple ranges. The difference is that each simple range corresponds to a set of rules in $(d - 1)$ dimensions, called active rules. We continue to subdivide the $(d - 1)$-dimensional space recursively. We call each projection onto a new axis a level of the algorithm; thus, for a 4-dimensional space, the algorithm has 4 levels of subdivisions. The last level is exactly a one-dimensional case: among all the active rules, only the winner rule matters.

At this point we have a subdivision of $d$-dimensional space into simple hyper-rectangles, each corresponding to a single winning rule. In Section 5.4 we shall see how to efficiently create this subdivision of $d$-dimensional space, and how it translates into an efficient search structure.

5.3 Dealing with the Protocol Field
Before delving into the details of the search data structure,
we first consider the protocol header field. The protocol field is different from the other four fields: very few of the 256 possible values are in use, and it makes little sense to define a numerical "range" of protocol values. This intuition is validated by the data gathered from real firewalls: the only values we saw in the protocol field in actual firewall rules were those of specific protocols, plus the '*' wildcard, but never a non-trivial range.

Thus, the GEM algorithm only deals with single values in the protocol field, with special treatment for rules with '*' as a protocol. We preprocess the firewall rules into categories, by protocol, and build a separate search data structure for each protocol (including a data structure for the '*' protocol). The actual geometric search algorithm only deals with 4 fields.

Now, a packet can only belong to one protocol, but it is also affected by '*' protocol rules. Thus every packet needs to be searched twice: once in its own protocol's data structure, and once in the '*' structure. Each search yields a candidate winner rule. We take the action determined by the candidate with the lower number.

5.4 GEM Search Data Structure
The search data structure for the GEM algorithm has three parts. The first part is an array of pointers, one for each protocol number, plus a cell for the '*' protocol. The second and third parts are built separately for each protocol.

The second part is the protocol database header, which holds information about the order of the data structure levels. The order in which the packet header fields are checked is encoded as a 4-tuple of field numbers, using the numbering shown in Table 1. The protocol database header also contains a pointer to the first level and the number of simple ranges in that level.

The third part represents the levels of the data structure. Each level is a set of nodes, and each node is an array. Each array cell defines a simple range and holds a pointer to the next node on the next level. In the last level, the simple range information consists of the number of the winner rule.

5.5 Search Algorithm
The packet header fields used for matching are the source IP address, destination IP address, protocol number, and port number. The protocol number field is checked first, in order to select the protocol database header in the search data structure. Binary search is then applied at each level to find the matching simple range, level by level. The final level gives the desired result, which is the number of the matching rule. This search procedure is repeated for the '*' protocol to find another candidate winner rule. From these two candidates we select the one with the lower rule number.

Each binary search is applied to an array of at most $2x$ entries, where $x$ is the maximum number of active rules. Two searches are carried out: one for the packet's protocol and one in the '*' data structure. The search time required for $d$ levels is $O(d \log x)$. The '*' search data structure has only 2 levels, for the IP addresses, so the search time is generally dominated by the search time for the levels of the TCP search data structure.

VI. FIREWALL RULE-BASE STATISTICS
To get a better understanding of what real-life firewall rule-bases look like, we gathered statistics from firewall rule-bases that were analyzed by the Lumeta (now AlgoSec) Firewall Analyzer. The statistics are based on 19 rule-bases from enterprise firewalls (Cisco PIX and Check Point FireWall-1) collected during 2001 and 2002. The rule-bases came from a variety of corporations from the financial, telecommunications, automotive, and pharmaceutical industries. We analyzed a total of 8434 rules.

The statistics show the distribution of protocols in the rules we analyzed. The data shows that 75% of rules from typical firewall rule-bases match TCP, and a total of 93% match TCP, UDP or ICMP. Of these the most important is clearly TCP. Therefore, we concentrate on these protocols in the rest of the paper. In our problem context, these protocols are the most difficult for evaluation since they imply a 4-dimensional space.

The statistics also show the distribution of TCP source and destination port numbers. We can clearly see that the source port number is rarely specified: 98% of the rules have a wildcard '*' in the source port. This makes sense because both PIX and FireWall-1 are stateful firewalls that do not need to perform source-port filtering to allow return traffic through the firewall, and because source port data is generally unreliable, as it is usually under the control of the attacker.

On the other hand, the TCP destination port is usually specified precisely. The vast majority of rules specified a single port number, but 4% allowed a range of ports, and the ranges tended to be quite large. Common ranges are "all high ports" (1024-65535) and "X11 ports" (6000-6003). The single port numbers we encountered were distributed among some 200 numbers, the most popular of which are shown in Table 2: these correspond to the HTTP, FTP, Telnet, HTTPS, HTTP-Proxy, and NetBIOS services.

VII. THE SIMULATION STUDY

7.1 The Random Rules Simulation
As the first step of our performance evaluation of GEM, we implemented and tested it in isolation. The GEM build and search algorithms were implemented in C using Microsoft VC++ 6.0. The simulations were performed on a 733 MHz Pentium III PC with 256 MB of RAM running Windows XP.

We started by testing GEM using uniformly generated rules: for every rule, each endpoint of each of the 4 fields (IP address ranges and port ranges) was selected uniformly at random from its domain. We built the GEM data structure for increasing numbers of such rules and then used the resulting
structure to match randomly generated packets.

On one hand, these early simulations showed us that the search itself was indeed very fast: a single packet match took around 1 µsec, since it only required 4 executions of a binary search in memory.

On the other hand, we learned that the data structure size grew rapidly, and that the order of fields had little or no effect on this size. The problem was that, since the ranges in the rules were chosen uniformly, almost every pair of ranges (in every dimension) had a non-empty intersection. All these intersections produced a very fragmented space subdivision, and effectively exhibited the worst-case behavior in the data structure size. We concluded that a more realistic rule model was needed.

7.2 The Perimeter Rules Model

Real firewall rule-bases have a large degree of structure. Thus, we hypothesized that realistic rule-bases rarely cause worst-case behavior for the GEM algorithm. Furthermore, we wanted to test the effects of the field order on the performance of GEM on such rule-bases. For this purpose, we built the Perimeter firewall rules model, and simulated the behavior of GEM on rule-bases generated in this model.

7.2.1 The Modeled Topology

The model assumes a perimeter firewall with two "sides": a protected network on the inside, and the Internet on the outside. The inside network consists of 10 class B networks, and the Internet consists of all other IP addresses. Thus, the internal network contains 10 · 65536 possible IP addresses. In reality, organizations that actually own 10 class B networks are quite rare. However, we used this assumption for two reasons:

1) Many organizations use private (RFC 1918) IP addresses internally, and export them via network address translation (NAT) on outbound traffic. Such organizations often use large subnets liberally, e.g., assign a 172.x.*.* class B subnet to each department.
2) Having a large internal subnet stresses the GEM algorithm, since we pick random ranges from the internal ranges.

7.2.2 The Rules

The Perimeter rules model produces rules of two types: Inbound rules, which allow traffic from the Internet into the protected network, and Outbound rules, which allow traffic from the protected network out to the Internet. Each rule in the rule-base is constructed randomly according to the distribution for its type (Inbound or Outbound).

Inbound rules. When we are modeling rules for inbound traffic, the source IP addresses are rarely specified in the rules, and 95% of the rules have '∗' as their source address. The remaining 5% have a range in their source address field, chosen uniformly at random from the Internet's IP addresses. The destination addresses for inbound rules are always internal, belonging to the 10 internal class B subnets. 45% of the rules have a randomly chosen individual internal IP address as a destination, modeling server machines. Another 15% have a small random range: a range which lies completely inside one of the internal class C networks. These ranges model clusters of servers and small classless subnets such as '/27's and '/28's. Then, 30% of the rules have a complete class C as a destination (i.e., a range of the form a.b.c.0 − a.b.c.255). Finally, 10% allow access to a full class B.

Outbound rules. When we are modeling the outbound rules, 90% of the rules have a destination IP address of '∗'. 10% of the rules have either a specific address or a range in the destination field, modeling a rule that restricts or allows access to some particular server or network. The source addresses for outbound rules are selected from the internal addresses with the frequencies shown in Table 1.

Services. The service field in the rules is selected similarly for both Inbound and Outbound rules. The service is selected uniformly at random from a collection of 100 services, whose definitions were taken from real firewall rule-bases (recall Table 2). Most of these services have individual destination port numbers; however, a few include port ranges, and one service is the '∗' service. We allow a small rate of growth in the number of services by adding 2% of randomly generated services, where the destination port is randomly picked from 0 to 65535.

Table: The statistical distribution for rules in the Perimeter model. An '∗' in the source IP address for Outbound rules represents all IP addresses inside the internal network.

... insecure to allow many TCP services into large parts of the internal networks [44]. Both considerations would cause more repetitions in IP addresses, and hence reduce the number of simple ranges, which would lead to smaller search data structures. Therefore, we believe our Perimeter model stresses the GEM algorithm more than real firewall rule-bases would.
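The inbound part of the rule distribution in Section 7.2.2 can be illustrated with a short sketch. This is not the rule generator used in the experiments; it only draws the source and destination fields with the stated percentages. Several details are assumptions made for illustration: the internal class B networks are placed at 10.0.0.0/16 through 10.9.0.0/16, the '∗' wildcard is represented as 0.0.0.0/0, small ranges are restricted to /27 or /28 subnets, the 5% of specified sources are drawn as external /24 networks rather than arbitrary ranges, and all function names are hypothetical. Outbound rules and services are omitted because the frequencies of Table 1 and the service definitions are not reproduced here.

```python
import ipaddress
import random

# Assumption: the text only says "10 class B networks"; these ten /16
# blocks are placeholders chosen for this sketch.
INTERNAL_CLASS_B = [ipaddress.ip_network(f"10.{i}.0.0/16") for i in range(10)]
ANY = ipaddress.ip_network("0.0.0.0/0")  # stands for the '*' wildcard


def random_internal_class_c(rng):
    """Pick a random /24 inside a randomly chosen internal class B."""
    class_b = rng.choice(INTERNAL_CLASS_B)
    base = int(class_b.network_address) + (rng.randrange(256) << 8)
    return ipaddress.ip_network((base, 24))


def inbound_destination(rng):
    """Draw a destination with the 45% / 15% / 30% / 10% split."""
    class_c = random_internal_class_c(rng)
    roll = rng.random()
    if roll < 0.45:
        # Individual internal address, modeling a server machine.
        host = int(class_c.network_address) + rng.randrange(256)
        return ipaddress.ip_network((host, 32))
    if roll < 0.60:
        # Small range inside one class C (simplified to a /27 or /28).
        prefix = rng.choice([27, 28])
        return rng.choice(list(class_c.subnets(new_prefix=prefix)))
    if roll < 0.90:
        # Complete class C: a.b.c.0 - a.b.c.255.
        return class_c
    # Full class B containing the chosen class C.
    return class_c.supernet(new_prefix=16)


def inbound_source(rng):
    """95% wildcard; otherwise an external /24 (a simplification)."""
    if rng.random() < 0.95:
        return ANY
    while True:
        candidate = ipaddress.ip_network((rng.randrange(2**24) << 8, 24))
        if not any(candidate.overlaps(b) for b in INTERNAL_CLASS_B):
            return candidate


def inbound_rule(rng):
    """Return a (source, destination) pair for one inbound rule."""
    return inbound_source(rng), inbound_destination(rng)


if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(5):
        src, dst = inbound_rule(rng)
        print(f"{src} -> {dst}")
```

Expressing every destination as a subnet rather than an arbitrary address range keeps the sketch compatible with rule languages that only accept subnets, which is the restriction Section 7.2.4 describes for iptables.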


7.2.3 The Inbound-Outbound Ratio

An additional parameter of our Perimeter model is the ratio between the number of Inbound and Outbound rules. In order to determine the effect of this parameter, we ran the GEM building algorithm on rule-bases with different ratios of Inbound and Outbound rules. The difference between homogeneous and mixed rule-bases can be up to a factor of 6 in size. In all subsequent tests we used an inbound-outbound ratio of 50%, again to stress the GEM algorithm.

7.2.4 The GEM-IPTABLES Implementation

To evaluate GEM in a more realistic environment, we implemented the GEM algorithm and integrated it with the code of the Linux iptables firewall. We used Red Hat Linux 9 (kernel version 2.4.18-8) and iptables v1.2.8. We incorporated the GEM build algorithm into the user-space program iptables, and the GEM search algorithm into the iptables kernel module. The built GEM database was transferred from user space to the kernel using the mechanism already employed by iptables. We left the existing iptables linear search algorithm intact. The selection of linear or GEM search was controlled by a command line switch. Since we wanted to be able to compare GEM's performance to the regular iptables, we adopted the iptables configuration language as our input. However, iptables does not support general ranges of IP addresses in the rules, and only accepts subnets. Therefore, we modified our rule generation module to only produce subnets, e.g., instead of generating a random IP range, we generate a random IP address and a random netmask that leaves the resulting subnet inside one class C network (recall Section 4.2). The modified rule generator outputs an iptables configuration script.

VIII. CONCLUSION

We have seen that the GEM algorithm is an efficient and practical algorithm for firewall packet matching. We implemented it successfully in the Linux kernel, and tested its packet-matching speeds on live traffic with realistic large rule-bases. GEM's matching speed is far better than the naive linear search, and it is able to increase the throughput of iptables by an order of magnitude. On rule-bases generated according to realistic statistics, GEM's space complexity is well within the capabilities of modern hardware. Thus we believe that GEM may be a good candidate for use in firewall matching engines.

We note that there are other algorithms that may well be candidates for software implementation in the kernel; specifically, we can point out the algorithms of Gupta and McKeown, Qiu et al., and Woo. We believe it should be quite interesting to implement all of these algorithms and to test them on equal footing, using the same hardware, rule-bases, and traffic load. Furthermore, it would be interesting to do this comparison with real rule-bases, in addition to synthetic Perimeter-model rules. We leave such a "bake-off" for future work. As for GEM itself, we would like to explore the algorithm's behavior when using more than 4 fields, e.g., matching on the TCP flags, meta data, interfaces, etc. The main questions are: How best to encode the non-range fields? Will the space complexity still stay close to linear? What will be the best order of fields to achieve the best space complexity? Another direction to pursue is how GEM would perform with IPv6, in which IP addresses have 128 bits.

REFERENCES

[1] F. Baboescu, S. Singh, and G. Varghese, "Packet classification for core routers: Is there an alternative to CAMs?" in Proc. IEEE INFOCOM, 2003.
[2] F. Baboescu and G. Varghese, "Scalable packet classification," in Proc. ACM SIGCOMM, 2001, pp. 199–210.
[3] N. Bar-Yosef and A. Wool, "Remote algorithmic complexity attacks against randomized hash tables," in Proc. International Conference on Security and Cryptography (SECRYPT), Barcelona, Spain, Jul. 2007, pp. 117–124.
[4] M. M. Buddhikot, S. Suri, and M. Waldvogel, "Space decomposition techniques for fast Layer-4 switching," in Protocols for High Speed Networks IV, Aug. 1999, pp. 25–41.
[5] W. R. Cheswick, S. M. Bellovin, and A. Rubin, Firewalls and Internet Security: Repelling the Wily Hacker, 2nd ed. Addison-Wesley, 2003.
[6] M. Christiansen and E. Fleury, "Using interval decision diagrams for packet filtering," 2002, http://www.cs.auc.dk/~fleury/publications.html.
[7] E. Cohen and C. Lund, "Packet classification in large ISPs: Design and evaluation of decision tree classifiers," in Proc. ACM SIGMETRICS. New York, NY, USA: ACM Press, 2005, pp. 73–84.
[8] S. Crosby and D. Wallach, "Denial of service via algorithmic complexity attacks," in Proceedings of the 12th USENIX Security Symposium, August 2003, pp. 29–44.
[9] M. de Berg, M. van Kreveld, and M. Overmars, Computational Geometry: Algorithms and Applications, 2nd ed. Springer-Verlag, 2000.
[10] D. P. Dobkin and R. J. Lipton, "Multidimensional searching problems," SIAM J. Comput., vol. 5, no. 2, pp. 181–186, 1976.
[11] E. S. Al-Shaer and H. H. Hamed, "Modeling and management of firewall policies," IEEE Transactions on Network and Service Management, vol. 1, no. 1, 2004.
[12] Y. Bartal, A. Mayer, K. Nissim, and A. Wool, "Firmato: A novel firewall management toolkit," ACM Trans. Comput. Syst., vol. 22, no. 4, pp. 380–420, 2004.
[13] P. Eronen and J. Zitting, "An expert system for analyzing firewall rules," in Proceedings of the 6th Annual Workshop on Secure IT Systems, 2001.
[14] A. Mayer, A. Wool, and E. Ziskind, "Offline firewall analysis," International Journal of Information Security, 2005.
[15] W. J. Noonan and I. Dubrawsky, Firewall Fundamentals. Indianapolis, IN, USA: Cisco Press, 2006.
