Cases of IP Address Spoofing and Web Spoofing —
Total Page:16
File Type:pdf, Size:1020Kb
3-3 StudiesonCountermeasuresforThwarting SpoofingAttacks—CasesofIPAddress SpoofingandWebSpoofing— MIYAMOTO Daisuke, HAZEYAMA Hiroaki, and KADOBAYASHI Youki This article intends to give case studies for of thwarting spoofing attack. Spoofing is widely used when attackers attempt to increase the success rate of their cybercrimes. In the context of Denial of Service (DoS) attacks, IP address spoofing is employed to camouflage the attackers’ location. In the context of social engineering, Web spoofing is used to persuade victims into giv- ing away personal information. A Web spoofing attacker creates a convincing but false copy of the legitimate enterprises’ web sites. The forged websites are also known as phishing sites. Our research group developed the algorithms, systems, and practices, all of which analyze cybercrimes that employ spoofing techniques. In order to thwart DoS attacks, we show the deployment scenario for IP traceback systems. IP traceback aims to locate attack source, regardless of the spoofed source IP addresses. Unfortunately, IP traceback requires that its sys- tems are widely deployed across the Internet. We argue the practical deployment scenario within Internets of China, Japan, and South Korea. We also develop a detection method for phishing sites. Currently, one of the most important research agenda to counter phishing is improving the accuracy for detecting phishing sites. Our approach, named HumanBoost, aims at improving the detection accuracy by utilizing Web users’ past trust decisions. Based on our subject experiments, we analyze the average detection accuracy of both HumanBoost and CANTINA. Keywords IP spoofing, Web spoofing, Internet emulation, IP traceback, Machine learning 1 Introduction measures. IP traceback aims to locate attack sources, regardless of the spoofed source IP DoS attacks exhaust the resources of addresses. Several IP traceback methods have remote hosts or networks that are otherwise been proposed [1]‒[3]; especially Source Path accessible to legitimate users. Especially, a Isolation Engine (SPIE)[3] is a feasible solu- flooding attack is the typical example of DoS tion for tracing individual attack packets. How- attacks. In the case of the flooding attack, the ever, SPIE requires that its systems are widely attackers often used the source IP address deployed across the Internet for enhancing spoofing technique. IP address spoofing can traceability. Traceability would decrease to a be defined as the intentional misrepresenta- minimum if there were only a few routers that tion of the source IP address in an IP packet in support SPIE. order to conceal the sender of the packet or to Web spoofing, also known as phishing, is impersonate another computing system. There- a form of identity theft in which the targets are fore, it is difficult to identify the actual source users rather than computer systems. A phish- of the attack packets using traditional counter- ing attacker attracts victims to a spoofed web- MIYAMOTO Daisuke et al. 99 site, a so-called phishing site, and attempts to ASes in a random manner. Castelucio et al. persuade them to provide their personal infor- mentioned that IP-TBS should be deployed mation. To deal with phishing attacks, a heu- along with intent[5]. In their proposed “strate- ristics-based detection method has begun to gic placement”, IP-TBSs should be deployed garner attention. A heuristic is an algorithm to in order of BGP neighbors. Hazeyama et al. identify phishing sites based on users’ experi- proposed to emulate the Internet topology[6] ence, and checks whether a site appears to be that resembles the current Internet topology a phishing site. Based on the detection result observed by CAIDA[7]. Hazeyama et al. also from each heuristic, the heuristic-based solu- introduced four types of deployment scenario tion calculates the likelihood of a site being a and estimated the traceability in the case of phishing site and compares the likelihood with Japanese Internet topology[8]. the defined discrimination threshold. However, Herein, we evaluated the traceability in current heuristic solutions are far from suitable China, Japan, and South Korea Internet using use due to the inaccuracy of detection. AS-level deployment. In our simulation, we In this paper, we describe our two con- created three types of emulated network topol- tributions for thwarting IP address spoofing ogies that resemble China, Japan, and South and Web spoofing, respectively. Chapter 2 Korea, respectively. We also prepare four types describes the deployment scenario for IP trace- of deployment scenario, namely, deployment in back systems. Chapter 3 figures out a detec- core ASes, leaf ASes, middle-class ASes, and tion method of phishing sites which aims at deployment in a random manner. improving the detection accuracy. Chapter 4 concludes our findings. 2.2 Simulationoftracebackdeployment Deployment of the IP-TBS should be con- 2 DeploymentscenarioforIP sidered along with the type of network topol- tracebacksystems ogy. Let us assume that a network has a star topology and that the IP-TBS is deployed in 2.1 Background the central node. In this case, all ASes and AS Several IP traceback methods have been links can be traced. proposed [1]‒[3]. Especially, Source Path Isola- In this section, we measure the trace- tion Engine (SPIE)[3] is a feasible solution for ability with deployment simulation. First, we tracing individual attack packets, however it explain the two classes of traceability used in requires large-scale deployment. our simulation. We then introduce three types Several researchers[4]‒[6] have proposed of outfitted Internet topologies, i.e., emulated autonomous system (AS)-level deployment to inter-AS topologies in China, Japan, and South facilitate global deployment of IP traceback Korea. We also introduce four types of deploy- systems (IP-TBSs). In this case, it is necessary ment scenarios. to deploy an IP-TBS into each AS instead of 2.2.1 Metrics implementing the SPIE in each router. Since We selected traceability as a performance the IP-TBS monitors the traffic between the metric. There are two principal classes of AS border routers and exchanges information traceability: packet traceability and path for tracing packets, the traceback client can traceability. Traceability in the first class is identify the source AS of the packets. that the system can specify the AS number However, the traceability can be easily where the issued IP packet is generated. The affected by the types of network topology and/ second class, path traceability, is that the sys- or the deployment scenario. Gong et al. simu- tem can designate the datalink of the AS bor- lated traceability by using three types of net- der. work topologies[4], but their deployment sce- To calculate the traceability, we refer to nario was the random placement; they selected the deployment case of previous study[8]. In 100 Journal of the National Institute of Information and Communications Technology Vol. 58 Nos. 3/4 2011 [8], the traceability had been defined in Equa- region. tion (1), where NS denotes the number of strict In our simulation, we employ the snapshot ASes, NL denotes the number of loose ASes, of CAIDA AS Relationship Database (ASRD) and N denotes the amount number of ASes in published on November 22, 2008. The data- the network topology. set can be summarized as shown in Table 1. Because there are loose ASes located outside (N + N ) S L (1) of each region, the number of deployment tar- Tpacket = N get AS and traceback target AS are different. A strict AS is an AS where an IP-TBS is Note that CAIDA extensively surveys AS rela- deployed. A loose AS is an AS where the IP- tionships, however, some types of BGP peering TBS is not deployed but the neighboring AS is styles such as private peering hinder the cre- a strict AS. Because of border tracking in typi- ation of a perfect ASRD. cal IP-TBS[9], Hazeyama et al. recognized that 2.2.3 Simulationscenarios a loose AS can be traced within the traceback We consider four types of deployment sce- architecture. narios as follows. Furthermore, the path traceability is shown in Equation (2) where LS denotes the num- S1: Deployment in order of core ASes ber of strict AS links, LL denotes the number In the ideal scenario, IP-TBSs are deployed of loose AS links, and L denotes the amount into the core ASes. Since the core ASes are number of AS links in the network topology. interconnected to many BGP neighbors, the traceback system can handle many ASes and (LS + LL ) Tlink = (2) AS links. Hence, traceability is expected to be L high even if the number of deployed traceback In a strict AS link, both peered ASes system is low. deploy the IP-TBS. In a loose AS link, on Although various criteria need to be satis- the other hand, an AS that deploy an IP-TBS fied for identifying the network core, we used is interconnected to another AS that does not the number of BGP peers as a metric. In this deploy an IP-TBS. scenario, the traceback system is deployed to 2.2.2 Networktopology ASes in the decreasing order of the number of We employ the emulated network topolo- established BGP peers. gies for several regions. Basically, every trace- back method can be used to construct attack S2: Deployment in order of leaf ASes paths. Hence, with the use of such traceback In this scenario, IP-TBSs are deployed to methods, communication privacy may be ASes in the increasing order of the number of affected. Because of diverse legal interpreta- established BGP peers. Since the IP-TBSs will tions of privacy, deployment across country trace fewer ASes and AS links in this case, border may not be easy. As the first step in traceability will be low. However, attacker deployment simulation, we selected the emu- lated topologies of China, Japan, and South Korea.