Identifying Vulnerabilities Using Internet-Wide Scanning Data
Jamie O’Hare, Rich Macfarlane, Owen Lo
School of Computing
Edinburgh Napier University
Edinburgh, United Kingdom
40168785, r.macfarlane, [email protected]

Abstract—Internet-wide scanning projects such as Shodan and Censys scan the Internet and collect active reconnaissance results for online devices. Access to this information is provided through associated websites. The Internet-wide scanning data can be used to identify devices and services which are exposed on the Internet. It is possible to identify services as being susceptible to known vulnerabilities by analysing the data. Analysing this information is classed as passive reconnaissance, as the target devices are not being directly communicated with. This paper goes on to define this as contactless active reconnaissance. The vulnerability identification functionality in these Internet-wide scanning tools is currently limited to a small number of high-profile vulnerabilities. This work looks towards extending these features through the creation of a tool, Scout, which combines data from Censys and the National Vulnerability Database to passively identify potential vulnerabilities. This is possible by analysing Common Platform Enumerations and associated Common Vulnerabilities and Exposures. Through this novel approach, active vulnerability scanning results can be gained, while mitigating the associated issues of active scanning, such as possible disruption to the target network and devices. In initial experiments performed on 2571 services across 7 local academic institutions, 12967 potential known vulnerabilities were identified. More focused experiments to evaluate the results and compare accuracy with industry-standard vulnerability assessment tools were carried out, and Scout was found to successfully identify vulnerabilities with an effectiveness score of up to 74 percent when compared to OpenVAS.

Index Terms—Scout, Censys, Vulnerability Assessment, Internet-wide, Scanning, Passive, Contactless, National Vulnerability Database

I. INTRODUCTION

Gartner believes that by 2020, 99% of vulnerabilities exploited will be known prior to their use [1]. The risk associated with known vulnerabilities cannot be overstated, with high-profile incidents such as WannaCry and NotPetya both exploiting the EternalBlue vulnerability, which was well known and for which patches had been developed before it was used in these attacks [2].

One way in which known vulnerabilities can be actively identified is through the use of vulnerability assessment tools such as OpenVAS and Nessus [3], [4], typically as part of the active reconnaissance phase of a pentest engagement. In an attempt to identify target systems and their exploitable vulnerabilities, these tools aggressively scan networks and interrogate operating network services. This can be potentially disruptive to the target network, causing issues such as denial of service, through the considerable time and resources required to perform the scans. Due to this potential issue, as well as specific legal requirements, vulnerability assessment tools typically require permission from the target organization before being used.

The known vulnerabilities identified by these tools are each associated with a specific Common Vulnerabilities and Exposures (CVE) entry, which highlights a vulnerability for a specific service. A CVE entry contains information associated with the vulnerability, including a Common Platform Enumeration (CPE) and a Common Vulnerability Scoring System (CVSS) score. The CPE ties a CVE to a specific product and version, while the CVSS provides an impact score. These data feeds are all stored as parts of the National Vulnerability Database (NVD), a public repository managed by the National Institute of Standards and Technology. The NVD includes several separate data feeds including checklist references, software flaws, misconfigurations, product names and impact metrics. Since its inception in 1997, the NVD has published information about more than 100,000 vulnerabilities and is the standard reference for vulnerabilities.

Another way to identify known vulnerabilities is through the use of Internet-wide scanning projects such as Shodan and Censys [5], [6]. These tools collate lightweight active reconnaissance results from services operating on publicly available IP addresses, made available to users through their respective websites and APIs. This data is collected by crawlers which scan the entire IPv4 address range across many ports, gathering data on network services in a similar way to the vulnerability assessment tools used in active reconnaissance [7], [8]. These crawlers operate by sending only one or a small number of packets to each individual service. Service-specific packets are sent to check for vulnerable service configurations; however, the vulnerability identification functionality currently provided by Shodan and Censys is rather limited [9]. Only a very small number of vulnerabilities are checked for, and each has bespoke identification functionality hard-coded into the specific scanners [7], [10].

A way in which this functionality could be expanded is through cross-referencing the more general service-specific data provided by these Internet-wide scanning projects against vulnerability repositories such as the NVD. This piece of work explores and evaluates this type of vulnerability identification by building on the functionality of the Censys Internet-wide scanning project.

This paper is organized as follows. The Methodology covers the design decisions and their impact on implementation. Initial Validation presents an applicable use case. Experiments and Results describes the topology and design of the experiments conducted, along with the corresponding results. Analysis and Evaluation examines and comments on the results of the experiments. Finally, the Conclusion closes this paper, providing the outcomes.

II. METHODOLOGY

This section presents the design and subsequent implementation process towards extending the vulnerability analysis functionality within an Internet-wide scanning project. This work resulted in a tool known as Scout, named for its connotations of reconnaissance, exploration, and discovery. Previous work has defined Contactless Reconnaissance as using the output of Internet search engines such as Google [11] to limit the contact with target systems, and this work extends this with the use of the Censys Internet-wide scanning cached information and associated NVD vulnerability data. Thus the term Contactless Active Reconnaissance has been proposed for this type of reconnaissance using Internet-wide subject-specific search engines which index and make available active reconnaissance results [9], [11]. A graphical comparison between the approach outlined above and a more standard vulnerability assessment approach can be seen in Fig. 1, which highlights the difference between these approaches. This comparison was originally performed by Simon, Moucha and Keller; however, latency was not considered [11]. This latency can vary across services and across Internet-wide scanning projects. The standard active approach interacts with the target, receiving immediate feedback through their correspondence. The contactless active approach does not directly interact with the target and therefore inherits a latency which varies.

This comparison will serve as the basis for the evaluation of Scout.

Fig. 1. Approach Comparison

A. Design

Scout has three distinct operations: gathering service scanning data, associating possible vulnerabilities, and presenting results. The first stage of gathering Internet-wide scanning data can differ considerably depending on the service chosen. There are three main Internet-wide scanning projects: Shodan, Censys and ZoomEye [5], [6], [12]. The oldest, most popular and well-known Internet scanning project is Shodan. Founded in 2009 by John Matherly, it is infamous

Beijing. Despite being released in July 2013, ZoomEye does not seem to have been discussed much in the literature. Of the little research available, most appears not to have been translated into English yet. There also seems to be a lack of detailed documentation. Due to these factors, ZoomEye was not considered suitable. When deciding which scanning project to utilize, the determining factors were the value and format of the data provided. There are several factors which impact the value of data coming from an Internet-scanning project: age, bias, and comprehensiveness.

1) Age: Shodan scans perpetually across a massive number of services in the IPv4 address range, indexing 550 million services a month [8], [13]. Shodan scans are performed in a blend of horizontal and vertical scanning, as numerous services are scanned for across the Internet in a random order [14]. Matherly comments that the scan results are continually used to update the underlying database; however, research suggests otherwise. Bodenheim, Butts, Dunlap, and Mullins noted a delay in scan results appearing on the web interface of Shodan, varying from 1 day to 19 days [15]. This finding has impacted the methodology employed in related research [16]. Censys scans horizontally, with scans scheduled daily, biweekly and weekly depending on the service [14]. These scans are performed over a 24-hour period, even though the underlying ZMap technology can perform this faster [7]. This