sensors

Letter The Design of Large Scale IP Address and Port Scanning Tool

Chao Yuan 1, Jinze Du 2,*, Min Yue 1 and Tao Ma 1

1 Institute of Modern Physics, Chinese Academy of Sciences, Lanzhou 730000, China; [email protected] (.Y.); [email protected] (M.Y.); [email protected] (T.M.) 2 School of Computer and Communication, Lanzhou University of Technology, Lanzhou 730000, China * Correspondence: [email protected]; Tel.: +86-139-1925-0662

 Received: 2 July 2020; Accepted: 4 August 2020; Published: 8 August 2020 

Abstract: The control network is an important supporting environment for the control system of the heavy ion accelerator in Lanzhou (HIRFL). It is of great importance to maintain the accelerator system’s for the stable operation of the accelerator. With the rapid expansion of the network scale and the increasing complexity of accelerator system equipment, the security situation of the control network is becoming increasingly severe. Port scanning detection can effectively reduce the losses caused by viruses and Trojan horses. This article uses Go Concurrency Patterns, combined with transmission control protocol (TCP) full connection scanning and GIMP Toolkit (GTK) graphic display technology, to develop a tool called HIRFL Scanner. It can scan IP addresses in any range with any ports. This is a very fast, installation-free, cross-platform IP address and port scanning tool. Finally, a series of experiments show that the tool developed in this paper is much faster than the same type of software, and meets the expected development needs.

Keywords: port scanning; HIRFL; Golang; GTK; TCP

1. Introduction With the rapid development of computer technology, information networks have become an important guarantee for social development. The has become an indispensable tool for life. Economic, cultural, social activities, and military development are strongly dependent on the Internet. With the development of the fourth industrial revolution, network security issues have become increasingly prominent, which not only seriously hinder the development of social informatization, but also further affect the security and economic growth of the entire country. The security and reliability of the network system have become a focus of the world. Common network attacks can be divided into four types: fake message attacks, exploitable attacks, denial of service attacks, and information gathering attacks [1]. Among them, the information collection does not cause harm to the target itself, and such attacks are used to provide useful information for further intrusions. Information collection technology is a double-edged sword. On one hand, an attacker needs to collect information before an attack to carry out an effective attack. On the other hand, a network administrator can use information collection technology to discover system vulnerabilities and repair them in advance [2–4]. Network administrators usually do not hide their identities during scanning. On the contrary, attackers hide their identities. The most common information collection technology is scanning technology [5,6], which includes architecture detection and utilization of information services. There are 65,536 ports provided by TCP/IP protocol for an IP address in the computer [7]. Among them, the range of Well Known Ports is from 0 to 1023, the range of Registered Ports is from 1024 to 49,151, and the range of dynamic ports is from 49,152 to 65,535. Based on the port scanning technology,

Sensors 2020, 20, 4423; doi:10.3390/s20164423 www.mdpi.com/journal/sensors Sensors 2020, 20, 4423 2 of 12 a large number of scanners have been developed. For example, OS/MVT developed by IBM [8], the 1100 series developed by UNIVAC [9], the SATAN commercial developed by Dan Farmer and Weitse Ven [10], and the system developed by Tenable Network Security [11]. Angry IP scanner and Scanrand are very fast IP address and port scanners, which were written by Anton Keks and Dan, respectively. Scanrand reduces reliability in exchange for faster scanning speed. Other port scanning tools include unicornscan, knocker, fast port scanner, etc. At present, the most widely used open-source scanner is [12–14], which provides a variety of scanning methods that can group multiple target IP addresses for scanning, but the disadvantage is that the scan results of the can only be provided after the scan of the entire group is completed. In addition, at the Usenix International Security Symposium held on 27 March 2015, Durumeric and others from the University of Michigan in the United States proposed a scanner Zmap using stateless scanning technology. It can scan all IPv4 addresses in 45 min, which is 1300 times faster than Nmap, but its disadvantage is that it cannot find all the vulnerabilities, only one port can be scanned at a time, and it cannot cover devices using IPv6 protocol [15,16]. Masscan, which can scan the entire network in 6 min, also has a high usage rate and is currently the fastest port scanner [17,18]. Zmap and Masscan can be run under and Mac OS, but Cygwin, WinPcap, and other tools are required when using with Windows, which brings difficulties to ordinary users. Each scanner has its own advantages, but also has certain defects. Therefore, this paper combines the traditional scanning technology with the high concurrency of the Golang language to design a comprehensive cross-platform scanning system which can obtain more information about network security and provide better information support. This paper is organized as follows. Section2 describes the Common Technology of HIRFL Scanner, Section3 introduces the overall architecture of the software, and the experimental results with detailed discussion are displayed in Section4. Finally, Section5 concludes this paper with a discussion on the contribution of this paper.

2. The Common Technology of HIRFL Scanner

2.1. High Concurrency of Golang Programming Language Golang is an open-source programming language that makes it easy to build simple, reliable, and efficient software. It is a statically strongly typed, compiled language developed by Robert Griesemer, Rob Pike, and Ken Thompson of Google. It can check the most hidden program problems during compilation. Golang can be directly compiled into machine code without relying on other libraries. It uses a multi-thread model. In more detail, it is a two-level thread model. The main reason why we chose Golang to develop HIRFL Scanner is its high concurrency. Concurrency means that two or more tasks are executed within a period of time. We do not care whether these tasks are executed at a certain point in time; these tasks may or may not be executed at the same time. We only care about whether two or more tasks are solved in a short period of time (one second or two seconds). Parallel (parallelism) means that two or more tasks are executed at the same time. Concurrency is a logical concept, while parallel emphasizes the physical running state, so concurrency includes parallelism. Go implements two forms of concurrency [19,20]. The first is multi-threaded shared memory, similar to programming languages such as Java or C++. The other is the communicating sequential processes (CSP) concurrency model. This article uses the CSP model for development, which uses communication to share memory. Goroutine and channel are the key components of concurrency in the CSP model. Golang encapsulates system threads (kernel-level threads) and exposes a lightweight coroutine named goroutine (user-level threads) for users. Golang’s runtime is responsible for scheduling user-level threads to kernel-level threads. The advantage of goroutine is that the context switching is performed in the complete user mode, and there is no need to switch between the user mode and the kernel mode as frequently as threads, which saves resource consumption. Golang provides a keyword “go” Sensors 2020, 20, 4423 3 of 12

to createSensors a 2020 Golang, 20, x FOR coroutine. PEER REVIEW The Go coroutine is started when we add a keyword “go” before3 of 12 the function or method, and thus the function or method will be Run in Go coroutines. The channel is a communicationfunction or method, channel and between thus the various function concurrent or method structures will be Run in in the Go Golang coroutines. language, The channel similar is a to the communication channel between various concurrent structures in the Golang language, similar to channel in Linux. As shown in Figure1, in the communication process of two goroutines, the bu ffered the channel in Linux. As shown in Figure 1, in the communication process of two goroutines, the channel is generally used for data transmission. buffered channel is generally used for data transmission.

FigureFigure 1. 1.The The communication communication process process of of two two goroutines. goroutines.

2.2. GTK2.2. GTK The GIMPThe GIMP Toolkit Toolkit (GTK) (GTK) is is an an open-source, open-source, multi-platform-orientedmulti-platform-oriented GUI GUI toolkit toolkit whose whose source source codecode is distributed is distributed under under the the LGPL LGPL license license agreement.agreement. It It was was originally originally developed developed by Peter by Peter Mattis Mattis and Spencerand Spencer Kimball Kimball for for the the GNU GNU Image Image ManipulationManipulation Program Program (GIMP) (GIMP) to replace to replace the paid the paidMotif. Motif. At present, it is one of the mainstream development tools for GUI development and has been applied At present, it is one of the mainstream development tools for GUI development and has been applied to to more and more programs. Unlike other GUI tools such as Qt, wxWidgets, and FLTK, GTK is morecompletely and more programs.implemented Unlike in C language. other GUI tools such as Qt, wxWidgets, and FLTK, GTK is completely implementedGTK+ in can C language.be considered as the latest version of GTK. GTK contains three sets of function GTKlibraries,+ can including be considered libglib, as libgdk, the latest and version libGTK. of GTK.These GTKlibraries contains do not three use sets an ofobject-oriented function libraries, includingmechanism, libglib, so libgdk,components and libGTK.cannot be These reused, libraries and the message do not use mechanism an object-oriented is implemented mechanism, using a so componentsstandard cannot callback be mechanism, reused, and while the the message current mechanism GTK+ uses a is signal implemented mechanism. using a standard callback mechanism,GTK+ while is also the implemented current GTK in+ Cuses language; a signal howeve mechanism.r, in terms of design, object-oriented design GTK(OOD)+ is adopted also implemented flexibly. The in program C language; interface however, written inin termsGTK+ is of similar design, to object-orientedMotif, which is an design (OOD)industry-standard is adopted flexibly. GUI [21,22]. The program GTK+ contains interface many written frequently-used in GTK+ widgets,is similar such to as Motif, file selection, which is an color selection components, and so on. In addition, GTK+ provides some unique components, such industry-standard GUI [21,22]. GTK+ contains many frequently-used widgets, such as file selection, as buttons with sub-component instead of labels, and almost any widget can be placed on such colorbuttons. selection GTK+ components, allows software and so developers on. In addition, to show GTK what+ theyprovides want in some a simple unique way. components, GTK+ provides such as buttonsa good with processing sub-component tool for insteadthe internationalization of labels, and almost(i18n) and any localization widget can (i10n) be placed of the onapplication, such buttons. GTK+whichallows allows software the program developers to be edited to show without what modi theyfication, want inand a simpleonly needs way. to GTKswitch+ theprovides language a good processingdata files tool required for the by internationalization different languages. Therefore, (i18n) and it can localization be used by (i10n)people ofof different the application, languages. which allows theAs program the developer to be editedof GTK+, without the GNU modification, organization and allows only anyone needs to use switch all its the features language for free. data files requiredGTK+ by is diportablefferent and languages. has multiple Therefore, language it front can beends, used such by as people C++, Perl, of diPython,fferent TOM, languages. Ada95, Free AsPascal, the Eiffel, developer JAVA, of and GTK C#,+ etc., the In GNU this article, organization we use GTK+3.6 allows anyoneto develop to usethe display all its features interface forof free. the HIRFL Scanner. GTK+ is portable and has multiple language front ends, such as C++, Perl, Python, TOM, Ada95, Free Pascal,2.3. ICMP Ei Protocolffel, JAVA, and C#, etc. In this article, we use GTK+3.6 to develop the display interface of the HIRFL Scanner. ICMP is the abbreviation of the Internet Control Message Protocol. It is a sub-protocol of the 2.3. ICMPTCP/IP Protocol protocol suite and is used to transfer control messages between IP hosts and routers, including reporting errors, exchanging restricted control, status information, and so on. The ICMP ICMPprotocol is is the a abbreviationconnection-free of network the Internet layer Controlprotocol, Messagewhich is extremely Protocol. important It is a sub-protocol for network of the TCP/IPsecurity. protocol When suite the and IP data is used cannot to access transfer the control target or messages the IP router between cannot IP forward hosts and the data routers, packet including at reportingthe current errors, transmission exchanging rate, restricted it will control, automatically status information,send the ICMP and message. so on. TheWhen ICMP we want protocol to is a connection-freeevaluate the network network layerconnection protocol, status which, ICMP is is extremely a very useful important protocol. for network security. When the IP data cannotThe access ping program the target uses or thethe IPICMP router protocol cannot to forwarddetect whether the data the packet hosts atcan the communicate current transmission with rate, iteach will other. automatically If the ping sendcannot the reach ICMP a host, message. it indicates When that we it wantcannot to establish evaluate a connection the network with connection this host. It sends an ICMP echo request message to the destination host. The destination host must return status, ICMP is a very useful protocol. The ping program uses the ICMP protocol to detect whether the hosts can communicate with each other. If the ping cannot reach a host, it indicates that it cannot establish a connection with this host. It sends an ICMP echo request message to the destination host. The destination host must return SensorsSensors2020 2020,,20 20,, 4423 x FOR PEER REVIEW 44 ofof 1212

an ICMP echo response message to the source host. If the source host receives a response within a ancertain ICMP time, echo the response destination message host is to cons the sourceidered host.reachable. If the It source works hostas follows: receives a response within a certain time, the destination host is considered reachable. It works as follows: (1) The ping command will build a fixed format ICMP request packet, and then the ICMP protocol (1) The ping command will build a fixed format ICMP request packet, and then the ICMP protocol will hand this packet to the IP layer protocol along with the destination host’s IP address. Ping will hand this packet to the IP layer protocol along with the destination host’s IP address. Ping can can calculate the RTT (round trip time), which inserts the sending time in the data part of the packets. calculate the RTT (round trip time), which inserts the sending time in the data part of the packets. (2) The IP layer protocol takes the local IP address as the source address, appending some other (2) Thecontrol IP layerinformation, protocol and takes constructs the local an IP IP address packet. asAfter the finding source address,the MAC appendingaddress corresponding some other controlto the destination information, IP andaddress constructs in a mapping an IP packet. table, the After packet finding will the be MAC handed address over to corresponding the data link tolayer. the destinationIf the destination IP address host and in a mappingthe source table, host theare packetnot in the will same be handed network over segment, to the data this linkwill layer.turn to If the the routing destination process. host and the source host are not in the same network segment, this will (3) turn Construct to the a routing data frame process. at the data link layer, along with some control information. The destination (3) Constructaddress is athe data MAC frame address at the data passed link layer,from alongthe IP withlayer, some and control the source information. address Theis the destination physical address of is the machine. MAC address Then, passedtransfer from them the out IP according layer, and to thethe sourcemedia access address rules is the of Ethernet. physical (4) address After receiving of the machine. the data Then, frame, transfer the destinatio them outn according host first tochecks the media its destination access rules address of Ethernet. and (4) Aftercompares receiving it with the the data physical frame, address the destination of the machine. host first If checksit matches, its destination the data frame address will and be comparesreceived; otherwise, it with the the physical data frame address will ofbe thediscar machine.ded. After If itreceiving, matches, the the destination data frame host will will be received;check the otherwise,data frame, the extract data framethe IP willdata bepacket discarded. from the After frame, receiving, and give the it destination to the local host IP layer will checkprotocol. the Similarly, data frame, after extract checking the IP at datathe IP packet layer, from the useful the frame, information and give is itextracted to the local and IPhanded layer protocol.over to the Similarly, ICMP protocol. after checking After atthe the latter IP layer, proc theess, usefulthe ICMP information response is packet extracted is immediately and handed overconstructed to the ICMPand sent protocol. to the source After thehost. latter process, the ICMP response packet is immediately constructed and sent to the source host. 2.4. TCP Full Connection Port Scanning Technology and Classification 2.4. TCP Full Connection Port Scanning Technology and Classification Port scanning scans a section of the target host's port or any designated ports one by one to determinePort scanning which ports scans of the a section target host of the are target open host’s[23–27]. port Through or any the designated , portswe can one find by possible one to determinevulnerabilities which in ports the target of the host target and host fix arethem open in time [23–27. Therefore,]. Through the the scan open of port, the host we can port find can possible help us vulnerabilitiesbetter understand in the the target target host host and and fix is them the first in time. step to Therefore, doing a good the scan job ofof thestrengthening host port can security. help us betterIn understand this paper, the TCP target full hostconnection and is thetechnology first step is to adopted doing a goodto achieve job of port strengthening scanning [28,29]. security. The scanningIn this host paper, attempts TCP (using full connection TCP three-way technology handshake) is adopted to establ toish achieve a regular port connection scanning with [28, 29the]. Thedesignated scanning port host of attemptsthe destination (using TCPhost, three-wayas shown in handshake) the following to establish Figure 2. a regular connection with the designated port of the destination host, as shown in the following Figure2.

Figure 2. TCP three-way handshake port scanning. Figure 2. TCP three-way handshake port scanning. (1). When establishing a connection, the client sends a syn packet (syn = j) to the server and enters (1). theWhen SYN_SEND establishing state, a connection, waiting for the serverclient sends to confirm. a syn Whenpacket the (syn server = j) to receives the server the synand packet, enters itthe must SYN_SEND confirm state, the client’s waiting ACK for the (ack server= j + to1), co andnfirm. also When send the a SYNserver packet receives (syn the= synk), packet, that is, theit must SYN confirm+ACK packet.the client's After ACK this (ack process, = j + the1), and server also enters send thea SYN SYN_RECV packet (syn state. = k), If thethat port is, the is closed,SYN+ACK the RSTpacket. packet After will this be process, returned. the server enters the SYN_RECV state. If the port is closed, (2). Thethe RST client packet receives will the be SYNreturned.+ACK packet from the server and sends an acknowledgment packet (2). ACKThe client (ack receives= k + 1) the to SYN+ACK the server. packet After from the th packete server is and sent, sends the clientan acknowledgment and server enter packet the ESTABLISHEDACK (ack = k state+ 1) to completethe server. the After connection the packet establishment. is sent, the client and server enter the ESTABLISHED state to complete the connection establishment.

Sensors 2020, 20, 4423 5 of 12 Sensors 2020, 20, x FOR PEER REVIEW 5 of 12

We use the dial method in in the the standard standard library library of of the the net net package package to to connect. connect. The The connection connection is isstarted started by by the the system system call call connection. connection. For For each each listening listening port, port, the thecorrect correct connection connection is returned is returned if the if theport port is open, is open, otherwise otherwise a connection a connection error error is returned is returned,, indicating indicating that that the theport port is not is not accessible. accessible. In Inorder order to tofurther further improve improve the the scanning scanning rate, rate, this this ar articleticle uses uses the the high high concurrency concurrency feature feature of of GO to program. When using the Dial function to establishestablish a network connection, the DialTimeout function provided byby the the net net package package will activelywill actively pass additionalpass additional timeout timeout parameters para tometers establish to aestablish connection. a Inconnection. HIRFL Scanner, In HIRFL we Scanner, set the timeout we set ofthe TCP timeout connection of TCP to connection 100 ms. to 100 ms. According to to different different classification classification standards, standards, the port the scanning port scanning technology technology can have can different have diclassifications,fferent classifications, such as classification such as classification according accordingto protocol to type protocol and classification type and classification by port allocation by port allocation[30–32]. This[30 –paper32]. This classifies paper the classifies port scanni the portng scanningtechnology technology according according to the scanning to the scanning method: method: (1).(1). HorizontalHorizontal scanning: scanning: For For a a specific specific port, port, scan scan different different target target host hosts,s, as shown in Figure 33 below.below.

Figure 3. The diagram of the horizontal scanning schematic. (2). Vertical scanning: Scan different ports for a specific host as shown in Figure4 below. (2). Vertical scanning: Scan different ports for a specific host as shown in Figure 4 below.

Figure 4. The diagram of the vertical scanning schematic. Figure 4. The diagram of the vertical scanning schematic.

(3).(3). Block scanning: Block Block scanning scanning is is a combination of horizontal and verticalvertical scanning.scanning. It scans multiple times for different different ports of didifferentfferent hosts,hosts, asas shownshown inin FigureFigure5 5 below. below.

Figure 5. The diagram block scanning schematic.

Sensors 2020, 20, 4423 6 of 12 Sensors 2020, 20, x FOR PEER REVIEW 6 of 12

3. Structure Structure of of the the HIRFL HIRFL Scanner Scanner HIRFL ScannerScanner isis implementedimplemented in in CS CS architecture, architecture, which which is conduciveis conducive to guaranteeto guarantee the safetythe safety and andresponse response speed speed of the of system.the system. The Th maine main interface interface of theof the system system is shownis shown in in Figure Figure6 below, 6 below, which which is isdeveloped developed using using GTK GTK+3.6.+3.6. It It can can be be divided dividedinto into threethree sub-modules:sub-modules: the parameter input module, function selection module, and result output module. The parameter input module mainly enables users to input various parameters used in port scanning according to their needs. For For example, example, re regardinggarding the number of coroutines, each goroutine occupies 2 2 KB KB of of memory memory by by default. default. On On 32-bit 32-bit processors, processors, the the maximum maximum numb numberer of ofGo Go programs programs is aboutis about 80,000, 80,000, but but on on 64-bit 64-bit processors, processors, the the Go Go pr programogram has has no no limit limit on on the the number of coroutines created. In In this this way, way, the user can reasonably ente enterr the number of coroutines based on the number of scan tasks.

Figure 6. TheThe HIRFL HIRFL Scanner software’s main interface.

The second parameterparameter isis the the number number of of times times the the program program repeats repeats the the ping ping process process when when the firstthe firstping ping scan scan fails. fails. The defaultThe default value value of the of program the progra ism 2 times. is 2 times. This This value value will alsowill aalsoffect affect the scan the time.scan time.Num Num of Port of is Port the portis the number port number to be scanned. to be scanned. The program The program will automatically will automatically calculate calculate the required the requirednumber of number TCP connections. of TCP connections. In order toIn increaseorder to theincr speedease the of large-scalespeed of large-scale IP address IP and address port scanning,and port scanning,the timeout the period timeout of TCPperiod connections of TCP connections is 100 ms is by 100 default ms by in default this system. in this The system. function The selectionfunction selectionmodule is module the core isof the this core system, of this and system, it mainly and includesit mainly IPincludes address IP online address scanning, online scanning, port scanning, port scanning,and mixed and scanning mixed (ipscanning+ port (ip scanning). + port scanning). The user The can completeuser can complete the task ofthe scanning task of scanning by selecting by selectingdifferent functions.different functions. When the When system the is scanning,system is thescanning, scanned the results scanned will result be displayeds will be in displayed real-time in real-timethe result in output the result module. output After module. completing After completing the scanning the task, scanning the system task, willthe system inform will the userinform of the userfinal of result the final of the result scan of in the formscan in of the a dialog form box.of a dialog box. The main programprogram is is developed developed with with go1.13.4, go1.13.4, and and the the core core packages packages are net,are net, sync, sync, icmp, icmp, and ipv4.and ipv4.Package Package net provides net provides a portable a portable interface interface for network for network I/O, including I/O, including TCP/ IP,TCP/IP, UDP, domainUDP, domain name nameresolution, resolution, and Unix and domain Unix domain sockets. sockets. We use theWe DialTimeoutuse the DialTimeout method in method the net packagein the net to package receive the to receiveprotocol, the IP protocol, address, port IP address, number, port and thenumber, timeout an period.d the timeout Package period. sync provides Package basic sync synchronization provides basic synchronizationprimitives such asprimitives mutual exclusion such as mutual locks. Mutexexclusion is used locks. to Mutex solve the is used problem to solve of data the competition, problem of whiledata competition, WaitGroup while solves WaitGroup the problem solves of coroutine the problem synchronization. of coroutine synchronization. Package icmp provides Package basicicmp providesfunctions basic for the functions manipulation for the of manipulation messages used of in messages the Internet used Control in the Message Internet Protocols, Control Message ICMPv4 Protocols, ICMPv4 and ICMPv6. The ipv4 package is used to implement the IP level socket option

Sensors 2020, 20, 4423 7 of 12

and ICMPv6.Sensors 2020 The, 20, xipv4 FOR PEER package REVIEW is used to implement the IP level socket option for the Internet7 of Protocol 12 version 4. Other packages used in the development of HIRFL Scanner include bufio, os, errors, fmt, time,for etc. the Internet Protocol version 4. Other packages used in the development of HIRFL Scanner Figureinclude7 bufio,shows os, the errors, workflow fmt, time, of the etc. system. Due to the separate design of the front and back end, the systemFigure first loads7 shows the the GTK workflow GUI graphic of the system. display Due file. to Inthe the separate process design of parameter of the front and and IP back address end, the system first loads the GTK GUI graphic display file. In the process of parameter and IP verification, a return represents that the user needs to check the input parameters or IP address. The IP address verification, a return represents that the user needs to check the input parameters or IP address of this program is read from the TXT file. address. The IP address of this program is read from the TXT file.

FigureFigure 7. 7.The The flowflow chartchart of HIRFL HIRFL Scanner. Scanner.

4. Experiment4. Experiment and and Result Result

4.1. Data4.1. Data Description Description and and Preprocess Preprocess In orderIn order to verify to verify the the scanning scanning rate rate and and correctness correctness ofof thethe HIRFL Scanner Scanner system, system, we we conducted conducted a seriesa ofseries experiments of experiments on the on Lanzhouthe Lanzhou heavy heavy ion ion accelerator accelerator control control network network and and compared compared it with it with the industry-renowned scanning software Nmap and Masscan scanners. As shown in Table 1, the IP the industry-renowned scanning software Nmap and Masscan scanners. As shown in Table1, the IP addresses to be scanned come from the HIRFL control network. There is a total of 13,915 IP addresses addressesin 55 toVLANs, be scanned excluding come network fromthe addresses, HIRFL broadc controlast network. addresses, There and isgateways. a total of The 13,915 IP address IP addresses is in 55exported VLANs, from excluding the MYSQL network database addresses, to a TXT broadcastfile for the scanner addresses, to load. and The gateways. operating The system IP addressof the is exportedHIRFL from Scanner the MYSQL and Nmap database is windows to a TXT 7 64-bit, file forand the the scanner CPU is Intel to load. Core The I7-6567U operating 3.3 GHz, system with of the HIRFL16GB Scanner memory. and Masscan Nmap is uses windows the same 7 64-bit,hardware and enviro the CPUnment, is Inteland the Core operating I7-6567U system 3.3 GHz, is Centos with 7. 16GB memory.In order Masscan to improve uses thethe sameaccuracy hardware of the test environment, results, all experimental and the operating results are system the average is Centos values 7. In of order to improvethe three the tests, accuracy which of were the conducted test results, under all experimental different network results load are periods. the average We use values Nmap ofwith the a three tests,graphical which were interface conducted Zenmap under 7.80, and diff theerent version network of Masscan load periods.is 1.0.6. We use Nmap with a graphical interface Zenmap 7.80, and the version of Masscan is 1.0.6.

Sensors 2020, 20, 4423 8 of 12

Table 1. All IP addresses used in port scanning.

VLAN IP Address VLAN_DESC VLAN IP Address VLAN_DESC 10 10.10.2.0/24 Central room 107 10.10.107.0/24 CSRm 14 10.10.14.0/24 Single particle 108 10.10.108.0/24 CSRm 15 10.10.15.0/24 Single particle 109 10.10.109.0/24 CSRm 16 10.10.16.0/24 SSC 110 10.10.110.0/24 CSRm 17 10.10.17.0/24 SSC 111 10.10.111.0/24 CSRm 41 10.10.41.0/24 T128 112 10.10.112.0/24 CSRm 42 10.10.42.0/24 T128 113 10.10.113.0/24 Langdao North 43 10.10.43.0/24 T128 114 10.10.114.0/24 Langdao North 45 10.10.45.0/24 SFC 115 10.10.115.0/24 Langdao North 46 10.10.46.0/24 SFC 116 10.10.116.0/24 Langdao South 48 10.10.48.0/24 SFC 117 10.10.117.0/24 Langdao South 51 10.10.51.0/24 HIRFL 118 10.10.118.0/24 Langdao South 52 10.10.52.0/24 HIRFL 119 10.10.119.0/24 Langdao South 53 10.10.53.0/24 HIRFL 120 10.10.120.0/24 RIBLL2 54 10.10.54.0/24 HIRFL 121 10.10.121.0/24 RIBLL2 90 10.10.90.0/24 HIRFL 122 10.10.122.0/24 RIBLL2 91 10.10.91.0/24 HIRFL 123 10.10.123.0/24 RIBLL2 92 10.10.92.0/24 HIRFL 124 10.10.124.0/24 RIBLL2 95 10.10.95.0/24 HIRFL 125 10.10.125.0/24 RIBLL2 96 10.10.96.0/24 HIRFL 126 10.10.126.0/24 CSRe 99 10.10.99.0/24 RIBLL1 127 10.10.127.0/24 CSRe 100 10.10.100.0/24 RIBLL1 128 10.10.128.0/24 CSRe 101 10.10.101.0/24 RIBLL1 129 10.10.129.0/24 CSRe 102 10.10.102.0/24 CSRm 130 10.10.130.0/24 CSRe 103 10.10.103.0/24 CSRm 131 10.10.131.0/24 CSRe 104 10.10.104.0/24 CSRm 998 172.16.110.0/24 virtualization 105 10.10.105.0/24 CSRm 999 172.16.100.0/24 virtualization 106 10.10.106.0/24 CSRm

4.2. The Comparisons of IP Address Online and Port State Independent Detection We first divide all the IP addresses on the accelerator into 6 groups by number for the ping scan test. In this experiment, Nmap was scanned in three time modes: T3 (normal), T4 (aggressive), and T5 (insane); the sP parameter is used to perform a ping scan without further testing, such as for port scans or scans. Then we conducted port scanning 6 times for a server with IP address: 10.10.100.125. The number of scanning coroutines of HIRFL Scanner is 3000. The number of ping repetitions is 3, and the timeout period is 100 ms. The IP scanning results are shown in Table2. HS Time represents the scanning time used by the HIRFL Scanner. As the number of scanning ports increases, the scanning time of the HIRFL scanner and Nmap is increasing. For Nmap, we can see that T5 mode can take the least amount of time to complete the scan. The HIRFL Scanner can complete the scanning of all IP addresses in about 44 s, and the scanning speed is significantly faster than Nmap, which finished the scanning process after 378.5 s. When the number of IPs is 2530, the HIRFL scanner is 93.19% faster than Nmap using T5 mode. Judging from the scanning results, the scanning results of the HIRFL scanner and Nmap are basically the same, and the maximum deviation is 4. This maximum deviation refers to the number of inconsistencies between the HIRFL scanner and Nmap scanning results, mainly the number of false positives. In the vertical scanning experiment, Nmap was scanned in sS (TCP SYN) and sT (TCP connect) modes, respectively, and the time template was T5. For HIRFL Scanner, the coroutine is set to 3000 and the timeout is 100 ms. Table3 shows the vertical scanning results of the HIRFL Scanner and Nmap Port Scanner. The data shows that the accuracy of the two port scanners is basically the same, and the deviation may be caused by packet loss. For Nmap, sS mode is significantly faster than sT. The scanning speed of the HIRFL Scanner is also better than that of Nmap. In the small-scale port Sensors 2020, 20, 4423 9 of 12 scanning, the maximum speedup ratio is 98.33%, while in the full port scanning, the speed is increased by 44.68%.

Table 2. Scanning results of HIRFL Scanner and Nmap IP Scanner.

Nmap Time (s) Nmap Result Num of IP HS Time (s) HS Result T3 T4 T5 T3 T4 T5 253 4.12 21.83 21.53 20.77 47 47 47 47 2530 5.98 102.85 96.77 87.75 201 201 201 201 5060 11.45 184.58 175.52 159.10 442 441 441 441 7590 16.57 267.90 247.58 228.15 657 656 656 656 10,120 30.42 329.27 311.39 286.92 843 839 840 841 13,915 44.08 436.21 415.03 378.50 1143 1140 1139 1140

Table 3. The vertical scanning results of HIRFL scanner and Nmap.

Nmap T5 Time (s) Nmap T5 result Num of Ports HS Time (s) HS Result sS sT sS sT 1000 0.31 18.77 65.10 9 9 9 10,000 4.12 21.36 251.92 12 12 12 20,000 7.32 23.68 456.72 11 12 12 30,000 10.84 27.18 662.92 13 13 13 40,000 14.99 29.00 868.51 15 15 15 65,535 20.65 37.34 1391.98 16 17 17

4.3. The Comparisons of Scanning Results of IP Devices with Different Port Numbers in Accelerator Control Network We scanned each port of all devices in the accelerator experiment. A total of 912 million ports of 13,915 devices were scanned. For Nmap, we choose T5 and sS parameters to accelerate scanning. The coroutines and timeout of the HIRFL scanner are set to 3000 and 50 ms, respectively. Table4 summarizes the statistical results of the top ten services running on each port in this experiment. When using Nmap to scan, it took a week to complete all port scans, while the HIRFL Scanner shortened the time to 38.65 h. It can be seen from Table4 that there are many services of HIRFL system equipment running on non-standard ports, and Nmap only scans ports from 1 to 1024 by default, and those services running on non-standard ports cannot be accurately identified. Similarly, it can be observed that the port scanning statistics of HIRFL Scanner and Nmap have deviations. The maximum deviation is 7, which may be caused by the scanning time period. The error between them is mainly based on false positives.

Table 4. The block scanning results of HIRFL Scanner and Nmap.

Port Service HS Count Nmap Count Variance 80 http 262 256 6 3389 ms-wbt-server 155 148 7 445 microsoft-ds 143 147 4 22 ssh 139 143 4 111 rpcbind 107 110 3 49,152 unknow 106 110 4 23 telnet 102 107 5 49,153 unknow 99 101 2 5064 channel access 92 97 5 59,110 ni-psp 80 83 3 Sensors 2020, 20, 4423 10 of 12

Sensors 2020, 20, x FOR PEER REVIEW 10 of 12 4.4. The Comparison of Hit Rate when Using the Shodan Dataset 4.4. The Comparison of Hit Rate when Using the Shodan Dataset In this experiment, we use the scanning results of Shodan [33,34] as the standard to scan devices in theIn Shodan this experiment, database we that use provide the scanning FTP, SSH, results Telnet, of Shodan SMTP, [33,34] HTTP, as and the POP3standard services to scan in devices China. Accordingin the Shodan to the database data in thethat Shodan provide database FTP, SSH, on 5 Telnet, June 2020, SMTP, there HTTP, are 1,037,806 and POP3 devices services providing in China. FTP servicesAccording in China.to the data We chose in the 10,000 Shodan of themdatabase to perform on 5 June the 2020, scanning there experiment, are 1,037,806 so thedevices denominator providing is 10,000,FTP services and other in protocolsChina. We also chose use this10,000 configuration. of them to The perform hit rate the is used scanning to evaluate experiment, the accuracy so the of thedenominator scanner, its is definition10,000, and is other as follows: protocols also use this configuration. The hit rate is used to evaluate the accuracy of the scanner, its definition is as follows: total number detected by the scanner Hit rate= ℎ (1) = 10000 (1) (1) 10000 Because the scanning process is performed via the Internet, there may be situations such as Because the scanning process is performed via the Internet, there may be situations such as that affect the scanning results, so we continue to adopt the method of taking the network congestion that affect the scanning results, so we continue to adopt the method of taking the average of three tests. The hit rate of each scanner is shown in Figure8. For Nmap, we continue to average of three tests. The hit rate of each scanner is shown in Figure 8. For Nmap, we continue to select T5 and sS parameters to speed up the scanning. The coroutines and timeout of HIRFL Scanner are select T5 and sS parameters to speed up the scanning. The coroutines and timeout of HIRFL Scanner set to 3000 and 100 ms, respectively. Masscan’s packet sending speed is set to 1000 packets per second, are set to 3000 and 100 ms, respectively. Masscan's packet sending speed is set to 1000 packets per and it has the best scanning speed performance, but the scanning accuracy is quite low. The scanning second, and it has the best scanning speed performance, but the scanning accuracy is quite low. The accuracy of HIRFL Scanner is basically consistent with Nmap, and the maximum error of the hit rate is scanning accuracy of HIRFL Scanner is basically consistent with Nmap, and the maximum error of 0.07. The inconsistency may be caused by network packet losses. the hit rate is 0.07. The inconsistency may be caused by network packet losses.

1.2

1

0.8

0.6 Hit rate 0.4

0.2

0 FTP SSH Telnet SMTP HTTP POP3

HS Nmap Masscan Service

Figure 8. The comparison of hit rate.

5. Conclusions Port scanning scanning is is very very useful useful for fordefensive defensive penetration penetration testing testing of HIRFL of HIRFL devices. devices. Scanning Scanning HIRFL HIRFLdevices devicescan determine can determine which whichservices services are exposed are exposed to the to network, the network, therefore therefore we can we cancheck check the theconfiguration configuration of each of each device device in a in targeted a targeted way. way. In addition, In addition, we wecan can take take preventive preventive measures measures to toreduce reduce the the losses losses caused caused by bymalicious malicious attacks. attacks. Base Basedd on the on thehigh high concurrency concurrency characteristics characteristics of the of theGolang Golang language, language, this paper this paper develops develops a large-scale a large-scale IP address IP address and port and scanning port scanning tool: HIRFL tool: Scanner. HIRFL Scanner.The scanner The adopts scanner CS adopts architecture CS architecture and employs and employs GTK to GTKdevelop to develop the front-end the front-end GUI interface, GUI interface, so as soto achieve as to achieve the purpose the purpose of separati of separatingng the front the end front and end back and end. back The end. most The important most important feature feature of this oftool this is toolthe cross-platform is the cross-platform and user-friendly and user-friendly operatio operationn interface. interface. It allows It allowsthe user the to user specify to specify an IP anrange IP rangeor port or number port number (comma (comma separated separated list), and list), th ande number the number of goroutines of goroutines the user the wants user wantsto create to createat runtime. at runtime. We used We the used HIRFL the HIRFL control control network network and Shodan and Shodan data sets data to sets verify to verify the accuracy the accuracy and andscanning scanning rate rateof the of theHIRFL HIRFL Scanner Scanner system. system. Co Comparativemparative experiments experiments show show that that the the system’s scanning rate is significantly superior to the Nmap scanner, and the accuracy is basically the same as Nmap, which meets our application needs.

Sensors 2020, 20, 4423 11 of 12 scanning rate is significantly superior to the Nmap scanner, and the accuracy is basically the same as Nmap, which meets our application needs.

Author Contributions: Conceptualization and writing—original draft preparation, C.Y. and J.D.; interface design, M.Y.; algorithm implementation and experimental analysis, C.Y. and T.M.; investigation, project administration and funding acquisition, J.D. and M.Y. All authors have read and agreed to the published version of the manuscript. Funding: This research is supported by the Natural Science Foundation of P. R. of China (61762060), Natural Science Foundation of Gansu Province (0914ZTB18). Conflicts of Interest: The authors declare no conflict of interest.

References

1. Galeano-Brajones, J.; Carmona-Murillo, J.; Valenzuela-Valdes, J.F.; Luna-Valero, F. Detection and mitigation of dos and ddos attacks in iot-based stateful sdn: An experimental approach. Sensors 2020, 20, 816. [CrossRef] [PubMed] 2. Iyamuremye, B.; Shima, H. Network Security Testing Tools for SMES (Small and Medium Enterprises); IEEE: New York, NY, USA, 2018; pp. 414–417. 3. Lai, Y.; Gao, H.; Liu, J. Vulnerability mining method for the modbus tcp using an anti-sample fuzzer. Sensors 2020, 20, 2040. [CrossRef][PubMed] 4. Birkinshaw, C.; Rouka, E.; Vassilakis, V.G. Implementing an intrusion detection and prevention system using software-defined networking: Defending against port-scanning and denial-of-service attacks. J. Netw. Comput. Appl. 2019, 136, 71–85. [CrossRef] 5. Li, Z.; Yu, X.; Wang, D.; Liu, Y.; He, S. Supereye: A distributed port scanning system. In Proceedings of the International Conference on Artificial Intelligence and Security, New York, NY, USA, 26–28 July 2019; pp. 45–56. 6. Sanz, I.J.; Lopez, M.A.; Mattos, D.M.F.; Duarte, O.C.M.B. A cooperation-aware virtual network function for proactive detection of distributed port scanning. In Proceedings of the Cyber Security in Networking Conference, Rio de Janeiro, Brazil, 18–20 October 2017. 7. Dhivvya, J.P.; Muralidharan, D.; Raj, N.; Kumar, B.K. Network simulation and vulnerability assessment tool for an enterprise network. In Proceedings of the International Conference on Computing Communication and Networking Technologies, Kanpur, India, 6–8 July 2019. 8. Bowie, W.S.; Linders, J.G. A software trace facility for os/mvt. Softw. Pract. Exp. 1979, 9, 535–545. [CrossRef] 9. Borgerson, B.; Godfrey, M.; Hagerty, P.; Rykken, T. The architecture of sperry univac 1100 series systems. Proc. Annu. Symp. Comput. Archit. 1979, 137–146. 10. Arce, I.J.N.S. Vulnerability management at the crossroads. Netw. Secur. 2008, 5, 11–13. 11. Millwater, H.R.; Wu, Y.T.; Dias, J.B.; Mcclung, R.C.; Raveendra, S.T.; Thacker, B.H. In The nessus software system for probabilistic structural analysis. Struct. Saf. Reliab. ASCE 1989, 3, 2283–2290. 12. Shah, M.; Ahmed, S.; Saeed, K.; Junaid, M.; Khan, H.; Ata-ur-Rehman. Penetration testing active reconnaissance phase—optimized port scanning with nmap tool. In Proceedings of the 2nd International Conference on Computing, Mathematics and Engineering Technologies, Sukkur, Pakistan, 30–31 January 2019. 13. De Vivo, M.; Carrasco, E.; Isern, G.; de Vivo, G.O.J.A.S.C.C.R. A review of port scanning techniques. Comput. Commun. Rev. 1999, 29, 41–48. [CrossRef] 14. Rohrmann, R.R.; Ercolani, V.J.; Patton, M.W. In Large scale port scanning through tor using parallel nmap scans to scan large portions of the ipv4 range. In Proceedings of the IEEE International Conference on Intelligence & Security Informatics, Beijing, China, 22–24 July 2017. 15. Durumeric, Z.; Wustrow, E.; Halderman, J.A. Zmap: Fast internet-wide scanning and its security applications. In Proceedings of the 22nd USENIX Security Symposium, Washington, DC, USA, 14–16 August 2013. 16. Mazel, J.; Strullu, R. Identifying and characterizing scans: A cryptanalytic approach. arXiv 2019, arXiv:1908.04193. 17. Lee, S.; Im, S.Y.; Shin, S.H.; Roh, B.H.; Lee, C. Implementation and vulnerability test of stealth port scanning attacks using zmap of censys engine. In Proceedings of the International Conference on Information and Communication Technology Convergence (ICTC), Jeju, Korea, 19–21 October 2016. Sensors 2020, 20, 4423 12 of 12

18. De Santis, G.; Lahmadi, A.; Francois, J.; Festor, O. Modeling of ip scanning activities with hidden markov models: Darknet case study. In Proceedings of the 8th IFIP International Conference on New Technologies, Mobility and Security (NTMS), Larnaca, Cyprus, 21–23 November 2016. 19. Dilley, N.; Lange, J. An empirical study of messaging passing concurrency in go projects. In Proceedings of the IEEE 26th International Conference on Software Analysis, Evolution and Reengineering (SANER), Hangzhou, China, 24–27 February 2019. 20. Fava, D.S.; Steffen, M. Ready, Set, Go! Data-Race Detection and the Go Language; Elsevier: Amsterdam, The Netherlands, 2019; Volume 195. 21. Emoto, M.; Shibata, K.; Watanabe, K.; Ohdachi, S.; Ida, K.; Sudo, S. Development of a flexible visualization tool. Fusion Eng. Des. 2002, 60, 367–371. [CrossRef] 22. Garea, A.S.; Heras, D.B.; Argüello, F. An open source desktop application for classification of remote sensing data. In Proceedings of the IEEE International Conference on Intelligent Data Acquisition & Advanced Computing Systems: Technology & Applications, Warsaw, Poland, 24–26 September 2015. 23. Patel, S.K.; Sonker, A. In Internet protocol identification number based ideal stealth port scan detection using snort. In Proceedings of the International Conference on Computational Intelligence and Communication Networks, Tehri, India, 23–25 December 2016; pp. 422–427. 24. Myers, D.; Foo, E.; Radke, K. Internet-wide scanning taxonomy and framework, artificial intelligence and symbolic computation. In Proceedings of the 13th Australasian Information Security Conference, Sydney, Australia, 27–30 January 2015; pp. 61–65. 25. Bhuyan, M.H.; Bhattacharyya, D.K.; Kalita, J.K. Surveying port scans and their detection methodologies. Comput. J. 2011, 54, 1565–1581. [CrossRef] 26. Bou-Harb, E.; Debbabi, M.; Assi, C. On fingerprinting probing activities. Comput. Secur. 2014, 43, 35–48. [CrossRef] 27. Durumeric, Z.; Bailey, M.; Halderman, J.A.; Assoc, U. An internet-wide view of internet-wide scanning. In Proceedings of the 23rd USENIX Security Symposium, San Diego, CA, USA, 20–22 August 2014; pp. 65–78. 28. Sivanathan, A.; Gharakheili, H.H.; Sivaraman, V. Can we classify an iot device using tcp port scan? In Proceedings of the IEEE International Conference on Information and Automation for Sustainability (ICIAFS), Sri Lanka, India, 21–22 December 2018. 29. Kumar, S.; Dutta, K.; Asati, A. Two Pass Port Scan Detection Technique Based on Connection Pattern and Status on Sampled Data. J. Comput. Commun. 2015, 3, 1–8. [CrossRef] 30. Jajal, B.; Iyer, S.; Chauhan, D. Mitigating illicit entry using website port scanning tools in indian context. In Proceedings of the International Conference on Computing for Sustainable Global Development, New Delhi, India, 13–15 March 2019. 31. Gupta, S. Port. Scanning: Uncover key Concepts of Website and Network Security; Springer: Berlin, Germany, 2019. 32. Ensafi, R.; Park, J.C.; Kapur, D.; Crandall, J.R. Idle port scanning and non-interference analysis of network protocol stacks using model checking. In Proceedings of the Usenix Security Symposium, Washington, DC, USA, 11–13 August 2010. 33. Bodenheim, R.; Butts, J.; Dunlap, S.; Mullins, B. Evaluation of the ability of the shodan search engine to identify internet-facing industrial control devices. Int. J. Crit. Infrastruct. Prot. 2014, 7, 114–123. [CrossRef] 34. Fernandez-Carames, T.M.; Fraga-Lamas, P. Teaching and learning iot cybersecurity andvulnerability assessment with shodan through practical use cases. Sensors 2020, 20, 3048. [CrossRef][PubMed]

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).