A Solution for Automatically Malicious Web Shell and Web Application Vulnerability Detection
Van-Giap Le, Huu-Tung Nguyen, Dang-Nhac Lu, and Ngoc-Hoa Nguyen
VNU University of Engineering and Technology, Hanoi, Vietnam
{giaplv 57,tungnh 57,nhacld.di11,hoa.nguyen}@vnu.edu.vn

Abstract. According to Internet Live Stats, it is evident that organizations and developers are underestimating the security issues of their systems. In this paper, we propose a protective and extensible solution for automatically detecting both Web application vulnerabilities and malicious Web shells. Building on the original THAPS, we propose E-THAPS, which adds a new detection mechanism and improves the detection of SQLi, XSS and vulnerable functions. For malicious Web shell detection, taint analysis and pattern matching are selected as the main approaches. The broad experiments that we performed show outstanding results in comparison with other solutions for detecting Web application vulnerabilities and malicious Web shells.

Keywords: Web application vulnerability · Malicious Web shell · Taint analysis · Pattern matching · SQLi detection · XSS detection

1 Introduction

In April 2016, according to Internet Live Stats, an enormous number of Websites were being attacked every day, with a direct and significant impact on nearly 3.36 billion Internet users [7]. Even security specialists sometimes have trouble with unfamiliar systems because of the complexity of the testing process and the vast number of test cases. As a result, the number of hacked Websites per day more than doubled within a year: from 25,000 per day in April 2015 to 54,700 per day in April 2016 [7]. These issues in Web application security create a need for a solution that allows Web developers and security researchers to detect security-related problems as easily as possible.
In this paper, we propose an extensible solution for automatically detecting Web application vulnerabilities and malicious Web Shells, called GuruWS, which uses white-box testing techniques. We focus on Web applications written in PHP because PHP's share among server-side programming languages has remained very high for many years: about 82.3 % of all websites according to W3Techs [8].

© Springer International Publishing Switzerland 2016. N.T. Nguyen et al. (Eds.): ICCCI 2016, Part I, LNAI 9875, pp. 367–378, 2016. DOI: 10.1007/978-3-319-45243-2_34

The rest of this paper is organized as follows. In Sect. 2 we review some basic principles and related work. Section 3 details our extensible solution for automatically detecting Web vulnerabilities and malicious Web Shells. In Sect. 4 we summarize the experiments that verify and benchmark our approach. The last section is dedicated to conclusions and future work.

2 Background and Related Work

2.1 Vulnerability Scanning in PHP Web Applications

There are two main approaches to finding PHP application vulnerabilities by testing: black-box testing and white-box testing. The former is generally preferred for finding flaws in Web applications. It operates by launching attacks against an application using fuzzing, and in practice it is both time and resource consuming because of the limitations of fuzzing. White-box testing is less commonly used for finding security flaws in Web applications. The main reasons are the limited detection capability of white-box analysis tools, the heterogeneous programming environments, and the complexity of applications [1]. However, after many efforts to overcome these limitations, many white-box vulnerability scanners have been released and are now widely used, such as:

– RIPS [9] is a PHP vulnerability scanner that uses static analysis.
In our practical experiments, RIPS scans very fast, yet its false positive rate is still quite high, and it lacks support for object-oriented code [3], which is an advantage of GuruWS.
– THAPS¹ is a very efficient scanner that applies symbolic execution as its static analysis approach and performs taint analysis as a post-process to detect flaws. Symbolic execution analyzes which inputs cause each part of a program to execute. To identify vulnerabilities, the taint analysis identifies user-controllable variables and how they propagate through the application. Whenever a user-controllable variable reaches a potentially dangerous function (a vulnerable sink) without first being properly sanitized, a vulnerability is reported [2].

¹ https://bitbucket.org/heinep/thaps/

2.2 Malicious Web Shell Detection

A Web Shell is a script that can be run on a Web server to enable remote administration of the infected server. Malicious Web Shells can be detected with different approaches, such as (i) pattern matching, (ii) combining lexical analysis and taint analysis, and (iii) statistical methods. Here are some typical tools for Web Shell detection:

– Web Shell Detector [10] is a Python tool that helps to detect Web Shells. It is a fairly good solution because it is easy to use, develop and customize. However, the Web Shell pattern set in the Web Shell Detector database is old and very limited. Moreover, it cannot detect tiny Web Shells or self-written Web Shells, because it lacks a taint analysis mechanism.
– NeoPI² is a Python script that uses statistical techniques to detect obfuscated and encrypted content within source code. Its approach is based on recursively scanning and ranking all files in the base directory [4].
This solution requires expertise in Web security to validate whether a flagged file is actually a Web Shell.

² https://github.com/Neohapsis/NeoPI

3 Solution

The GuruWS system architecture is illustrated in Fig. 1.

Fig. 1. GuruWS system architecture

– Core consists of the grVulnScanner and grMalwrScanner modules. The modules run simultaneously, each as an independent and extensible service. In short:
  • grVulnScanner is a white-box scanner for Web application vulnerabilities. Its foundation is E-THAPS (Enhanced THAPS), an improved version of THAPS.
  • grMalwrScanner aims to detect malicious Web application files based on pattern matching and taint analysis.
– UserProject stores the extracted user projects, which are the inputs for the grVulnScanner and grMalwrScanner modules.
– View gives users access to GuruWS's features, letting them upload their compressed Web source code and retrieve results conveniently.
– Database stores scan requests, scan process status and scan results.
– Allocator provides an efficient and flexible interaction between Core and Database. It fetches scan requests from Database and then calls Core's components to handle them.

3.1 grMalwrScanner

The primary objective of grMalwrScanner is to help developers, Webmasters and security specialists detect malicious files in their projects. From their perspective, they proactively submit source code to get the answer to two questions: (i) Does the application contain malicious files? and (ii) If malicious files exist, where are they located in the application? To meet these needs, our first step is to detect simple Web Shells. Because such shells vary widely, we decided to use taint analysis to detect them.
The second step targets the many general-purpose Web Shells that protect themselves by encoding, which challenges the method based on taint analysis. Because only a limited number of popular Web Shells belong to this type, we propose another method for them, which relies on patterns derived from their signatures. A key idea of our work is to apply every available approach to the type of Web Shell it suits.

Taint Analysis: This method proceeds as follows. First, the code is split into tokens (the lexical analysis process) to make it easier to manipulate and to analyze afterwards. Then, grMalwrScanner passes through the token list of each file only once (to improve speed) and identifies important tokens by name. In this way the potentially dangerous functions (PDFs) are determined, and all significant arguments of these functions are traced back to their 'source', which includes:

– Other inputs: get_headers(), get_browser() and so on.
– User inputs: $_GET, $_POST, $_COOKIE and $_FILES, as well as other $_SERVER and $_ENV variables.
– Server parameters: HTTP_ACCEPT, HTTP_KEEP_ALIVE and so on.
– File input: fgets(), glob(), readdir() and so on.
– Database input: mysql_fetch_array(), mysql_fetch_object() and so on.

The grMalwrScanner taint analysis process follows some basic principles:

– A source is always marked as tainted.
– A string created from tainted variables is also marked as tainted.
– If a function that is neither a securing function nor a PDF has any tainted input argument, its return value is marked as tainted.
– Every function in the PDF list has a set of corresponding securing functions. Hence, when the significant arguments of a PDF are traced back, any argument that has passed through one of these securing functions is treated as untainted, even if it originated from a tainted variable.
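The four principles above can be illustrated with a minimal Python sketch. It is not the grMalwrScanner implementation: it works on a hand-written, hypothetical statement representation instead of real PHP tokens, it propagates taint forward in a single pass rather than tracing arguments backwards, and the PDF list with its securing functions is an assumed example, since the actual lists are not given here.

```python
# Assumed source list (superglobals only); array indexes are ignored.
SOURCES = {"$_GET", "$_POST", "$_COOKIE", "$_FILES", "$_SERVER", "$_ENV"}

# Hypothetical PDF list: each potentially dangerous function maps to the
# securing functions assumed to neutralise tainted data for that function.
PDFS = {
    "eval": frozenset(),
    "system": frozenset({"escapeshellarg", "escapeshellcmd"}),
    "mysql_query": frozenset({"mysql_real_escape_string"}),
}
SECURING = frozenset().union(*PDFS.values())


def merge(taints):
    """Combine sub-expression taints. None means untainted; otherwise the
    value is the set of securing functions applied to all tainted parts."""
    tainted = [t for t in taints if t is not None]
    if not tainted:
        return None
    result = tainted[0]
    for t in tainted[1:]:
        result = result & t
    return result


def taint_of(expr, env):
    """Return the taint of an expression given the variable taints in env."""
    kind = expr[0]
    if kind == "str":                      # string literal: never tainted
        return None
    if kind == "var":                      # principle 1: sources are tainted
        name = expr[1]
        return frozenset() if name in SOURCES else env.get(name)
    if kind == "concat":                   # principle 2: strings inherit taint
        return merge(taint_of(part, env) for part in expr[1])
    if kind == "call":                     # principle 3: tainted in, tainted out
        name, args = expr[1], expr[2]
        taint = merge(taint_of(arg, env) for arg in args)
        if taint is not None and name in SECURING:
            taint = taint | {name}         # remember which securing function ran
        return taint
    raise ValueError(f"unknown expression kind: {kind}")


def scan(statements):
    """Single pass over the statements; returns (line, PDF name) findings."""
    env, findings = {}, []
    for line, stmt in enumerate(statements, start=1):
        if stmt[0] == "assign":
            _, target, expr = stmt
            env[target] = taint_of(expr, env)
        elif stmt[0] == "call" and stmt[1] in PDFS:
            _, name, args = stmt
            for arg in args:
                taint = taint_of(arg, env)
                # principle 4: only a securing function of this PDF untaints
                if taint is not None and not (taint & PDFS[name]):
                    findings.append((line, name))
                    break
    return findings


program = [
    ("assign", "$cmd", ("var", "$_GET")),                                   # $cmd = $_GET['c'];
    ("call", "system", [("var", "$cmd")]),                                  # system($cmd);
    ("assign", "$safe", ("call", "escapeshellarg", [("var", "$cmd")])),     # $safe = escapeshellarg($cmd);
    ("call", "system", [("concat", [("str", "ls "), ("var", "$safe")])]),   # system("ls " . $safe);
    ("call", "eval", [("call", "base64_decode", [("var", "$_POST")])]),     # eval(base64_decode($_POST['x']));
]
print(scan(program))
```

In this toy program only statements 2 and 5 are reported: the fourth statement is accepted because its tainted part went through `escapeshellarg`, which the sketch assumes to be a securing function for `system`. PDF calls nested inside other expressions are not checked here; a real scanner would have to cover them.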
At present, the taint analysis approach in grMalwrScanner supports only the detection of PHP Web Shells.

Pattern Matching: We invested in collecting a wide range of Web Shells (of various types and in various programming languages, namely ASP, PHP, Perl, Python and so on) from reliable sources: https://sourceforge.net/p/laudanum/code/25/tree/ and some repositories on https://github.com/: /tennc/webshell, /shiqiaomu/webshell-collector, /tdifg/WebShell, /BlackArch/webshells, /JohnTroony/other-webshells, /lhlsec/webshell, /fuzzdb-project/fuzzdb, /JohnTroony/php-webshells.