Google Dorks: Use Cases and Adaption Study
Total Page:16
File Type:pdf, Size:1020Kb
Google dorks: Use cases and Adaption study UNIVERSITY OF TURKU Department of Future Technologies Master of Science in Technology Thesis Networked Systems Security October 2020 Reza Abasi Supervisors: Dr. Ali Farooq Dr. Antti Hakkala The originality of this thesis has been checked in accordance with the University of Turku quality assurance system using the Turnitin OriginalityCheck service. i UNIVERSITY OF TURKU Department of Future Technologies Reza Abasi: Google dorks: Use cases and adaption study Master of Science in Technology Thesis, 93 pages. Networked Systems Security October 2020 The information age brought about radical changes in our lives. More and more assets are getting connected to the Internet. On the one hand, the connectivity to this ever-growing network of connected devices and assets (the Internet) precipitates more convenience and access to various resources. However, on the downside, the Internet could be the hotbed for malicious actors like hackers, attackers, and cybercriminals’ communities. Continuous Penetration testing and monitoring of the sites, and forums providing illicit digital products and services is a must-do task nowadays. Advanced searching techniques could be employed for discovering such forums and sites. Google dorks that are utilizing Google’s advanced searching techniques could be applied for such purpose. Google dorks could be used for other areas that we will explain during this thesis in more detail like information gathering, vulnerability detection, etc. The purpose of this thesis is to propose advanced searching techniques that will help cybersecurity professionals in information gathering, reconnaissance, vulnerability detection as well as cyber criminal investigative tasks. Further, a usability study has been conducted to examine the acceptance of these techniques among a group of cybersecurity professionals. In this usability study, we will measure the significance of 5 variables in the innovation diffusion model (IDT) namely Complexity, Compatibility, Relative advantage, Trialability, and observability in the adoption of Google dorks for search- related tasks for cybersecurity professionals. Keywords: Google dorks, Cybercriminal forums, Information gathering, Dark web, Defaced sites, Innovation diffusion theory ii Table of Contents List of figures ............................................................................... vi 1 Introduction ............................................................................... 1 2 Problem statement and literature review ............................... 3 2.1 Practical Problems ............................................................................... 3 2.2 Literature Review ................................................................................ 5 3 Search engine hacking ............................................................ 10 3.1 Description of the process ................................................................. 10 3.2 Most commonly used search queries ................................................. 11 3.3 Basic search queries: intitle, intext, inurl, site, ext, intitle:” index of” 11 3.4 Creating more advanced queries........................................................ 14 3.4.1 Use Case: Finding mail subdomain of governmental sites with google dorks ........................................................................................................................ 14 3.5 Duckduckgo and Bing search queries ............................................... 15 3.6 Stop words for search engines ........................................................... 18 4 Information gathering ............................................................ 20 4.1 Introduction to information gathering ............................................... 20 4.2 Sensitive information disclosure, utilizing exploit-db ...................... 22 4.2.1 Files containing interesting information ........................................................ 23 4.2.2 Files containing usernames ............................................................................ 25 4.2.3 Files containing passwords ............................................................................ 27 4.2.4 Enumeration using robots.txt and sitemap.xml files ...................................... 29 4.2.5 Find emails and passwords from file sharing sites, paste sites ...................... 31 4.2.6 Find files containing emails and passwords in sites designed by WordPress 34 4.2.7 Find emails and passwords from random sites .............................................. 35 4.3 Vulnerability detection and enumeration .......................................... 35 4.3.1 Finding sites probably vulnerable to SQL injection, XSS ............................. 36 4.3.2 Web server detection ...................................................................................... 36 4.3.3 Popular CMS, forum software sensitive file and folder enumeration ............ 39 4.3.4 Error Messages, log files ................................................................................ 43 4.3.5 Vulnerable servers .......................................................................................... 44 4.3.6 Vulnerable files .............................................................................................. 45 4.3.7 Web asset and online device discovery .......................................................... 45 4.4 Automation tools for using dorks ...................................................... 48 5 Cyber investigation Use Cases ............................................... 49 5.1 Threat intelligence hunting ................................................................ 49 iii 5.2 Defaced Sites ..................................................................................... 49 5.2.1 Finding defaced sites by the same defacer group........................................... 52 5.2.2 Finding defacers’ telegram, Facebook, twitter, skype, and other social media accounts ................................................................................................................... 53 5.3 Cybercriminal activities ..................................................................... 55 5.3.1 Cybercriminal forums .................................................................................... 55 5.3.2 Finding cybercrime channels in telegram ...................................................... 57 5.3.3 Finding cybercrimes in paste sites ................................................................. 58 5.3.4 Finding autobuy shops using Google dorks: selly.gg, shoppy.gg .................. 58 5.3.5 Finding ICQ and vk.com channels, and Facebook groups offering cybercriminal products ............................................................................................ 59 5.3.6 Find carding, hacking sites in darknet using dorks and pastebin sites ........... 60 5.3.7 Finding darknet criminal sites using Google dork and web2tor proxies ....... 61 5.3.8 Use Case: Find Clearnet of an onion site with Google dorks ........................ 63 6 Usability Study ........................................................................ 65 6.1 Experimental design .......................................................................... 65 6.2 Measures ............................................................................................ 69 6.3 Research Method ............................................................................... 70 6.4 Result and Discussion ........................................................................ 70 6.4.1 Scales reliability and validity testing ............................................................. 72 6.4.2 Descriptive statistics ...................................................................................... 73 6.4.3 Correlation and Multiple Regression Analysis .............................................. 75 6.5 Discussion, Limitation and Future Research ..................................... 80 7 Conclusion ................................................................................ 81 References ................................................................................... 82 Appendix: Measurement items ................................................. 92 iv List of tables Table 1: Duckduckgo.com search syntax ....................................................................... 16 Table 2: Bing.com search syntax .................................................................................... 17 Table 3: Demographic statistics ...................................................................................... 71 Table 4: Standard deviation, mean, and Cronbach‘s alpha reliability ............................ 73 Table 5: Comparison between the mean and standard deviation of pretest and posttest surveys (Paired Sample Test) .......................................................................................... 74 Table 6: Inter-correlation of constructs .......................................................................... 76 Table 7: Inter-correlation of constructs .......................................................................... 76 Table 8 :Multiple Regression(Pretest) .......................................................................... 77 Table 9: Hypothesis test for pretest ............................................................................... 78 Table 10: Multiple Regression(Post-test) .....................................................................