A Forensic Web Log Analysis Tool: Techniques And
Total Page:16
File Type:pdf, Size:1020Kb
A FORENSIC WEB LOG ANALYSIS TOOL: TECHNIQUES AND IMPLEMENTATION Ann Fry A thesis in The Department of Concordia Institute for Information Systems Engineering Presented in Partial Fulfillment of the Requirements For the Degree of Master of Information Systems Security Concordia University Montreal,´ Quebec,´ Canada September 2011 c Ann Fry, 2011 Concordia University School of Graduate Studies This is to certify that the thesis prepared By: Ann Fry Entitled: A Forensic Web Log Analysis Tool: Techniques and Implementa- tion and submitted in partial fulfillment of the requirements for the degree of Master of Information Systems Security complies with the regulations of this University and meets the accepted standards with respect to originality and quality. Signed by the final examining commitee: Chair Dr. Benjamin Fung External Examiner Dr. Otmane Ait-Mohamed Examiner Dr. Amir Youssef Supervisor Dr. Mourad Debbabi Approved Dr. Mourad Debbabi Chair of Department or Graduate Program Director 20 Dr. Nabil Esmail, Dean Faculty of Engineering and Computer Science Abstract A Forensic Web Log Analysis Tool: Techniques and Implementation Ann Fry Methodologies presently in use to perform forensic analysis of web applications are decidedly lacking. Although the number of log analysis tools available is exceedingly large, most only employ simple statistical analysis or rudimentary search capabilities. More precisely these tools were not designed to be forensically capable. The threat of online assault, the ever growing reliance on the performance of necessary services conducted online, and the lack of efficient forensic methods in this area provide a background outlining the need for such a tool. The culmination of study emanating from this thesis not only presents a forensic log analysis framework, but also outlines an innovative methodology of analyzing log files based on a concept that uses regular expressions, and a variety of solutions to problems associated with existing tools. The implementation is designed to detect critical web application security flaws gleaned from event data contained within the access log files of the underlying Apache Web Service (AWS). Of utmost importance to a forensic investigator or incident responder is the generation of an event timeline preceeding the incident under investigation. Regular expressions power the search capability of our framework by enabling the detection of a variety of injection-based attacks that represent significant timeline interactions. The knowledge of the underlying event structure of each access log entry is essential to efficiently parse log files and determine timeline interactions. Another feature added to our tool includes the ability to modify, remove, or add regular expressions. This feature addresses the need for investigators to adapt the environment to include investigation specific queries along with suggested default signatures. The regular expressions are signature definitions used to detect attacks toward both applications whose functionality requires a web service and the service itself. The tool provides a variety of default vulnerability signatures to scan for and outputs resulting detections. iii Acknowledgments I would like to thank Dr. Mourad Debbabi, my advisor, for his continuous support and invaluable guidance throughout my academic program. He welcomed me to the Forensics research team in September of 2006. He gave me the freedom to pursue independent ideas, while challenging me with “why, what, how” from time to time to help me clarify my mind. My appreciation extends to the members on my graduation committee: Dr. Amir Youssef, Dr. Otmane Ait-Mohamed, and Dr. Benjamin Fung. Their valuable suggestions helped enhance the content of this thesis. My sincere thanks to my fellow researchers: Adam, Marc-Andre, Nadia, Hatim (R.I.P.), Asaad and Ali. I would also like to thank in particular Lehan Meng for assistance completing the finishing touches on the implementation, reference checking and an effective user guide. Special appreciation is extended to Yosr Jarraya for assistance in corrections and revisions. Special thanks go to Emerson, Simon, Ali, Simsong, Metr0, gdead, Moose and Daemons. Finally and most importantly, my deepest appreciation goes to my parents and my family, without whose support this would not have been possible. Mom, Dad, Tricia, Mike, Joey, Liam and Elliot; the furry members, Luna, Sunshine, Murphy, Merlin, Nexus (R.I.P.) and Orion (R.I.P.); I love you. Many thanks to them for their continuous support, understanding, licks and nudges of encouragement insisting that I get back to work. iv Contents List of Figures xi List of Tables xii Glossary xiv Acronyms xvi 1 Introduction 1 1.1Motivations......................................... 1 1.2ProblemStatement..................................... 4 1.3Objectives.......................................... 5 1.4Contributions........................................ 5 1.5Organization........................................ 7 2 Preliminaries 8 2.1LogAnalysis........................................ 8 2.2WebApplications...................................... 10 2.2.1 ClientSide..................................... 11 2.2.2 ServerSide..................................... 15 2.3OtherApplications..................................... 30 v 2.3.1 Gateway....................................... 30 2.3.2 Tunnels . ..................................... 31 2.3.3 Agents........................................ 32 2.3.4 Caches........................................ 33 2.3.5 Proxies....................................... 33 2.3.6 RetailWebsites................................... 34 2.3.7 BankingandE-Commerce............................. 35 2.3.8 Online E-Mail, Social Networking, Web Logs, and Interactive Information Sites 35 2.3.9 CorporateDevelopedIntranetorVendorSuppliedApplications........ 37 2.4RegularExpressions.................................... 38 2.4.1 Syntax........................................ 39 2.4.2 Regular Expression Control Sequences . ..................... 46 2.5Summary.......................................... 48 3 Log Analysis Tools 50 3.1ApplicationSpecificLogAnalysisTools......................... 51 3.1.1 AcctandAcctsum................................. 51 3.1.2 Analog........................................ 52 3.1.3 Anteater....................................... 52 3.1.4 AWStats...................................... 53 3.1.5 BreadboardBIWebAnalytics.......................... 54 3.1.6 Calamaris...................................... 55 3.1.7 Chklogs....................................... 55 3.1.8 COREWisdom................................... 56 3.1.9 EventLogAnalyzer................................. 56 vi 3.1.10FtpweblogandWwwstat............................. 57 3.1.11 Funnel WebR Analyzer.............................. 57 3.1.12Http-analyze.................................... 58 3.1.13Logjam....................................... 58 3.1.14 Logparser ...................................... 59 3.1.15Lire......................................... 60 3.1.16 Logrep ........................................ 60 3.1.17 Logstalgia ...................................... 60 3.1.18Mywebalizer.................................... 61 3.1.19OpenWebAnalytics................................ 61 3.1.20Pyflag........................................ 62 3.1.21Sawmill....................................... 62 3.1.22SquidjandScansquidlog.............................. 63 3.1.23 Swatch, Logsurfer and Tenshi ........................... 64 3.1.24Visitors....................................... 66 3.1.25Weblogmon..................................... 67 3.2LogAnalysisDevelopmentFrameworks......................... 67 3.2.1 Apachedb, mod log sql andtheApache::DBProject.............. 67 3.2.2 CascadeSoftwarePackages............................ 68 3.2.3 CrystalReports................................... 69 3.2.4 JasperReports................................... 70 3.2.5 SimpleEventCorrelator(SEC).......................... 70 3.3IntrusionDetectionandIntegrityMonitoringTools................... 71 3.3.1 ArcSight Logger and ArcSight ESM ....................... 72 vii 3.3.2 Guard26....................................... 73 3.3.3 OsirisandSamhain................................ 73 3.3.4 OSSec........................................ 75 3.3.5 Snort........................................ 75 3.3.6 Splunk........................................ 76 3.4Conclusion......................................... 77 4 Attack Detection Using Regular Expressions 83 4.1ExploitingWebApplications............................... 84 4.2Methodology........................................ 84 4.3InjectionAttacks...................................... 85 4.3.1 Protocol-RelatedInjection............................. 85 4.3.2 SQLInjection.................................... 87 4.3.3 OperatingSystemCommandInjection...................... 89 4.3.4 Local or Remote File Inclusion .......................... 93 4.4CrossSiteRequestForgery................................ 96 4.4.1 DetectsURL-,Name-,JSON,andReferrer-ContainedPayloads........ 97 4.5CrossSiteScripting.................................... 99 4.5.1 XML-JavascriptDOM,PropertiesorMethods................. 100 4.5.2 JavascriptConcatenationPatterns........................ 102 4.5.3 DetectsJavaScriptObfuscatedbyBaseEncoding................ 102 4.5.4 DetectsHTMLBreakingInjection........................ 103 4.5.5 AttributeBreakingInjections........................... 104 4.5.6 VBScriptInjection................................. 105 4.6DenialofService.....................................