Brad-Wardman-Dissertation1.Pdf

Total Page:16

File Type:pdf, Size:1020Kb

Brad-Wardman-Dissertation1.Pdf A SERIES OF METHODS FOR THE SYSTEMATIC REDUCTION OF PHISHING by BRADLEY WARDMAN ANTHONY SKJELLUM, COMMITTEE CHAIR ALAN SPRAGUE CHENGCUI ZHANG JOSE NAZARIO JOSEPH POPINSKI A DISSERTATION Submitted to graduate faculty of the University of Alabama at Birmingham, in partial fulfillment of the requirements for the degree of Doctor of Philosophy BIRMINGHAM, ALABAMA 2011 Copyright by Brad Wardman 2011 A SERIES OF METHODS FOR THE SYSTEMATIC REDUCTION OF PHISHING BRADLEY WARDMAN COMPUTER AND INFORMATION SCIENCES ABSTRACT Phishing continues to expand as efforts to thwart attacks are ineffective and criminals behind these scams operate with apparent impunity. In order to address both issues, this research provides three steps towards the reduction of phishing: identifying phishing websites, collecting phishing evidence, and correlating the phishing incidents. The first step is to identify phishing websites automatically. Experimental results demonstrate that content-based algorithms can classify phishing websites with greater than 90% detection rates while maintaining low false-positive rates. Next, the development of custom software collects additional information and evidence about these phishing websites. In the final step, this research offers two novel algorithms to be employed as clustering metrics for phishing website content. The three steps in this research reduce phishing by blocking potential victims from the malicious content through email filters and browser-based toolbars, gathering evidence against the criminal(s) that is usable by incident investigators, and revealing relationships between phishing websites that can provide investigators with deeper knowledge of phishing activity and thus help to prioritize their apparently, limited resources. i DEDICATION To my loving wife and supportive family, thank you ii TABLE OF CONTENTS 1. Introduction ..................................................................................................................1 1.1 Introduction Statement ..........................................................................................1 1.2 Problem Statement ................................................................................................2 1.2.1 Victim Activities ............................................................................................8 1.2.2 Criminal Activities .......................................................................................12 1.3 Solutions ..............................................................................................................12 1.3.1 New Contributions to Victim Activity .........................................................13 1.3.2 New Contributions to Criminal Activity......................................................17 1.3.3 Reduction of Victim Website Compromise Through Identification of Vulnerable Applications ...................................................18 1.4 Overall Contributions ..........................................................................................19 1.5 Metrics .................................................................................................................21 1.6 Extensibility ........................................................................................................22 1.7 Outline of Dissertation ........................................................................................22 2. Related Work .............................................................................................................25 2.1 Why Fall for Phish? .............................................................................................25 2.2 User Education ....................................................................................................27 2.3 Blacklists & Toolbars ..........................................................................................28 2.4 Email-Based ........................................................................................................31 iii 2.5 URL-Based ..........................................................................................................36 2.6 Content-Based .....................................................................................................39 2.7 Content-Based and URL- Based .........................................................................41 2.7 Phishing Prosecutions .........................................................................................44 2.8 Phishing activity aggregation ..............................................................................46 3. Victim Activities ........................................................................................................48 3.1 Characterization of phishing victims ...................................................................48 3.2 Contributions to victim activities ........................................................................49 3.2.1 Content-based approaches ...........................................................................50 3.2.2 URL-based approaches ................................................................................50 3.3 Current Problems.................................................................................................51 3.3.1 Human verification and takedowns .............................................................51 3.3.2 Blacklists ......................................................................................................54 3.3.3 Fast Phish Detection through Automated Solutions ....................................56 3.4 Content-Based Solutions To Fast Phish Detection .............................................57 3.4.1 Initial Process ...............................................................................................57 3.4.2 Main Index Matching ...................................................................................58 3.4.3 Deep MD5 Matching ...................................................................................60 3.4.4 The UAB Phishing Data Mine .....................................................................70 3.4.5 Partial MD5 Matching .................................................................................74 iv 3.4.6 Deeper Analysis of file Differences .............................................................77 3.4.7 Content-Based Experiments .........................................................................81 3.5 Metrics ...............................................................................................................117 3.6 Summary ...........................................................................................................119 4. Criminal Activities ...................................................................................................121 4.1 Introduction to criminal activities .....................................................................121 4.2 Contributions to deterrence of criminal activities .............................................122 4.3 Gathering phishing evidence .............................................................................123 4.3.1 Phishing Kit Scraper .................................................................................123 4.3.2 Kit Email Extractor ...................................................................................124 4.3.3 Summary of gathering evidence ................................................................133 4.4 Clustering phishing websites .............................................................................133 4.4.1 Initial Deep MD5 Clustering .....................................................................134 4.4.2 Reeling in Big Phish with a Deep MD5 Net ..............................................134 4.4.3 SLINK-style Deep MD5 Clustering ..........................................................147 4.4.4 Attribution of phishing website to kits.......................................................151 4.4.5 Conclusions on Deep MD5 Clustering ......................................................152 4.4.6 Fingerprinting Files: New Tackle to Catch A Phisher ..............................152 4.5 Conclusions .......................................................................................................162 5. Future Work and Conclusions .................................................................................163 v 5.1 Contributions .....................................................................................................164 5.2 Future Work and Extensibility ..........................................................................167 5.2.1 Future Work ...............................................................................................168 5.2.2 Extensibility of Syntactical Fingerprinting ................................................175 5.3 Conclusions .......................................................................................................179 Appendix A – URL-Based Solutions to Phish Detection ................................................182 A.1 Initial Study – Identifying Vulnerable Websites by Analysis of Common Strings in Phishing URLs ..........................................................182 A.1.1 INTRODUCTION .....................................................................................182 A.1.2 Related Work .............................................................................................184 A.1.3 Method .......................................................................................................187
Recommended publications
  • Ispconfig 3 Manual]
    [ISPConfig 3 Manual] ISPConfig 3 Manual Version 1.0 for ISPConfig 3.0.3 Author: Falko Timme <[email protected]> Last edited 09/30/2010 1 The ISPConfig 3 manual is protected by copyright. No part of the manual may be reproduced, adapted, translated, or made available to a third party in any form by any process (electronic or otherwise) without the written specific consent of projektfarm GmbH. You may keep backup copies of the manual in digital or printed form for your personal use. All rights reserved. This copy was issued to: Thomas CARTER - [email protected] - Date: 2010-11-20 [ISPConfig 3 Manual] ISPConfig 3 is an open source hosting control panel for Linux and is capable of managing multiple servers from one control panel. ISPConfig 3 is licensed under BSD license. Managed Services and Features • Manage one or more servers from one control panel (multiserver management) • Different permission levels (administrators, resellers and clients) + email user level provided by a roundcube plugin for ISPConfig • Httpd (virtual hosts, domain- and IP-based) • FTP, SFTP, SCP • WebDAV • DNS (A, AAAA, ALIAS, CNAME, HINFO, MX, NS, PTR, RP, SRV, TXT records) • POP3, IMAP • Email autoresponder • Server-based mail filtering • Advanced email spamfilter and antivirus filter • MySQL client-databases • Webalizer and/or AWStats statistics • Harddisk quota • Mail quota • Traffic limits and statistics • IP addresses 2 The ISPConfig 3 manual is protected by copyright. No part of the manual may be reproduced, adapted, translated, or made available to a third party in any form by any process (electronic or otherwise) without the written specific consent of projektfarm GmbH.
    [Show full text]
  • An Analysis of the Nature of Groups Engaged in Cyber Crime
    International Journal of Cyber Criminology Vol 8 Issue 1 January - June 2014 Copyright © 2014 International Journal of Cyber Criminology (IJCC) ISSN: 0974 – 2891 January – June 2014, Vol 8 (1): 1–20. This is an Open Access paper distributed under the terms of the Creative Commons Attribution-Non- Commercial-Share Alike License, which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited. This license does not permit commercial exploitation or the creation of derivative works without specific permission. Organizations and Cyber crime: An Analysis of the Nature of Groups engaged in Cyber Crime Roderic Broadhurst,1 Peter Grabosky,2 Mamoun Alazab3 & Steve Chon4 ANU Cybercrime Observatory, Australian National University, Australia Abstract This paper explores the nature of groups engaged in cyber crime. It briefly outlines the definition and scope of cyber crime, theoretical and empirical challenges in addressing what is known about cyber offenders, and the likely role of organized crime groups. The paper gives examples of known cases that illustrate individual and group behaviour, and motivations of typical offenders, including state actors. Different types of cyber crime and different forms of criminal organization are described drawing on the typology suggested by McGuire (2012). It is apparent that a wide variety of organizational structures are involved in cyber crime. Enterprise or profit-oriented activities, and especially cyber crime committed by state actors, appear to require leadership, structure, and specialisation. By contrast, protest activity tends to be less organized, with weak (if any) chain of command. Keywords: Cybercrime, Organized Crime, Crime Groups; Internet Crime; Cyber Offenders; Online Offenders, State Crime.
    [Show full text]
  • December 2013 Feature Article: the Year of Surviving Dangerously: Highlights from We Live Security 2013
    December 2013 Feature Article: The Year of Surviving Dangerously: Highlights from We Live Security 2013 Table of Contents The Year of Surviving Dangerously: Highlights from We Live Security 2013 .............................................................3 2013: a Scammer’s Eye View ................................................................................................................................... 11 ESET Corporate News .............................................................................................................................................. 18 The Top Ten Threats ................................................................................................................................................ 18 Top Ten Threats at a Glance (graph) ....................................................................................................................... 21 About ESET .............................................................................................................................................................. 22 Additional Resources ............................................................................................................................................... 22 This month we decided to present a retrospective of all 2013, pertained to Java vulnerability CVE-2013-0422 being added to a so we develop and delve into the most prominent threats that couple of popular exploit packs, thus making it more accessible each month had. Also published an article that specifically
    [Show full text]
  • Technical Report RHUL–MA–2014– 2 01 September 2014
    Leveraging knowledge sharing for preventing and investigating on-line banking frauds: On-line Fraud Centre Salvatore Camillo Zammataro Technical Report RHUL–MA–2014– 2 01 September 2014 Information Security Group Royal Holloway, University of London Egham, Surrey TW20 0EX, United Kingdom www.ma.rhul.ac.uk/tech Salvatore Camillo “Toto” Zammataro Leveraging knowledge sharing for preventing and investigating on-line banking frauds: On-line Fraud Centre MSc Information Security Project Report Supervisor: Dr John Austen Submitted as part of the requirements for the award of the MSc in Information Security at Royal Holloway, University of London. 1 EXECUTIVE SUMMARY For social, technological and political reasons electronic transactions are now the most used payment method in Europe. As such, fraudsters have been focusing on On-Line transactions to gain money through Phishing and Crimeware. These kind of frauds generates losses for EU citizen of an estimated value of 250M€/year. Banks and Law Enforcement Agencies are engaged in the prevention, detection and prosecution of this crime. Some limit of actual legislation (i.e. Data Protection, International Treaties on cyber-crime, Fraud prosecution laws), low speed of communication between Banks and LEAs, and the fraudster’s speed in taking advantage of weaknesses of the system leave the space for improvement. To improve the countermeasures that Banks and LEAs have deployed, this paper suggests the adoption of an InfoSharing Service between national banking system and Law Enforcement Agencies. This service uses a “hub-and-spoke” framework, where LEA is the hub and banks are the spokes. Target of the service is to shorten the time needed to communicate from Banking Fraud Managers to LEAs, and to share the relevant information on fraudster accounts in the whole banking industry.
    [Show full text]
  • Apache Web Server ______
    Apache Web Server _____________________________________________________________________________________________________ Original author(s) Robert McCool Developer(s) Apache Software Foundation Initial release 1995[1] 2.4.9 (March 17, 2014) [±] Stable release Development Active status Written in C, Forth, XML[2] Type Web server License Apache License 2.0 Website httpd.apache.org The Apache HTTP Server , commonly referred to as Apache , is a web server application notable for playing a key role in the initial growth of the World Wide Web.[3] Originally based on the NCSA HTTPd server, development of Apache began in early 1995 after work on the NCSA code stalled. Apache quickly overtook NCSA HTTPd as the dominant HTTP server, and has remained the most popular HTTP server in use since April 1996. In 2009, it became the first web server software to serve more than 100 million websites.[4] Apache is developed and maintained by an open community of developers under the auspices of the Apache Software Foundation. Most commonly used on a Unix-like system,[5] the software is available for a wide variety of operating systems, including Unix, FreeBSD, Linux, Solaris, Novell NetWare, OS X, Microsoft Windows, OS/2, TPF, OpenVMS and eComStation. Released under the Apache License, Apache is open-source software. As of June 2013, Apache was estimated to serve 54.2% of all active websites and 53.3% of the top servers across all domains.[6][7][8][9][10] 1 Apache Web Server _____________________________________________________________________________________________________ Name According to the FAQ in the Apache project website, the name Apache was chosen out of respect to the Native American tribe Apache and its superior skills in warfare and strategy.
    [Show full text]
  • Busting Blocks: Revisiting 47 U.S.C. §230 to Address the Lack of Effective Legal Recourse for Wrongful Inclusion in Spam Filters
    Digital Commons @ Touro Law Center Scholarly Works Faculty Scholarship 2011 Busting Blocks: Revisiting 47 U.S.C. §230 to Address the Lack of Effective Legal Recourse for Wrongful Inclusion in Spam Filters Jonathan I. Ezor Touro Law Center, [email protected] Follow this and additional works at: https://digitalcommons.tourolaw.edu/scholarlyworks Part of the Commercial Law Commons, Communications Law Commons, and the Science and Technology Law Commons Recommended Citation Jonathan I. Ezor, Busting Blocks: Revisiting 47 U.S.C. § 230 to Address the Lack of Effective Legal Recourse for Wrongful Inclusion in Spam Filters, XVII Rich. J.L. & Tech. 7 (2011), http://jolt.richmond.edu/ v17i2/article7.pdf. This Article is brought to you for free and open access by the Faculty Scholarship at Digital Commons @ Touro Law Center. It has been accepted for inclusion in Scholarly Works by an authorized administrator of Digital Commons @ Touro Law Center. For more information, please contact [email protected]. Richmond Journal of Law and Technology Vol. XVII, Issue 2 BUSTING BLOCKS: REVISITING 47 U.S.C. § 230 TO ADDRESS THE LACK OF EFFECTIVE LEGAL RECOURSE FOR WRONGFUL INCLUSION IN SPAM FILTERS By Jonathan I. Ezor∗ Cite as: Jonathan I. Ezor, Busting Blocks: Revisiting 47 U.S.C. § 230 to Address the Lack of Effective Legal Recourse for Wrongful Inclusion in Spam Filters, XVII Rich. J.L. & Tech. 7 (2011), http://jolt.richmond.edu/v17i2/article7.pdf. I. INTRODUCTION: E-MAIL, BLOCK LISTS, AND THE LAW [1] Consider a company that uses e-mail to conduct a majority of its business, including customer and vendor communication, marketing, and filing official documents.
    [Show full text]
  • A Requirements-Based Exploration of Open-Source Software Development Projects – Towards a Natural Language Processing Software Analysis Framework
    Georgia State University ScholarWorks @ Georgia State University Computer Information Systems Dissertations Department of Computer Information Systems 8-7-2012 A Requirements-Based Exploration of Open-Source Software Development Projects – Towards a Natural Language Processing Software Analysis Framework Radu Vlas Georgia State University Follow this and additional works at: https://scholarworks.gsu.edu/cis_diss Recommended Citation Vlas, Radu, "A Requirements-Based Exploration of Open-Source Software Development Projects – Towards a Natural Language Processing Software Analysis Framework." Dissertation, Georgia State University, 2012. https://scholarworks.gsu.edu/cis_diss/48 This Dissertation is brought to you for free and open access by the Department of Computer Information Systems at ScholarWorks @ Georgia State University. It has been accepted for inclusion in Computer Information Systems Dissertations by an authorized administrator of ScholarWorks @ Georgia State University. For more information, please contact [email protected]. PERMISSION TO BORROW In presenting this dissertation as a partial fulfillment of the requirements for an advanced degree from Georgia State University, I agree that the Library of the University shall make it available for inspection and circulation in accordance with its regulations governing materials of this type. I agree that permission to quote from, to copy from, or publish this dissertation may be granted by the author or, in his/her absence, the professor under whose direction it was written or, in his absence, by the Dean of the Robinson College of Business. Such quoting, copying, or publishing must be solely for the scholarly purposes and does not involve potential financial gain. It is understood that any copying from or publication of this dissertation which involves potential gain will not be allowed without written permission of the author.
    [Show full text]
  • Awstats Logfile Analyzer Documentation
    AWStats logfile analyzer 7.4 Documentation Table of contents Release Notes Reference manual What is AWStats / Features Install, Setup and Use AWStats New Features / Changelog Configuration Directives/Options Upgrade Configuration for Extra Sections feature Contribs, plugins and resources Other provided utilities Glossary of terms Other Topics Comparison with other log analyzers FAQ and Troubleshooting Benchmarks AWStats License Plugin Developement AWStats modules for third tools AWStats module for Webmin AWStats module for Dolibarr ERP & CRM Table of contents 1/103 14/07/2015 AWStats logfile analyzer 7.4 Documentation What is AWStats / Features Overview AWStats is short for Advanced Web Statistics. AWStats is powerful log analyzer which creates advanced web, ftp, mail and streaming server statistics reports based on the rich data contained in server logs. Data is graphically presented in easy to read web pages. AWStats development started in 1997 and is still developed today by same author (Laurent Destailleur). However, development is now done on "maintenance fixes" or small new features. Reason is that author spend, since July 2008, most of his time as project leader on another major OpenSource projet called Dolibarr ERP & CRM and works also at full time for TecLib, a french Open Source company. A lot of other developers maintains the software, providing patches, or packages, above all for Linux distributions (fedora, debian, ubuntu...). Designed with flexibility in mind, AWStats can be run through a web browser CGI (common gateway interface) or directly from the operating system command line. Through the use of intermediary data base files, AWStats is able to quickly process large log files, as often desired.
    [Show full text]
  • Characterizing Spam Traffic and Spammers
    2007 International Conference on Convergence Information Technology Characterizing Spam traffic and Spammers Cynthia Dhinakaran and Jae Kwang Lee , Department of Computer Engineering Hannam University, South Korea Dhinaharan Nagamalai Wireilla Net Solutions Inc, Chennai, India, Abstract upraise of Asian power houses like China and India, the number of email users have increased tremendously There is a tremendous increase in spam traffic [15]. These days spam has become a serious problem these days [2]. Spam messages muddle up users inbox, to the Internet Community [8]. Spam is defined as consume network resources, and build up DDoS unsolicited, unwanted mail that endangers the very attacks, spread worms and viruses. Our goal is to existence of the e-mail system with massive and present a definite figure about the characteristics of uncontrollable amounts of message [4]. Spam brings spam and spammers. Since spammers change their worms, viruses and unwanted data to the user’s mode of operation to counter anti spam technology, mailbox. Spammers are different from hackers. continues evaluation of the characteristics of spam and Spammers are well organized business people or spammers technology has become mandatory. These organizations that want to make money. DDoS attacks, evaluations help us to enhance the existing technology spy ware installations, worms are not negligible to combat spam effectively. We collected 400 thousand portion of spam traffic. According to research [5] most spam mails from a spam trap set up in a corporate spam originates from USA, South Korea, and China mail server for a period of 14 months form January respectively. Nearly 80% of all spam are received from 2006 to February 2007.
    [Show full text]
  • Index Images Download 2006 News Crack Serial Warez Full 12 Contact
    index images download 2006 news crack serial warez full 12 contact about search spacer privacy 11 logo blog new 10 cgi-bin faq rss home img default 2005 products sitemap archives 1 09 links 01 08 06 2 07 login articles support 05 keygen article 04 03 help events archive 02 register en forum software downloads 3 security 13 category 4 content 14 main 15 press media templates services icons resources info profile 16 2004 18 docs contactus files features html 20 21 5 22 page 6 misc 19 partners 24 terms 2007 23 17 i 27 top 26 9 legal 30 banners xml 29 28 7 tools projects 25 0 user feed themes linux forums jobs business 8 video email books banner reviews view graphics research feedback pdf print ads modules 2003 company blank pub games copyright common site comments people aboutus product sports logos buttons english story image uploads 31 subscribe blogs atom gallery newsletter stats careers music pages publications technology calendar stories photos papers community data history arrow submit www s web library wiki header education go internet b in advertise spam a nav mail users Images members topics disclaimer store clear feeds c awards 2002 Default general pics dir signup solutions map News public doc de weblog index2 shop contacts fr homepage travel button pixel list viewtopic documents overview tips adclick contact_us movies wp-content catalog us p staff hardware wireless global screenshots apps online version directory mobile other advertising tech welcome admin t policy faqs link 2001 training releases space member static join health
    [Show full text]
  • Computer Group Webinar Session 1 Handout (Online/Internet Fraud)
    NORTH KENT MIND – COMPUTER GROUP WEBINAR SESSION 1 HANDOUT (ONLINE/INTERNET FRAUD) In this session, we will be covering online fraud (internet fraud) and how to protect you against it. What is Online/ Internet Fraud? Internet fraud is a type of fraud or deception, which makes use of the Internet and could involve hiding of information or providing incorrect information for the purpose of tricking victims out of money, property, and inheritance. The most common online frauds include: Greeting Card Scams Whether it’s Christmas or Easter, we all get all kind of holiday greeting cards in our email inbox that seem to be coming from a friend or someone we care. Greeting card frauds are another old Internet scams, If you open such an email and click on the card, you usually end up with malicious software that is being downloaded and installed on your operating system. The malware may be an annoying program that will launch pop- ups with ads, unexpected windows all over the screen. If this happens, your computer may also start sending private data and financial information to a fraudulent server controlled by IT criminals. Greeting Card Scam Example: 1 North Kent Mind – Computer Group Webinar Session 1 Handout Phishing Email Scams Phishing is a type of online scam where criminals send an email that appears to be from a legitimate company and ask you to provide sensitive information. This is usually done by including a link that will appear to take you to the company’s website to fill in your information – but the website is a clever fake and the information you provide goes straight to the crooks behind the fraud.
    [Show full text]
  • Download Appendix A
    Appendix A Resources Links This list includes most of the URLs referenced in this book. For more online resources and further reading, see our website at http://bwmo.net/ Anti-virus & anti-spyware tools • AdAware, http://www.lavasoftusa.com/software/adaware/ • Clam Antivirus, http://www.clamav.net/ • Spychecker, http://www.spychecker.com/ • xp-antispy, http://www.xp-antispy.de/ Benchmarking tools • Bing, http://www.freenix.fr/freenix/logiciels/bing.html • DSL Reports Speed Test, http://www.dslreports.com/stest • The Global Broadband Speed Test, http://speedtest.net/ • iperf, http://dast.nlanr.net/Projects/Iperf/ • ttcp, http://ftp.arl.mil/ftp/pub/ttcp/ 260! The Future Content filters • AdZapper, http://adzapper.sourceforge.net/ • DansGuard, http://dansguardian.org/ • Squidguard, http://www.squidguard.org/ DNS & email • Amavisd-new, http://www.ijs.si/software/amavisd/ • BaSoMail, http://www.baso.no/ • BIND, http://www.isc.org/sw/bind/ • dnsmasq, http://thekelleys.org.uk/dnsmasq/ • DJBDNS, http://cr.yp.to/djbdns.html • Exim, http://www.exim.org/ • Free backup software, http://free-backup.info/ • Life with qmail, http://www.lifewithqmail.org/ • Macallan Mail Server, http://macallan.club.fr/ • MailEnable, http://www.mailenable.com/ • Pegasus Mail, http://www.pmail.com/ • Postfix, http://www.postfix.org/ • qmail, http://www.qmail.org/ • Sendmail, http://www.sendmail.org/ File exchange tools • DropLoad, http://www.dropload.com/ • FLUFF, http://www.bristol.ac.uk/fluff/ Firewalls • IPCop, http://www.ipcop.org/ • L7-filter, http://l7-filter.sourceforge.net/
    [Show full text]