A Systematic Empirical Analysis of Unwanted Software Abuse, Prevalence, Distribution, and Economics
Total Page:16
File Type:pdf, Size:1020Kb
UNIVERSIDAD POLITECNICA´ DE MADRID ESCUELA TECNICA´ SUPERIOR DE INGENIEROS INFORMATICOS´ A Systematic Empirical Analysis of Unwanted Software Abuse, Prevalence, Distribution, and Economics PH.D THESIS Platon Pantelis Kotzias Copyright c 2019 by Platon Pantelis Kotzias iv DEPARTAMENTAMENTO DE LENGUAJES Y SISTEMAS INFORMATICOS´ E INGENIERIA DE SOFTWARE ESCUELA TECNICA´ SUPERIOR DE INGENIEROS INFORMATICOS´ A Systematic Empirical Analysis of Unwanted Software Abuse, Prevalence, Distribution, and Economics SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF: Doctor of Philosophy in Software, Systems and Computing Author: Platon Pantelis Kotzias Advisor: Dr. Juan Caballero April 2019 Chair/Presidente: Marc Dasier, Professor and Department Head, EURECOM, France Secretary/Secretario: Dario Fiore, Assistant Research Professor, IMDEA Software Institute, Spain Member/Vocal: Narseo Vallina-Rodriguez, Assistant Research Professor, IMDEA Networks Institute, Spain Member/Vocal: Juan Tapiador, Associate Professor, Universidad Carlos III, Spain Member/Vocal: Igor Santos, Associate Research Professor, Universidad de Deusto, Spain Abstract of the Dissertation Potentially unwanted programs (PUP) are a category of undesirable software that, while not outright malicious, can pose significant risks to users’ security and privacy. There exist indications that PUP prominence has quickly increased over the last years, but the prevalence of PUP on both consumer and enterprise hosts remains unknown. Moreover, many important aspects of PUP such as distribution vectors, code signing abuse, and economics also remain unknown. In this thesis, we empirically and sys- tematically analyze in both breadth and depth PUP abuse, prevalence, distribution, and economics. We make the following four contributions. First, we perform a systematic study on the abuse of Windows Authenticode code signing by PUP and malware. We build an infrastructure that classifies potentially ma- licious samples as PUP or malware and use this infrastructure to evaluate 356K sam- ples. We show that most signed samples are PUP and that malware is not commonly signed. We also evaluate the efficacy of Certification Authority (CA) defenses such as identity checks and revocation. Our results suggest that CA identity checks pose some barrier to malware, but do not affect PUP. CA revocations are equally low for both malware and PUP. We conclude that current CA defenses are largely ineffective for PUP. Second, we measure the prevalence of unwanted software on real consumer hosts using telemetry from 3.9 million hosts. We find PUP installed in 54% of the hosts in our dataset. We also analyze the commercial pay-per-install (PPI) service ecosystem showing that commercial PPI services play a major role in the distribution of PUP. Third, we perform an analysis of enterprise security and measure the prevalence of both malware and PUP on real enterprise hosts. We use AV telemetry collected from 28K enterprises and 67 industry sectors with over 82M client hosts. Almost all enterprises, despite their different security postures, encounter some malware or PUP in a three year period. We also observe that some industries, especially those related to finance, secure their systems far better than other industries. Fourth, we perform an analysis of PUP economics. For that, we first propose a i novel technique for performing PUP attribution. Then, we use our technique to iden- tify the entities behind three large Spanish-based PUP operations and measure the profitability of the companies they operate. Our analysis shows that in each opera- tion a small number of people manages a large number of companies, and that the majority of them are shell companies. In the period 2013–2015, the three operations have a total revenue of 202.5M e and net income of 23M e. Finally, we observe a sharp decrease on both revenue and income for all three operations starting mid-2014. We conclude that improved PUP defenses deployed by various software and security vendors significantly impacted the PPI market. ii Resumen de la Tesis Doctoral Los programas potencialmente no deseados (PUP) son una categor´ıa de software que, aunque no totalmente malignos, pueden presentar considerables riesgos a la privacidad y seguridad de los usuarios. Existen indicios que la relevancia del PUP ha aumentado rapidamente´ durante los ultimos´ anos,˜ pero la prevalencia del PUP en equipos infor- maticos de consumidores y empresas es desconocida. Ademas,´ hay varios aspectos importantes del PUP tales como sus vectores de distribucion,´ su abuso de la tecnolog´ıa Windows Authenticode, y sus beneficios economicos´ que siguen siendo desconocidos. En esta tesis se analiza emp´ıricamente y sistematicamente,´ en amplitud y profundidad, el abuso, la prevalencia, la distribucion´ y los beneficios economicos´ del PUP. Esta tesis engloba las siguentes cuatro contribuciones. Primero, presentamos un estudio sistematico´ sobre el abuso del PUP y malware en Windows Authenticode una tecnolog´ıa para firmar digitalmente codigo´ ejecutable. Construimos una infraestructura que clasifica programas como PUP o malware y la usamos para evaluar 356K muestras. Concluimos que la mayor´ıa de las muestras fir- madas son PUP y que el malware normalmente no esta´ firmado. Por otra parte, eval- uamos la eficacia de las defensas usadas por las Autoridades de Certificacion´ (CA) tales como la verificacion´ de identidad y la revocacion.´ Nuestros resultados indican que la verificacion´ de identidad constituye una barrera al malware, pero no afecta al PUP. Las revocaciones de los certificados son m´ınimas tanto en malware como en PUP. Concluimos que las defensas de los CAs no son eficaces para el PUP. Segundo, medimos la prevalencia del PUP en equipos informaticos reales de usuar- ios usando telemetr´ıa de 3.9 millones de sistemas. Detectamos PUP en 54% de los sistemas en nuestros datos. Adicionalmente, analizamos el ecosistema de servicios comerciales de pago por instalacion´ (PPI) y mostramos que los servicios comerciales PPI desempenan˜ una funcion´ importante en la distribucion´ del PUP. Tercero, presentamos un analisis´ de la seguridad informatica´ en las empresas y medimos la prevalencia del PUP y malware en equipos informaticos´ de empresas. Us- amos la telemetr´ıa de 28K empresas en 67 sectores industriales, con mas´ de 82 millones iii de usuarios en total. Casi todas las empresas, independientemente de sus propiedades, han sido afectadas por algun´ malware o PUP durante el periodo de tres anos.˜ Ademas,´ observamos que algunos sectores industriales, particularmente sectores relacionados con las finanzas, protegen sus sistemas mucho mejor que otros sectores. Cuarto, realizamos un analisis´ economico´ del PUP. Para ello, proponemos una novedosa tecnica´ para realizar atribucion´ del PUP. Usamos nuestra tecnica´ para iden- tificar las entidades detras´ de tres grandes operaciones PUP espanolas˜ y medimos la rentabilidad de sus empresas. Nuestro analisis´ determina que en cada operacion´ hay un pequeno˜ numero´ de personas que controla un gran numero´ de empresas, de las cuales la mayor´ıa son empresas pantallas. En el periodo 2013–2015, las tres operaciones´ tienen ingresos por un total de 202.5M e y beneficios de 23M e. Por ultimo,´ observamos una disminucion´ drastica´ tanto de los ingresos como de los beneficios de las tres opera- ciones desde mediados de 2014. Concluimos que las nuevas defensas desplegadas por grandes empresas han impactado significativamente a los servicios comerciales PPI. iv Contents 1 Introduction1 1.1 Code Signing Abuse .......................... 3 1.2 PUP Prevalence & Distribution in Consumer Hosts.......... 4 1.3 PUP Prevalence in Enterprise Hosts .................. 4 1.4 PUP Economics............................. 5 1.5 Thesis Contributions .......................... 6 1.6 Thesis Organization........................... 7 2 Related Work8 3 Code Signing Abuse 13 3.1 Introduction............................... 13 3.2 Overview ................................ 15 3.2.1 Microsoft Authenticode .................... 16 3.2.2 Authenticode Market...................... 19 3.3 Revoking Timestamped Code ...................... 21 3.4 Approach ................................ 22 3.4.1 Sample Processing ....................... 22 3.4.2 Clustering ............................ 23 3.4.3 PUP classification ....................... 25 3.5 Evaluation ................................ 26 3.5.1 Datasets ............................. 26 3.5.2 Clustering and PUP Classification . 28 3.5.3 Evolution over Time ...................... 29 3.5.4 Authenticode Validation .................... 30 3.5.5 Revocation ........................... 33 3.5.6 Timestamping .......................... 36 3.5.7 Largest Operations ....................... 38 v 3.5.8 Blacklist Coverage....................... 41 3.6 Discussion................................ 41 4 PUP Prevalence & Distribution in Consumer Hosts 43 4.1 Introduction ............................... 43 4.2 Overview and Problem Statement ................... 45 4.2.1 Pay-Per-Install Overview .................... 45 4.2.2 Datasets ............................. 48 4.2.3 Problem Statement ....................... 50 4.3 Identifying PUP Publishers ....................... 51 4.4 Clustering Publishers .......................... 52 4.5 PUP Prevalence ............................. 54 4.6 Classifying Publishers ......................... 56 4.7 PUP Distribution Methods. ....................... 61 4.8 PUP–Malware Relationships ...................... 62 4.9 Domain