A Crawler-Based Study of Spyware in the Web
Alex Moshchuk, Tanya Bragin, Steve Gribble, Hank Levy

What is spyware?
- Broad class of malicious and unwanted software
- Steals control of a PC for the benefit of a 3rd party
- Characteristics:
  - Installs without user knowledge or consent
  - Hijacks computer’s resources or functions
  - Collects valuable information and relays it to a 3rd party
  - Resists detection and uninstallation
You know it when you see it

How do people get spyware?
- Spyware piggybacked on popular software
  - Kazaa, eDonkey
- Drive-by downloads
  - Web page installs spyware through browser
  - With or without user consent
- Trojan downloaders
  - Spyware downloads/installs more spyware

Why measure spyware?
- Understand the problem before defending against it
- Many unanswered questions
  - What’s the spyware density on the web?
  - Where do people get spyware?
  - How many spyware variants are out there?
  - What kinds of threats does spyware pose?
- New ideas and tools for:
  - Detection
  - Prevention

Approach
- Large-scale study of spyware:
  - Crawl “interesting” portions of the Web
  - Download content
  - Determine if it is malicious
- Two strategies:
  - Executable study
    - Find executables with known spyware
  - Drive-by download study
    - Find Web pages with drive-by downloads

Outline
- Introduction
- Executable file study
- Drive-by download study
- Summary
- Conclusions

Analyzing executables
- Web crawler collects a pool of executables
- Analyze each in a virtual machine:
  - Clone a clean WinXP VM
  - Automatically install executable
  - Run analysis to see what changed
    - Currently, an anti-spyware tool (Ad-Aware)
- Average analysis time: 90 sec. per executable

Executable study results
- Crawled 32 million pages in 9,000 domains
- Downloaded 26,000 executables
- Found spyware in 12.3% of them
  - Most installed just one spyware program
  - Only 6% installed three or more spyware variants
- Few spyware variants encountered in practice
  - 142 unique spyware threats

Main targets
- Visit a site and download a program
- What’s the chance that you got spyware?
[Bar chart, values not preserved: % of executables that are infected (axis 0 to 30), by site category: news, kids, random, pirate, music and movies, wallpapers and screensavers, games, celebrities, blacklisted]

Popularity
- A small # of sites have a large # of spyware executables [plot not preserved]
- A small # of spyware variants are responsible for the majority of infections [plot not preserved]

Types of spyware
- Quantify the kinds of threats posed by spyware
- Consider five spyware functions
- What’s the chance an infected executable contains each function?

  Function            % of infected executables
  Keylogger           0.05%
  Dialer              1.2%
  Trojan downloader   12%
  Browser hijacker    62%
  Adware              88%

Example of a Nasty Executable
- http://aaa1screensavers.com/
  - “Let all your worries melt away into this collection of clouds in the sky – 100% free!”
  - http://aaa1screensavers.com/free/clouds.exe
- Installs 11 spyware programs initially
- Includes a trojan downloader; continually installs more spyware
  - 10 more within first 20 minutes
- 12 new items on desktop, 3 browser toolbars
- Shows an ad for every 1.5 pages you visit
- CPU usage is constantly 100%
- No uninstallers
- Ad-Aware can’t clean
- System stops responding in 30 mins
  - Restarting doesn’t help
- Unusable system and no screensaver!
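The per-executable procedure from the "Analyzing executables" slide (clone a clean VM, install the program, scan for changes) can be sketched as a small loop. This is a minimal illustration, not the authors' tool: the four callables `clone_clean_vm`, `install`, `scan`, and `discard` are hypothetical stand-ins for whatever VM and anti-spyware tooling is available, and `serial_hours` assumes the runs happen one at a time.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class AnalysisResult:
    executable: str
    threats: List[str]  # threat names reported by the scanner; empty means clean

    @property
    def infected(self) -> bool:
        return bool(self.threats)


def analyze_pool(
    executables: Iterable[str],
    clone_clean_vm: Callable[[], object],    # hypothetical: returns a fresh VM handle
    install: Callable[[object, str], None],  # hypothetical: installs one executable in the VM
    scan: Callable[[object], List[str]],     # hypothetical: runs the anti-spyware scan
    discard: Callable[[object], None],       # hypothetical: destroys the used clone
) -> List[AnalysisResult]:
    """Analyze each executable in its own clean VM clone."""
    results = []
    for exe in executables:
        vm = clone_clean_vm()  # fresh clone, so one run cannot contaminate the next
        try:
            install(vm, exe)
            results.append(AnalysisResult(exe, scan(vm)))
        finally:
            discard(vm)  # always throw the used clone away
    return results


def infected_fraction(results: List[AnalysisResult]) -> float:
    """Fraction of analyzed executables with at least one reported threat."""
    if not results:
        return 0.0
    return sum(r.infected for r in results) / len(results)


def serial_hours(count: int, seconds_each: float = 90.0) -> float:
    """Wall-clock hours if `count` analyses run strictly one after another."""
    return count * seconds_each / 3600.0
```

With the figures from the slides, `serial_hours(26_000)` evaluates to 650.0, or about 27 days if the 26,000 executables were analyzed one at a time at 90 seconds each.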
Outline
- Introduction
- Executable file study
- Drive-by download study
- Summary
- Conclusions

Finding drive-by downloads
- Evaluate the safety of browsing the Web
- Approach: automatic virtual browsing
  - Render pages in a real browser inside a clean VM
    - Internet Explorer
    - Mozilla Firefox
  - Identify malicious pages
    - Define triggers for suspicious browsing activity
    - Run anti-spyware check only when trigger fires

Event triggers
- Real-time monitoring for non-normal behavior:
  - Process creation
  - File events
    - Example: foo.exe written outside IE folders
  - Registry events
    - Example: new auto-start entry for foo.exe
- No false negatives (theoretically)
- 41% false positives:
  - Legitimate software installations
  - Background noise
  - Spyware missed by our anti-spyware tool

More on automatic browsing
- Caveats and tricks
  - Restore clean state before navigating to next page
  - Speed up virtual time
  - Monitor for crashes and freezes
- Deciding what to say to security prompts:
  - “yes”
    - Emulate user consent
  - “no” (or no prompt)
    - Find security exploits

Example of a security exploit
- Local help objects bypass security restrictions; unsecured “local zone”
- Cross-zone scripting vulnerability in ActiveX Help allows JavaScript to inject code into a local help control
[Diagram, layout not preserved. Labels, in extraction order: http://www.1000dictionaries.com/free_games_1.html; Ads; Help ActiveX Control; <iframe 1>; http://www.tribeca.hu/test.php; C:\windows\helpctr\tools.htm; Check browser and referrer; <object 1>; inject code; <iframe 2>; <object 2>; JavaScript; VBscript; http://www.tribeca.hu/ie/writehta.txt: GET http://www.tribeca.hu/ie/mhh.exe, save as c:\calc.exe, run]

Drive-by download results (unpatched Internet Explorer, unpatched WinXP)
- Examined 50,000 pages
- 5.5% carried drive-by downloads
  - 1.4% exploited browser vulnerabilities
[Bar chart, values not preserved: % of pages with drive-by downloads (axis 0 to 35), legend "browser exploits" and "with user consent", by site category: news, kids, random, wallpapers and screensavers, celebrities, blacklist, music and movies, games, pirate]

Types of spyware
- Is drive-by download spyware more dangerous?

  Function            Executables   Drive-by Downloads
  Keylogger           0.05%         0%
  Dialer              1.2%          0.2%
  Trojan Downloader   12%           50%
  Browser hijacker    62%           84%
  Adware              88%           75%

Is Firefox better than IE?
- Repeat drive-by download study with Mozilla Firefox
- Found 189 (0.4%) pages with drive-by downloads
  - All require user consent
  - All are based on Java
  - Work in other browsers
- Firefox is not 100% safe
- However, much safer than IE

  Category    Pages with drive-by downloads
  adult       0
  celebrity   33
  games       0
  kids        0
  music       1
  news        0
  pirate      132
  random      0
  wallpaper   0
  blacklist   23
  Total       189

Summary
- Lots of spyware on the Web
  - 1 in 8 programs is infected with spyware
  - 1 in 18 Web pages has a spyware drive-by download
  - 1 in 70 Web pages exploits browser vulnerabilities
- Most of it is just annoying (adware)
  - But a significant fraction poses a big risk
- Spyware companies target specific popular content
  - Most piggy-backed spyware in games & celebrity sites
  - Most drive-by downloads in pirate sites
- Few spyware variants are encountered in practice

Conclusion and Future Work
- Addressed key questions about spyware
- Built useful tools and infrastructure
- More details: A Crawler-based Study of Spyware in the Web, NDSS06
- Looking forward:
  - Real-time protection with a trigger-based Web proxy
  - Automatically detect new spyware
    - Use triggers as truth
  - Increase the scale of the study
  - Study change of spyware over time (see paper!)

Questions?
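The "1 in N" figures on the Summary slide can be checked against the percentages reported on the results slides, and the Firefox per-category counts can be checked against their stated total. The short sketch below does both; the helper name `one_in_n` and the dictionary labels are illustrative choices, while the numbers are copied from the slides.

```python
def one_in_n(percent: float) -> float:
    """Convert a percentage into the N of a "1 in N" statement."""
    if percent <= 0:
        raise ValueError("percent must be positive")
    return 100.0 / percent


# Percentages as reported on the results slides.
reported = {
    "executables infected with spyware": 12.3,
    "pages with a drive-by download": 5.5,
    "pages exploiting a browser vulnerability": 1.4,
}

for label, pct in reported.items():
    print(f"{label}: {pct}% is about 1 in {one_in_n(pct):.1f}")

# Per-category Firefox counts from the "Is Firefox better than IE?" slide.
firefox_counts = {
    "adult": 0, "celebrity": 33, "games": 0, "kids": 0, "music": 1,
    "news": 0, "pirate": 132, "random": 0, "wallpaper": 0, "blacklist": 23,
}
firefox_total = sum(firefox_counts.values())
print(f"Firefox pages with drive-by downloads: {firefox_total}")
```

The three ratios come out at roughly 8.1, 18.2, and 71.4, which the Summary slide rounds to 8, 18, and 70, and the category counts sum to the stated total of 189.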