
Predicting Vulnerable Software Components

Stephan Neuhaus, Thomas Zimmermann, Andreas Zeller

[Slide: overlapping Security Advisories 2005-14, 2005-12, 2005-13, 2005-15, 2005-16, 2005-41, and 2006-76, each with a title, severity, affected products, and description. The overlaid advisory text is not recoverable.]

Is this new component likely to be vulnerable? What other components are vulnerable?

[Diagram labels: Vulnerability Database, Version Archive, Code, Vulture, Components, Predictor]

Code Complexity

Look for features that are invariant under evolution

Language Imports

[Figure: matrix with cells marked ✔ or ✘ for the imported headers nsIContent.h, nsIContentUtils.h, nsIScriptSecurityManager.h, nsIPrivateDOMEvent.h, and nsReadableUtils.h; domain labels: GUI, Database, Certificates, OS. The row and column layout is not recoverable.]

Research Questions

• How well do imports predict vulnerabilities?
• Can imports be used for classification (vulnerable or not) and for regression (number of vulnerabilities)?

Case Study: Mozilla

• CVS from January 4, 2007
• 14,368 C/C++ files
• 134 Security Advisories since January 2005
• Only 424 vulnerable components (4.05%) ⇒ Prediction is challenging

10,452 components in Mozilla

424 vulnerable components (4.05%)

Mozilla Vulnerabilities

[Figure: treemaps of the Mozilla source tree, labeled by directory (for example security/nss, js, layout, dom, netwerk, content, mailnews, extensions).]

[Figure: two histograms, "Distribution of MFSAs" and "Distribution of Bug Reports". x-axes: Number of MFSAs (1 to 13) and Number of Bug Reports (1 to 24). y-axis: Number of Components (ticks at 1, 2, 5, 20, 50, 300).]

Imports

• 9,066 imports
• 79,541 import relations (x imports y)
• Takes about five minutes to compute

Support Vector Machines

[Figure: support vector machine illustration with the support vectors labeled.]

Experiments

• 40 random splits: 6,968 rows in training set, 3,484 rows in validation set
• Classification: train SVM, compute recall and precision
• Regression: train SVM, compute rank correlation on top 1%
• SVM: linear kernel with default parameters; R implementation (up to 10GB of main memory)

Moderately strong correlation (mostly significant at p < 0.01)
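The setup above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' R implementation: the function names (`extract_imports`, `feature_rows`, `random_split`, `precision_recall`) and the regular expression are hypothetical, imports are approximated by `#include` directives, and the SVM training step itself is left out. The sketch only shows how import relations could become binary feature rows, how a 2:1 random split yields 6,968 training and 3,484 validation rows out of 10,452, and how precision and recall are computed from predictions.

```python
import random
import re

# Assumption: an "import" is approximated by a C/C++ #include directive.
INCLUDE_RE = re.compile(r'^\s*#\s*include\s*[<"]([^>"]+)[>"]', re.MULTILINE)


def extract_imports(source_text):
    """Return the set of header names included by one source text."""
    return set(INCLUDE_RE.findall(source_text))


def feature_rows(components):
    """Map {component name: source text} to binary import-feature rows.

    Returns the sorted list of all imports (the columns) and, per
    component, a 0/1 list saying which of those imports it uses.
    """
    imports = {name: extract_imports(text) for name, text in components.items()}
    columns = sorted(set().union(*imports.values()))
    rows = {
        name: [1 if column in found else 0 for column in columns]
        for name, found in imports.items()
    }
    return columns, rows


def random_split(names, seed):
    """Shuffle component names and split them 2:1 into training and validation."""
    shuffled = sorted(names)
    random.Random(seed).shuffle(shuffled)
    cut = (2 * len(shuffled)) // 3
    return shuffled[:cut], shuffled[cut:]


def precision_recall(actual, predicted):
    """Compute precision and recall for 0/1 labels (1 = vulnerable)."""
    tp = sum(1 for a, p in zip(actual, predicted) if a and p)
    fp = sum(1 for a, p in zip(actual, predicted) if not a and p)
    fn = sum(1 for a, p in zip(actual, predicted) if a and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical toy components; only the header names come from the slides.
toy = {
    "compA": '#include "nsIContent.h"\n#include "nsIScriptSecurityManager.h"\n',
    "compB": '#include "nsReadableUtils.h"\n',
}
columns, rows = feature_rows(toy)
train, valid = random_split([str(i) for i in range(10452)], seed=1)
```

Repeating `random_split` with 40 different seeds and scoring each validation set with `precision_recall` would mirror the 40 random splits listed above; any linear-kernel SVM could be trained on the rows in between.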

[Figure: (a) Precision and Recall: scatter plot over the random splits, x-axis Recall (0.55 to 0.75), y-axis Precision (0.35 to 0.55). (b) Rank Correlation: cumulative distribution, x-axis Rank Correlation (0.2 to 0.7), y-axis Cumulative Distribution (0.0 to 1.0).]

• 2/3 of all vulnerable components detected
• 45% (about 1/2) of predictions correct

Similar Results for Bugs

Packages + Import relationships (Schröter et al., ISESE 2006)

Precision: 66.7% Recall: 69.4%

Binaries + Dependencies (Zimmermann/Nagappan @ Microsoft Research, 2006)

Precision: 64.4% Recall: 75.3%

Predicted Rank   Component               Actual Rank
1                nsDOMClassInfo          3
2                SGridRowLayout          95
3                xpcprivate              6
4                jsxml                   2
5                nsGenericHTMLElement    8
6                jsgc                    3
7                nsISEnvironment         12
8                jsfun                   1
9                nsHTMLLabelElement      18
10               nsHttpTransaction       35
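To make the comparison of predicted and actual ranks concrete, here is a short sketch that computes a rank correlation for the ten rows listed above. It assumes Spearman's coefficient (the slides do not say which rank correlation was used), and the helper names `average_ranks`, `pearson`, and `spearman` are hypothetical. Because it covers only these ten rows, it illustrates the calculation and does not reproduce the reported experiment.

```python
def average_ranks(values):
    """Rank values starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation of two equally long, non-constant sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = sum((x - mean_x) ** 2 for x in xs) ** 0.5
    norm_y = sum((y - mean_y) ** 2 for y in ys) ** 0.5
    return cov / (norm_x * norm_y)


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Predicted and actual ranks copied from the ten rows of the table.
predicted = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
actual = [3, 95, 6, 2, 8, 3, 12, 1, 18, 35]
rho = spearman(predicted, actual)
print(f"Spearman rank correlation over the ten listed components: {rho:.2f}")
```

Average ranks are used because the actual rank 3 appears twice in the table; the shortcut formula based on squared rank differences is exact only when there are no ties.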