
Empirical Studies of JavaScript-based Web Application Reliability

Karthik Pattabiraman¹ Frolin Ocariza¹ Kartik Bajaj¹ Ali Mesbah¹ Benjamin Zorn²

¹ University of British Columbia (UBC), ² Microsoft Research (MSR)

My Research

• Building reliable and secure software applications

• Compiler & runtime techniques for error resilience
  – Partitioning data for differential resilience [ASPLOS’11]
  – Error detection in different programs [DSN’12][DSN’13]
  – Fault injection techniques and tools [DSN’14][ISPASS’14]

• This tutorial
  – Reliability of modern web applications (Part 1)
  – Tools for building robust web applications (Part 2)

Copyright: Karthik Pattabiraman, 2014

Web 2.0 Applications

Web 2.0 Application: Amazon.com

Web 2.0 applications allow rich UI functionality within a single web page.

[Figure: annotated Amazon.com page, with labels such as menu, search bar, gadget, Amazon’s own ad, and third-party ad]

Modern Web Applications: JavaScript

• JavaScript: implementation of the ECMAScript standard
  – Client-side JavaScript: used to develop web apps

• Executes in the client’s browser and sends messages
• Responsible for the web application’s core functionality
• Not easy to write code in: has many “evil” features

JavaScript: History

Brief History of JavaScript (Source: TomBarker.com)

JavaScript (JS) had to “look like Java” only less so, be Java’s dumb kid brother or boy-hostage sidekick. Plus, I had to be done in ten days or something worse than JS would have happened – (Inventor of JavaScript)

JavaScript: Prevalence

• 97 of the Alexa top 100 websites use JavaScript
• Thousands of lines of code, often > 10,000

[Figure: bar chart of lines of JavaScript code per website (y-axis: 0 to 90,000) for popular sites such as Google, Yahoo, YouTube, Amazon, CNN, and Baidu]

JavaScript: “Good” or “Evil”?

[Figure: eval calls per web application, log scale from 0.1 to 100,000 (Source: Richards et al. [PLDI-2010])]

Real web applications do not stick to the “good” parts.

Studies of JavaScript Web Applications

• Performance and parallelism: JSMeter [Ratanaworabhan-2010], [Richards-2009], [Fortuna-2011]
• Reliability: ?
• Security and privacy: [Yue-2009], Gatekeeper [Guarnieri-2009], [Jang-2010]

Goal: Study and improve the reliability of JavaScript web applications

Does Reliability Matter?

• Snapshot of iFeng.com: leading media website in China

[Figure: page snapshot showing the message “an error occurred when processing this directive”]

This Talk

• Motivation and Approach

• Three approaches for measuring JS reliability
  – Error Messages [ISSRE 2011], with F. Ocariza and B.G. Zorn
  – Bug Reports [ESEM 2013], with F. Ocariza, K. Bajaj, A. Mesbah
  – Stack Overflow Reports [MSR 2014], with K. Bajaj and A. Mesbah

• Conclusion and Next Steps

JSER: JavaScript Error Messages

• All exceptions thrown are logged to the JS console

[Figure: JS console showing multiple exceptions]

JSER: Error Messages vs. Static Analysis

• No false positives, unlike static analysis

• Capture interactions with third-party code (advertisements)

• Capture interactions with the DOM

JSER: Tools

• Chose 50 web applications from the Alexa top 100
• Created Selenium tests for normal interactions
• Captured JavaScript errors printed to Firebug

JSER: Research Questions

• Do errors occur in web apps and, if so, what categories do they fall in?

• How do errors correlate with static and dynamic characteristics of the app?

• How do errors vary by speed of testing? Are they all deterministic?

JSER: Method

Each error message is identified by three attributes:

1. Description of error message

2. Line of code corresponding to error

3. Domain number and line number

Two errors are different if any attribute is different
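The distinctness rule above can be sketched as a small de-duplication step. This is only an illustration, not the study's tooling: the field names `message`, `codeLine`, and `location` and the sample entries are assumptions made for the example.

```javascript
// Build a key from the three attributes of an error.
// The field names (message, codeLine, location) are hypothetical.
function errorKey(err) {
  return JSON.stringify([err.message, err.codeLine, err.location]);
}

// Two errors count as the same only if all three attributes match.
function countDistinctErrors(errors) {
  const seen = new Set();
  for (const err of errors) {
    seen.add(errorKey(err));
  }
  return seen.size;
}

// Made-up log: the first two entries are identical, the third differs in location.
const sampleLog = [
  { message: "x is undefined", codeLine: "x.focus()", location: "a.example.com:10" },
  { message: "x is undefined", codeLine: "x.focus()", location: "a.example.com:10" },
  { message: "x is undefined", codeLine: "x.focus()", location: "b.example.com:10" },
];
console.log(countDistinctErrors(sampleLog)); // 2
```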

JSER: Error Frequencies Results

• Average of 4 distinct error messages for each app
  – Standard dev: 3
  – Max: 16 (Cnet)
  – Min: 0 (Google)

[Figure: total distinct errors per web application (y-axis: 0 to 18)]

JSER: Error Classification Results

• 94% of errors fall into four predominant categories

[Figure: pie chart “Distribution of Error Messages”. Legend: Permission Denied, Null Exception, Undefined Symbol, Syntax, and Miscellaneous errors. Slice sizes: 4%, 6%, 9%, 27%, 54%.]

JSER: Research Questions

• Errors occur in web applications (4 per application on average) and fall into four categories.

• How do errors correlate with static and dynamic characteristics of the app?

• How do errors vary by speed of testing? Are they all deterministic?

JSER: Effect of Testing Speed

• Varied testing speed for replaying events in Selenium
• Performed three executions at each testing speed

Fast: 0 ms, Medium: 500 ms, Slow: 1000 ms

JSER: Testing Speed Results (CNN)

Occurrences per run (F = fast, M = medium, S = slow; three runs each):

Error message (shortened)                                              F1 F2 F3  M1 M2 M3  S1 S2 S3
Permission Denied for view.atdmt.com to call on marquee.blogs.cnn.com   4  4  4   1  3  3   2  2  3
targetWindow.cnnad showAd is not a function                             0  2  5   0  0  0   0  0  0
window.parent.CSIManager is undefined                                   0  0  0   0  0  0   1  1  0

JSER: Effect of Testing Speed

• All three testing modes expose different errors

[Figure: total distinct errors per application, compared with the errors found in fast, medium, and slow mode]

JSER: Non-Determinism

• More than 70% of errors are non-deterministic

[Figure: total non-deterministic errors per application]
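As an illustration of what non-determinism means here, the sketch below classifies the three CNN errors using the per-run counts from the testing-speed results. The criterion is an assumption made for this example (an error is treated as deterministic only if its count is identical across the three runs of every speed); the slides do not spell out the study's exact definition.

```javascript
// Occurrence counts per run, copied from the CNN results (three runs per speed).
const cnnRuns = {
  "Permission Denied for view.atdmt.com": { fast: [4, 4, 4], medium: [1, 3, 3], slow: [2, 2, 3] },
  "showAd is not a function": { fast: [0, 2, 5], medium: [0, 0, 0], slow: [0, 0, 0] },
  "window.parent.CSIManager is undefined": { fast: [0, 0, 0], medium: [0, 0, 0], slow: [1, 1, 0] },
};

// Assumed criterion: within every speed, all runs must report the same count.
function isDeterministic(countsBySpeed) {
  return Object.values(countsBySpeed).every(
    (runs) => runs.every((count) => count === runs[0])
  );
}

for (const [message, counts] of Object.entries(cnnRuns)) {
  const label = isDeterministic(counts) ? "deterministic" : "non-deterministic";
  console.log(message + ": " + label);
}
```

Under this assumed criterion, all three CNN errors come out as non-deterministic, because each has at least one speed whose three runs disagree.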

JSER: Research Questions

• Errors occur in web applications (4 per application on average) and fall into four categories.

• How do errors correlate with static and dynamic characteristics of the app?

• Error occurrences vary with speed of testing. About 70% of errors are non-deterministic.

JSER: Static/Dynamic Characteristics

Static characteristics (measured using Phoenix & Firebug plugins)

• Alexa rank
• Bytes of JavaScript code
• Number of domains
• Domains containing JS

Dynamic characteristics (from Richards et al. [PLDI-2010])

• Number of called functions
• Number of eval calls
• Properties deleted
• Object inheritance overridings

JSER: Correlations Summary

Static characteristics

• Alexa rank
• Bytes of JavaScript code
• Number of domains
• Domains containing JS

Dynamic characteristics

• Number of called functions
• Number of eval calls
• Properties deleted
• Object inheritance overridings

JSER: Research Questions

• Errors occur in web applications (4 per application on average) and fall into four categories.

• Errors correlate with Alexa rank and number of domains, but not with lines of code or eval calls.

• Error occurrences vary with speed of testing. About 70% of errors are non-deterministic.

JSER: Implications of the Results

• Developers
  – Need to make code robust against other code/scripts
  – Make sure interactions with the DOM are checked

• Testers
  – Perform integration testing to see effects of ads
  – Need to test at multiple testing speeds, multiple times

• Static analysis tool developers
  – Target most common classes of errors
  – Need to model the DOM in the analysis

This Talk

• Motivation and Approach

• Three approaches for measuring JS reliability
  – Error Messages [ISSRE 2011], with F. Ocariza and B.G. Zorn
  – Bug Reports [ESEM 2013], with F. Ocariza, K. Bajaj, A. Mesbah
  – Stack Overflow Reports [MSR 2014], with K. Bajaj and A. Mesbah

• Conclusion and Next Steps

Bug Report Study: Goals

• What errors/mistakes cause JavaScript faults?

• What impact do JavaScript faults have?

Bug report study of twelve popular, open-source JavaScript applications

Bug Report Study: Objects

Eight JavaScript Web Applications

Four JavaScript Libraries

Bug Report Study: Method

• Collect bug reports from bug repositories
  – Focus on bugs that are marked fixed to avoid spurious bugs
  – Organized into a uniform format (XML file)

1. Search for all bug reports that have the word “JavaScript”
2. Filter out reports that are not marked “fixed” OR where the fault does not involve JS
3. Pick the first 30 reports and analyze them manually to determine cause/impact
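The search, filter, and pick steps of the method can be sketched as a simple pipeline. This is a hedged illustration only: the report shape (`id`, `text`, `status`, `involvesJS`) is hypothetical, the keyword match is assumed to be case-insensitive, and `involvesJS` stands in for a judgment that the study made manually.

```javascript
// Hypothetical report shape: { id, text, status, involvesJS }.
function selectReports(reports, limit = 30) {
  return reports
    // Step 1: keep reports that mention the word "JavaScript" (case-insensitive by assumption).
    .filter((r) => /javascript/i.test(r.text))
    // Step 2: drop reports not marked "fixed" or whose fault does not involve JS.
    .filter((r) => r.status === "fixed" && r.involvesJS)
    // Step 3: keep the first `limit` reports for manual analysis.
    .slice(0, limit);
}

// Made-up sample data for illustration.
const sampleReports = [
  { id: 1, text: "JavaScript error on page load", status: "fixed", involvesJS: true },
  { id: 2, text: "Typo in documentation", status: "fixed", involvesJS: false },
  { id: 3, text: "javascript menu does not open", status: "open", involvesJS: true },
  { id: 4, text: "Crash in JavaScript date picker", status: "fixed", involvesJS: true },
];
console.log(selectReports(sampleReports).map((r) => r.id)); // [ 1, 4 ]
```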

Bug Report Study: Questions

• RQ1: What types of JavaScript faults occur in web apps?

• RQ2: What is the impact of JavaScript faults?

• RQ3: How long does it take to fix a JavaScript fault?

• RQ4: Are JavaScript faults browser-specific?

Bug Report Study: Categories

Incorrect Method Parameter Fault: Unexpected or invalid value passed to a JS method or assigned to a JS property

DOM-Related Fault: The method is a DOM API method
  – Accounts for around two-thirds of JavaScript faults

Bug Report Study: DOM

[Figure: DOM tree with nodes including head, body, table, tr, div, p, script, and a text node “Hello world”; one node is marked “Want to retrieve this element”]

Bug Report Study: DOM-Related Faults

JavaScript code: var x = document.getElementById("elme");

Nonexistent ID ("elme" instead of "elem"): the call will return null.

DOM: div with id: elem
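A minimal sketch of guarding against this kind of fault: check the result of `getElementById` before using it. The `fakeDocument` object is a stand-in written for this example so that it also runs outside a browser; `showElement` and its return strings are likewise hypothetical.

```javascript
// Minimal stand-in for the browser's document, containing one element with id "elem".
const fakeDocument = {
  elements: { elem: { id: "elem", style: {} } },
  getElementById(id) {
    // Like the real DOM API, return null when no element has the given ID.
    return Object.prototype.hasOwnProperty.call(this.elements, id)
      ? this.elements[id]
      : null;
  },
};

// Guarded lookup: report the missing ID instead of dereferencing null later.
function showElement(doc, id) {
  const el = doc.getElementById(id);
  if (el === null) {
    return "no element with id '" + id + "'";
  }
  el.style.display = "block";
  return "ok";
}

console.log(showElement(fakeDocument, "elem")); // ok
console.log(showElement(fakeDocument, "elme")); // no element with id 'elme'
```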

DOM-Related Fault: Example

    var elem, retrievedStr = [Retrieved via XHR];
    var dotsInStr = retrievedStr.split(".").length;
    if (dotsInStr == 0) {
      var prefix = "id_";
      elem = $("#" + prefix + retrievedStr);
    } else {
      elem = $(retrievedStr);
    }
    elem[0].focus()

Walk-through:

1. retrievedStr holds the string retrieved via XHR.

2. dotsInStr is meant to hold the number of dots in the string.

3. If there are no dots, prepend “id_” to the string and access it via $(). Otherwise, leave it as is, and access it via $().

4. A retrieved string of “editor” goes to the else branch even though it has no dots, which erroneously causes $() to use the selector “editor”, which doesn’t match any elements. elem[0] is then undefined, so elem[0].focus() throws an exception.

BUG: The assigned value should be retrievedStr.split(".").length - 1, since the array returned by split() always has length at least 1.
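The fix described above can be sketched without jQuery by isolating the selector logic. The helper names `countDots` and `selectorFor` are introduced only for this illustration; in the original code the resulting selector would be passed to `$()`.

```javascript
// Count "." characters: split() returns one more piece than there are separators,
// so the number of dots is the array length minus 1.
function countDots(str) {
  return str.split(".").length - 1;
}

// Build the selector the example intends: prefix plain names with "#id_",
// leave strings that contain dots unchanged.
function selectorFor(retrievedStr) {
  return countDots(retrievedStr) === 0 ? "#id_" + retrievedStr : retrievedStr;
}

console.log(selectorFor("editor"));     // #id_editor
console.log(selectorFor("div.editor")); // div.editor
```

Even with the fixed count, it may be worth checking that the selection is non-empty before calling `elem[0].focus()`, since a selector that matches nothing would still leave `elem[0]` undefined.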

Bug Report Study: Impact

• Impact types: based on [ICSE’11]
• Type 1 (lowest impact) to Type 5 (highest impact)
• 80% of highest impact faults are DOM-related

[Figure: number of bug reports (y-axis: 0 to 140) per impact type (Type 1 to Type 5), for all faults and for DOM-related faults only]

Bug Report Study: Fix Times

• Triage Time: Time it took to assign or comment on the bug
• Fix Time: Time it took to fix the bug since it was triaged

[Figure: average number of days (y-axis: 0 to 100) for triage time and fix time, shown for all faults, DOM-related faults only, and non-DOM-related faults only]
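The two measures can be computed from three timestamps per bug report. This sketch assumes that triage time is counted from the moment the bug was reported, which the slide does not state explicitly; the field names `reported`, `triaged`, and `fixed` and the dates are made up.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Difference between two ISO 8601 timestamps, in days.
function daysBetween(startIso, endIso) {
  return (Date.parse(endIso) - Date.parse(startIso)) / MS_PER_DAY;
}

// Hypothetical report shape: { reported, triaged, fixed }.
function bugTimes(report) {
  return {
    // Assumption: triage time runs from the report date to the first assignment or comment.
    triageDays: daysBetween(report.reported, report.triaged),
    // Fix time runs from triage to the fix, as defined on the slide.
    fixDays: daysBetween(report.triaged, report.fixed),
  };
}

const exampleTimes = bugTimes({
  reported: "2013-01-01T00:00:00Z",
  triaged: "2013-01-03T00:00:00Z",
  fixed: "2013-01-10T00:00:00Z",
});
console.log(exampleTimes); // { triageDays: 2, fixDays: 7 }
```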

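Bug Report Study: Browser Specificity

Most JavaScript faults are not browser-specific

[Figure: pie chart of browser specificity; 76% of faults are not browser-specific, and the remaining slices (17%, 2%, 1%, 1%, 1%) are specific to individual browsers, including IE, Chrome, Safari, and Opera]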

Bug Report Study: Summary

• Bug report study of 12 applications: JS faults
  – Over 300 bug reports analyzed; only fixed bugs considered

• DOM-related faults dominate JavaScript faults
  – Responsible for nearly two-thirds of all JavaScript faults
  – Responsible for 80% of highest impact faults
  – Take developers 50% longer to fix
  – Majority are not specific to a platform

• Need robust solutions for DOM-related faults
  – Fixing, understanding, and writing correct code

This Talk

• Motivation and Approach

• Three approaches for measuring JS reliability
  – Error Messages [ISSRE 2011], with F. Ocariza and B.G. Zorn
  – Bug Reports [ESEM 2013], with F. Ocariza, K. Bajaj, A. Mesbah
  – Stack Overflow Reports [MSR 2014], with K. Bajaj and A. Mesbah

• Conclusions and Next Steps

StackOverflow: Background

• Stack Overflow
  – Q&A website for programmers
  – Started in 2008
  – 4,125,638 questions asked from Jan’09 to Dec’12
  – 500,000+ questions related to web development

• Questions directly asked/answered by developers
  – Followed by discussion in comments

StackOverflow: Example

StackOverflow: NLP Analysis

• Filter web-related questions based on the tags provided
• Analyze the text provided in the questions and answers (Latent Dirichlet Allocation)

Pipeline: Data Collection, then Data Cleaning, then Data Processing
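The tag-based filtering step might look like the sketch below. The tag list and the question shape are illustrative assumptions; the slides do not list the tags the study actually used, and the topic-modeling step (Latent Dirichlet Allocation) is not shown.

```javascript
// Illustrative tag list only; the study's actual tag set is not given in the slides.
const WEB_TAGS = new Set(["javascript", "html5", "css"]);

// A question counts as web-related if at least one of its tags is in the list.
function isWebRelated(question) {
  return question.tags.some((tag) => WEB_TAGS.has(tag.toLowerCase()));
}

// Made-up questions for illustration.
const sampleQuestions = [
  { id: 1, tags: ["javascript", "dom"] },
  { id: 2, tags: ["java", "swing"] },
  { id: 3, tags: ["HTML5", "canvas"] },
];
console.log(sampleQuestions.filter(isWebRelated).map((q) => q.id)); // [ 1, 3 ]
```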

StackOverflow Datasets

RQ1: Categorization of topics of discussion

RQ2: Temporal trends over time

RQ3: Prevalence of web in mobile development

RQ4: Technical challenges

StackOverflow: Summary of Findings

• Finding 1: Though cross-browser issues dominated in the past, they have declined sharply since 2012.

• Finding 2: DOM and Canvas interactions consistently dominate.

• Finding 3: Mobile web application development is on the rise, compared to traditional web application development.

• Finding 4: Even expert programmers are confused by APIs and documentation.

StackOverflow: Implications

• Findings 1, 2 (Categorization and temporal trends)
  – Researchers can shift their focus away from cross-browser issues to DOM- and Canvas-related ones.

• Finding 3 (Prevalence of mobile applications)
  – Need better tools for mobile web development

• Finding 4 (Technical challenges)
  – Can guide standardization communities to focus on areas that need improvement.

This Talk

• Motivation and Approach

• Three approaches for measuring JS reliability
  – Error Messages [ISSRE 2011], with F. Ocariza and B.G. Zorn
  – Bug Reports [ESEM 2013], with F. Ocariza, K. Bajaj, A. Mesbah
  – Stack Overflow Reports [MSR 2014], with K. Bajaj and A. Mesbah

• Conclusion and Next Steps

Conclusions

• Web 2.0 applications’ reliability is challenging

• Measure the reliability of web applications
  – [ISSRE’11]: Based on error messages in real web apps
  – [ESEM’13]: Based on bug reports in web apps
  – [MSR’14]: Based on StackOverflow questions

• Need to improve web applications’ reliability
  – Use of empirical data to drive improvements


Next Steps (Part 2 of tutorial)

• AutoFlox (with Frolin Ocariza): ICST 2012 – Fault localization

• Vejovis (with Frolin Ocariza): ICSE 2014 – Fault repair

• Clematis (with Saba Alimadadi, Sheldon Sequeira): ICSE 2014 – Program understanding

• Dompletion (with Kartik Bajaj): ASE’14 – Code completion

AutoFlox [Ocariza – ICST 2012]

• AutoFlox: Automatic fault localization tool for JS
  – Find the origin of the null value, i.e., find the direct DOM access

    var toggle = 1;
    var x = "hlelo_";
    var y = "world";
    var elem = document.getElementById(x + y);  // direct DOM access: this is where the null value came from
    var dis = "";
    if (toggle == 1) {
      dis = "block";
    } else {
      dis = "inline";
    }
    elem.style.display = dis;

Vejovis [Ocariza – ICSE’14]

• Vejovis: automatic repair of DOM-related errors
  – Starts at the direct DOM access found by AutoFlox
  – Provides fix suggestions based on common fix patterns

    var toggle = 1;
    var x = "hlelo_";
    var y = "world";
    var elem = document.getElementById(x + y);  // parameter x + y: find "potential replacements" in the DOM
    var dis = "";
    if (toggle == 1) {
      dis = "block";
    } else {
      dis = "inline";
    }
    elem.style.display = dis;

[Figure: DOM tree of element IDs: id, with children id_m and id_r; id_r has children id_r_s and id_r_t]
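To give a feel for what looking up potential replacements can mean, the sketch below ranks the IDs present in a DOM by edit distance to the erroneous ID. This is not Vejovis's algorithm, which the slides do not describe in detail; plain Levenshtein distance is swapped in as a simple stand-in, and the candidate IDs (including "hello_world") are hypothetical.

```javascript
// Classic dynamic-programming edit distance (Levenshtein), kept to two rows.
function editDistance(a, b) {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Suggest existing IDs within maxDistance edits of the erroneous ID, closest first.
function suggestIds(badId, domIds, maxDistance = 2) {
  return domIds
    .map((id) => ({ id, distance: editDistance(badId, id) }))
    .filter((candidate) => candidate.distance <= maxDistance)
    .sort((x, y) => x.distance - y.distance)
    .map((candidate) => candidate.id);
}

// Hypothetical set of IDs present in the DOM.
console.log(suggestIds("hlelo_world", ["hello_world", "header", "footer"])); // [ 'hello_world' ]
```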

Clematis [Alimadadi – ICSE’14]

• Challenge: Web applications are complex, and consist of DOM interactions, AJAX messages, and timeouts
• Difficult to trace the links between events and JS code
• Clematis allows users to visualize causal dependencies between events and code, and the DOM

Dompletion [Bajaj – ASE’14]

• Provide code-completion suggestions for programmers for DOM-JavaScript interaction
  – Based on analysis of JavaScript code and the DOM

Karthik Pattabiraman, Ali Mesbah

Kartik Bajaj, Frolin Ocariza, Saba Alimadadi, Sheldon Sequeira