Automated Cross-Browser Compatibility Testing

Total Page:16

File Type:pdf, Size:1020Kb

Automated Cross-Browser Compatibility Testing Automated Cross-Browser Compatibility Testing ∗ Ali Mesbah Mukul R. Prasad Electrical and Computer Engineering Trusted Systems Innovation Group University of British Columbia Fujitsu Laboratories of America Vancouver, BC, Canada Sunnyvale, CA, USA [email protected] [email protected] ABSTRACT web browsers render web content somewhat differently [18, With the advent of Web 2.0 applications and new browsers, 24, 25, 26]. However, the scope and impact of this problem the cross-browser compatibility issue is becoming increas- has been rapidly growing due to two, fairly recent trends. ingly important. Although the problem is widely recognized First, modern, rich-content web applications have a heavy among web developers, no systematic approach to tackle client-side behavioral footprint, i.e., they are designed to ex- it exists today. None of the current tools, which provide ecute significant elements of their behavior exclusively on the screenshots or emulation environments, specifies any notion client-side, typically within the web browser. Further, tech- of cross-browser compatibility, much less check it automat- nologies such as Ajax [12], Flash, and event-handling for ically. In this paper, we pose the problem of cross-browser dynamic HTML, which support this thick-client behavior, compatibility testing of modern web applications as a `func- are the very aspects in which web browsers differ. tional consistency' check of web application behavior across Second, recent years have seen an explosion in the num- different web browsers and present an automated solution ber of available web browsers. There are nearly 100 different for it. Our approach consists of (1) automatically analyzing web browsers available today [31]. Coupled with the differ- the given web application under different browser environ- ent kinds and versions of operating systems in which these ments and capturing the behavior as a finite-state machine; operate, this yields several hundred different client-side en- (2) formally comparing the generated models for equivalence vironments in which a web application can be used, each on a pairwise-basis and exposing any observed discrepan- with a slightly different client-side rendering of the applica- cies. We validate our approach on several open-source and tion [7, 9]. In the context of the above problem, the key industrial case studies to demonstrate its effectiveness and contributions of this paper are: real-world relevance. • We define the cross-browser compatibility problem and present an in-depth discussion of the issues surround- Categories and Subject Descriptors ing possible solutions to it. D.2.5 [Software Engineering]: Testing and Debugging • We present a systematic, fully-automated solution for cross-browser compatibility testing that can expose a substantial fraction of the cross-browser issues in mod- General Terms ern dynamic web applications. Reliability, Verification • We validate our approach on several open-source as well as industrial case studies to demonstrate its effi- Keywords cacy and real-world relevance. Dynamic analysis, web testing, cross-browser compatibility The rest of the paper is organized as follows. In the next section we discuss the genesis of the cross-browser compati- bility problem and how it plays out in modern web applica- 1. INTRODUCTION tions. We present a formal characterization of the problem Web applications pervade all aspects of human activity to- and put the solutions offered by this paper in that context. day. The web browser continues to be the primary gateway Section 3 reviews related work on cross-browser testing. In channeling the end-user's interaction with the web applica- Section 4 we present our proposed approach for performing tion. It has been well known for some time that different cross-browser compatibility testing, followed by a descrip- tion of the salient implementation aspects of that approach, ∗This author was a visiting researcher at Fujitsu Laborato- ries of America, when this work was done. in Section 5. Section 6 presents an empirical evaluation of our approach on some open-source as well as industrial case studies. In Section 7 we discuss the strengths and weak- nesses of our solution and lessons learnt from our experience. We conclude the paper in Section 8. Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies 2. CROSS-BROWSER COMPATIBILITY bear this notice and the full citation on the first page. To copy otherwise, to Problem Genesis. The cross-browser compatibility republish, to post on servers or to redistribute to lists, requires prior specific (CBC) problem is almost as old as the web browser it- permission and/or a fee. ICSE ’11, May 21–28, 2011, Waikiki, Honolulu, HI, USA self. There are several reasons for its genesis and growth Copyright 2011 ACM 978-1-4503-0445-0/11/05 ...$10.00. in recent years. The earliest instances of this problem had A: DOM and/or Trace‐level B: Browser‐level Observable Differences Differences (a) Rendering in IE 8.0. D: Differences detected C: Ideal set of by ppproposed method differences to detect Figure 2: Browser incompatibility landscape. from the client-tier to the server tiers, the response received (b) Rendering in Firefox 3.5. and subsequent content served on the client, thus influenc- Figure 1: CBC issues in an industrial Web 2.0 ap- ing the overall evolution of the web application on the client plication: missing widgets in Firefox. side. Figure 1 shows screen-shots taken from a real en- terprise (proprietary) Web 2.0 application, illustrating this scenario. While the application renders correctly in Inter- its roots in the parallel evolution of different browsers and net Explorer, the Firefox rendering of it is missing the New the lack of standards for web content. Currently, standards User and Delete User widgets, marked with a red eclipse in do exist for basic web content (e.g., HTML 4.0) and for IE. Thus, the human user would be unable to exercise these scripting languages such as JavaScript, JScript, and Ac- features from this screen, thereby completely removing a tionScript (e.g., ECMA-262). However, in their bid to be very important set of behaviors, namely the application ad- inclusive and render websites that do not adhere to these ministrator being able to add and remove users, from the standards (e.g., legacy websites) web browsers typically im- evolution of the web application under Firefox.1 plement their own extensions to the web standards and in Problem Definition. Our working definition of the doing so differ in their behavior. The explosive growth in CBC problem, for the purposes of this paper, is based on the the number of browsers and client-side environments has Venn diagram of Figure 2. The set B depicts those cross- only exacerbated this issue. In addition, modern, dynamic browser differences that can be observed by the human user web applications execute an increasingly bigger fraction of on the web browser, while the set A represents those cross- their functionality in the client-tier i.e., within the end-user's browser differences that are reflected as differences in the web browser, thus further amplifying differences in observed client-side state of one or more screens (e.g., differences in behavior. Cross-browser compatibility is widely recognized the DOM representation or CSS properties of various DOM as an important issue among web developers [26] but hardly elements) or differences in the set of possible traces, i.e., ever addressed directly during the software development pro- alternating sequences of screens and user actions, on the cess. Typically web applications are developed with a single client-side. Further, there is the set of differences B − A target client-side configuration in view and manually tested which can be visually observed by the user but is not re- for a few more, as an after-thought. flected as a difference in the client-side state or in the evolu- Look-and-feel versus Functionality. The common tion of that state. These differences are typically, though not perception is that the CBC problem is confined to purely necessarily, stylistic and stem from different browsers' ren- `look-and-feel' differences on individual screens. While that dering of identical web content (e.g., CSS files). Although was indeed the case for the previous generation of web appli- such differences are important in their own right, they are cations, the problem plays out in a more systemic, functional not the focus of this paper. There is also the set A − B way in modern Web 2.0 applications. Thus, we believe that of differences, which exist in the DOM representation but the issue should be addressed as a generalized problem of do not result in any user-observable differences in behavior. checking functional consistency of web application behav- We argue that these differences are not relevant since the ior across web browsers, rather than purely a `look-and-feel' final metric of incompatibility is that which can be observed issue. This view is also held by several web developers [10]. by the end user. The set C = A \ B, which comprises the The following scenario illustrates how the problem plays differences reflected in both the user-observable behavior as out in modern web applications. The human end-user inter- well as the underlying client-side state, represents the most acts with the web application, through the browser, by view- interesting target for our technique. However, as explained ing dynamically served content and responding with user- below in Section 4, typically the set of differences detected events (e.g., clicks, mouse-overs, drag-and-drop actions) by our technique is a set D, where C ⊆ D ⊆ A.
Recommended publications
  • Opera Software the Best Browsing Experience on Any Device
    Opera Software The best browsing experience on any device The best Internet experience on any device Web Standards for the Future – Bruce Lawson, Opera.com • Web Evangelist, Opera • Tech lead, Law Society & Solicitors Regulation Authority (2004-8) • Author 2 books on Web Standards, edited 2 • Committee member for British Standards Institution (BSI) for the new standard for accessible websites • Member of Web Standards Project: Accessibility Task Force • Member of W3C Mobile Best Practices Working Group Web Standards for the Future – Bruce Lawson, Opera.com B.A., Honours English Literature and Language with Drama Theresa is blind But she can use the Web if made with standards The big picture WWW The big picture Western Western Web A web (pre)history • 1989 TBL proposes a project • 1992 <img> in Mosaic beta. Now 99.57% (MAMA) • 1994 W3C started at MIT • 1996 The Browser Wars • 1999 WAP, Web Content Accessibility Guidelines (WCAG) • 2000 Flash Modern web history • 2000-ish .com Crash - Time to grow up... • 2002 Opera Mobile with Small Screen Rendering • 2005 WHAT-WG founded, W3C Mobile Web Initiative starts • 2007 W3C adopts WHAT-WG spec as basis for HTML 5 • January 22, 2008 First public working draft of HTML 5 Standards at Opera • 25 employees work on standards • Mostly at W3C - a big player • Working on many standards • Bringing new work to W3C • Implementing Standards properly (us and you!) (Web Standards Curriculum www.opera.com/wsc) Why standards? The Web works everywhere - The Web is the platform • Good standards help developers: validate; separate content and presentation - means specialisation and maintainability.
    [Show full text]
  • Designing and Implementing the OP and OP2 Web Browsers
    Designing and Implementing the OP and OP2 Web Browsers CHRIS GRIER, SHUO TANG and SAMUEL T. KING, University of Illinois at Urbana-Champaign Current web browsers are plagued with vulnerabilities, providing hackers with easy access to computer systems via browser-based attacks. Browser security efforts that retrofit existing browsers have had lim- ited success because the design of modern browsers is fundamentally flawed. To enable more secure web browsing, we design and implement a new browser, called the OP web browser, that attempts to improve the state-of-the-art in browser security. We combine operating system design principles with formal methods to design a more secure web browser by drawing on the expertise of both communities. Our design philosophy is to partition the browser into smaller subsystems and make all communication between subsystems sim- ple and explicit. At the core of our design is a small browser kernel that manages the browser subsystems and interposes on all communications between them to enforce our new browser security features. To show the utility of our browser architecture, we design and implement three novel security features. First, we develop flexible security policies that allow us to include browser plugins within our security framework. Second, we use formal methods to prove useful security properties including user interface invariants and browser security policy. Third, we design and implement a browser-level information-flow tracking system to enable post-mortem analysis of browser-based attacks. In addition to presenting the OP browser architecture, we discuss the design and implementation of a second version of OP, OP2, that includes features from other secure web browser designs to improve on the overall security and performance of OP.
    [Show full text]
  • Protecting Browser State from Web Privacy Attacks
    Protecting Browser State from Web Privacy Attacks Collin Jackson Andrew Bortz Stanford University Stanford University [email protected] [email protected] Dan Boneh John C Mitchell Stanford University Stanford University [email protected] [email protected] ABSTRACT malicious attackers is critical for privacy and security, yet Through a variety of means, including a range of browser this task often falls by the wayside in the push for function- cache methods and inspecting the color of a visited hyper- ality. link, client-side browser state can be exploited to track users An important browser design decision dating back to Net- against their wishes. This tracking is possible because per- scape Navigator 2.0 [10] is the \same-origin" principle, which sistent, client-side browser state is not properly partitioned prohibits web sites from different domains from interacting on per-site basis in current browsers. We address this prob- with another except in very limited ways. This principle lem by refining the general notion of a \same-origin" policy enables cookies and JavaScript from sites of varying trust- and implementing two browser extensions that enforce this worthiness to silently coexist on the user's browser without policy on the browser cache and visited links. interfering with each other. It is the failure to apply an ap- We also analyze various degrees of cooperation between propriate adaptation of the same-origin principle to all per- sites to track users, and show that even if long-term browser sistent browser state that is the source of the most alarming state is properly partitioned, it is still possible for sites to web privacy leaks.
    [Show full text]
  • Cgi Environment Variables
    Appendix A CGI ENVIRONMENT VARIABLES The original specification for the Common Gateway Interface (CGI) established a list of environment variables, or aspects of the server and of the users connec- The complete CGI specification, http:// tion that would always be made known to CGI scripts. These variables remain a hoohoo.ncsa.uiuc.edu/cgi/ basic part of writing CGI scripts and have also been adopted in other server-side interface.html scripting languages, including server-side includes, ASP, and PHP. This list of the CGI environment variables below is annotated. SERVER_SOFTWARE The name and complete version information of the Web server software. Some server-side software adds itself to SERVER_SOFTWARE string. For example, Apache/1.3.26 (Unix) PHP/4.2.3. SERVER_NAME The servers domain name or IP address. GATEWAY_INTERFACE The version of the CGI specification in use. For example, CGI/1.1. CGI scripts are executable programs that can also be run as command-line programs on the server. The script can use the presence of a GATEWAY_INTERFACE environment variable to determine that it is being run as a CGI script. SERVER_PROTOCOL The communications protocol and version number used to request the script. For example, HTTP/1.1. SERVER_PORT Library Technology Reports The port number to which the request was sent. Sites running Web services on ports other than the default of 80 may want scripts to behave differently for users on different ports. REQUEST_METHOD The HTTP method used for the request. For example, GET or POST. A common use of this variable is have one script handle both the job of displaying a form to the user and of handling the forms submission.
    [Show full text]
  • Appendix a the Ten Commandments for Websites
    Appendix A The Ten Commandments for Websites Welcome to the appendixes! At this stage in your learning, you should have all the basic skills you require to build a high-quality website with insightful consideration given to aspects such as accessibility, search engine optimization, usability, and all the other concepts that web designers and developers think about on a daily basis. Hopefully with all the different elements covered in this book, you now have a solid understanding as to what goes into building a website (much more than code!). The main thing you should take from this book is that you don’t need to be an expert at everything but ensuring that you take the time to notice what’s out there and deciding what will best help your site are among the most important elements of the process. As you leave this book and go on to updating your website over time and perhaps learning new skills, always remember to be brave, take risks (through trial and error), and never feel that things are getting too hard. If you choose to learn skills that were only briefly mentioned in this book, like scripting, or to get involved in using content management systems and web software, go at a pace that you feel comfortable with. With that in mind, let’s go over the 10 most important messages I would personally recommend. After that, I’ll give you some useful resources like important websites for people learning to create for the Internet and handy software. Advice is something many professional designers and developers give out in spades after learning some harsh lessons from what their own bitter experiences.
    [Show full text]
  • FOSS Philosophy 6 the FOSS Development Method 7
    1 Published by the United Nations Development Programme’s Asia-Pacific Development Information Programme (UNDP-APDIP) Kuala Lumpur, Malaysia www.apdip.net Email: [email protected] © UNDP-APDIP 2004 The material in this book may be reproduced, republished and incorporated into further works provided acknowledgement is given to UNDP-APDIP. For full details on the license governing this publication, please see the relevant Annex. ISBN: 983-3094-00-7 Design, layout and cover illustrations by: Rezonanze www.rezonanze.com PREFACE 6 INTRODUCTION 6 What is Free/Open Source Software? 6 The FOSS philosophy 6 The FOSS development method 7 What is the history of FOSS? 8 A Brief History of Free/Open Source Software Movement 8 WHY FOSS? 10 Is FOSS free? 10 How large are the savings from FOSS? 10 Direct Cost Savings - An Example 11 What are the benefits of using FOSS? 12 Security 13 Reliability/Stability 14 Open standards and vendor independence 14 Reduced reliance on imports 15 Developing local software capacity 15 Piracy, IPR, and the WTO 16 Localization 16 What are the shortcomings of FOSS? 17 Lack of business applications 17 Interoperability with proprietary systems 17 Documentation and “polish” 18 FOSS SUCCESS STORIES 19 What are governments doing with FOSS? 19 Europe 19 Americas 20 Brazil 21 Asia Pacific 22 Other Regions 24 What are some successful FOSS projects? 25 BIND (DNS Server) 25 Apache (Web Server) 25 Sendmail (Email Server) 25 OpenSSH (Secure Network Administration Tool) 26 Open Office (Office Productivity Suite) 26 LINUX 27 What is Linux?
    [Show full text]
  • Cascading Style Sheet Web Tool
    CASCADING STYLE SHEET WEB TOOL _______________ A Thesis Presented to the Faculty of San Diego State University _______________ In Partial Fulfillment of the Requirements for the Degree Master of Science in Computer Science _______________ by Kalthoum Y. Adam Summer 2011 iii Copyright © 2011 by Kalthoum Y. Adam All Rights Reserved iv DEDICATION I dedicate this work to my parents who taught me not to give up on fulfilling my dreams. To my faithful husband for his continued support and motivation. To my sons who were my great inspiration. To all my family and friends for being there for me when I needed them most. v ABSTRACT OF THE THESIS Cascading Style Sheet Web Tool by Kalthoum Y. Adam Master of Science in Computer Science San Diego State University, 2011 Cascading Style Sheet (CSS) is a style language that separates the style of a web document from its content. It is used to customize the layout and control the appearance of web pages written by markup languages. CSS saves time while developing the web page by applying the same layout and style to all pages in the website. Furthermore, it makes the website easy to maintain by just editing one file. In this thesis, we developed a CSS web tool that is intended to web developers who will hand-code their HTML and CSS to have a complete control over the web page layout and style. The tool is a form wizard that helps developers through a user-friendly interface to create a website template with a valid CSS and XHTML code.
    [Show full text]
  • Website Development and Web Standards in the Ubiquitous World: Where Are We Going?
    WSEAS TRANSACTIONS on COMPUTERS Serena Pastore Website development and web standards in the ubiquitous world: Where are we going? SERENA PASTORE INAF-Astronomical Observatory of Padova National Institute for Astrophisics (INAF) Vicolo Osservatorio 5 – 35122 - Padova ITALY [email protected] Abstract: - A website is actually the first indispensable tool for information dissemination, but its development embraces various aspects: the choice of Content Management System (CMS) to implement production and the publishing process, the provision of relationships and interconnections with social networks and public web services, and the reuse of information from other sources. Meanwhile, the context where web content is displayed has dramatically changed with the advent of mobile devices that are very different in terms of size and features, which have led to a ubiquitous world made of mobile Internet-connected devices. Users’ behavior on mobile web devices means that developers need new strategies for web design. The paper analyzes technologies, methods, and solutions that should be adopted to provide a good user experience regardless of the devices used to visualize the content in its various forms (i.e., web pages, web applications, widgets, social applications, and so on), assuring that open standards are adhered to, and thus providing wider accessibility to users. The strategies for website development cover responsive technology vs. mobile apps, the interconnection with social networks and web services in the Google world, the adoption
    [Show full text]
  • Download Text, HTML, Or Images for Offline Use
    TALKING/SPEAKING/STALKING/STREAMING: ARTIST’S BROWSERS AND TACTICAL ENGAGEMENTS WITH THE EARLY WEB Colin Post A thesis submitted to the faculty at the University of North Carolina at Chapel Hill in partial fulfillment of the requirements for the degree of Master of Arts in Art History in the College of Arts & Sciences Chapel Hill 2019 Approved by: Cary Levine Victoria Rovine Christoph Brachmann © 2019 Colin Post ALL RIGHTS RESERVED ii ACKNOWLEDGMENTS Many people have supported me throughout the process of writing this thesis—support duly needed and graciously accepted as I worked on this thesis while also conducting research for a dissertation in Information Science and in the midst of welcoming my daughter, Annot Finkelstein, into the world. My wife, Rachel Finkelstein, has been steadfast throughout all of this, not least of which in the birth of our daughter, but also in her endless encouragement of my scholarship. All three of my readers, Cary Levine, Victoria Rovine, and Christoph Brachmann, deserve thanks for reading through several drafts and providing invaluable feedback. Cary has also admirably served as my advisor throughout the Art History degree program as well as a member of my dissertation committee. At times when it seemed difficult, if not impossible, to complete everything for both degrees, Cary calmly assured me that I could achieve these goals. Leading the spring 2019 thesis writing seminar, Dr. Rovine helped our whole cohort through to the successful completion of our theses, providing detailed and thoughtful comments on every draft. I also deeply appreciate the support from my peers in the course.
    [Show full text]
  • Designing Beneath the Surface of the Web Sarah Horton Dartmouth College 6224 Baker/Berry Hanover, NH 03755 USA 603 646-1087 [email protected]
    Designing Beneath the Surface of the Web Sarah Horton Dartmouth College 6224 Baker/Berry Hanover, NH 03755 USA 603 646-1087 [email protected] ABSTRACT clarity influence how well software can read and interpret the At its most basic, the web allows for two modes of access: visual source code. Nonvisual web access can be improved by applying and non-visual. For the most part, our design attention is focused the following guidelines for source code design. on making decisions that affect the visual, or surface, layer — Shneiderman defines universal usability as an approach to design colors and type, screen dimensions, fixed or flexible layouts. that is focused on “enabling all citizens to succeed in using However, much of the power of the technology lies beneath the information and communication technologies to support their surface, in the underlying code of the page. There, in the unseen tasks” [18]. A focus on page code design improves the universal depths of the page code, we make decisions that influence how usability of web pages by addressing access challenges in a well, or poorly, our pages are read and interpreted by software. In variety of contexts. For instance, the small viewport on mobile this paper, we shift our attention beneath the surface of the web devices presents many of the same challenges as nonvisual access. and focus on design decisions that affect nonvisual access to web This paper concludes with a discussion of how these guidelines pages. can be applied to improve web access for mobile users. Categories and Subject Descriptors 2.
    [Show full text]
  • NIST SP 800-28 Version 2 Guidelines on Active Content and Mobile
    Special Publication 800-28 Version 2 (Draft) Guidelines on Active Content and Mobile Code Recommendations of the National Institute of Standards and Technology Wayne A. Jansen Theodore Winograd Karen Scarfone NIST Special Publication 800-28 Guidelines on Active Content and Mobile Version 2 Code (Draft) Recommendations of the National Institute of Standards and Technology Wayne A. Jansen Theodore Winograd Karen Scarfone C O M P U T E R S E C U R I T Y Computer Security Division Information Technology Laboratory National Institute of Standards and Technology Gaithersburg, MD 20899-8930 March 2008 U.S. Department of Commerce Carlos M. Gutierrez, Secretary National Institute of Standards and Technology James M. Turner, Acting Director GUIDELINES ON ACTIVE CONTENT AND MOBILE CODE Reports on Computer Systems Technology The Information Technology Laboratory (ITL) at the National Institute of Standards and Technology (NIST) promotes the U.S. economy and public welfare by providing technical leadership for the nation’s measurement and standards infrastructure. ITL develops tests, test methods, reference data, proof of concept implementations, and technical analysis to advance the development and productive use of information technology. ITL’s responsibilities include the development of technical, physical, administrative, and management standards and guidelines for the cost-effective security and privacy of sensitive unclassified information in Federal computer systems. This Special Publication 800-series reports on ITL’s research, guidance, and outreach efforts in computer security and its collaborative activities with industry, government, and academic organizations. National Institute of Standards and Technology Special Publication 800-28 Version 2 Natl. Inst. Stand. Technol. Spec. Publ.
    [Show full text]
  • Fingerprinting Information in Javascript Implementations
    Fingerprinting Information in JavaScript Implementations Keaton Moweryy, Dillon Bogenreif*, Scott Yilek*, and Hovav Shachamy yDepartment of Computer Science and Engineering *Department of Computer and Information Sciences University of California, San Diego University of St. Thomas La Jolla, California, USA Saint Paul, Minnesota, USA ABSTRACT employ additional means to identify the user and the sys- To date, many attempts have been made to fingerprint users tem she is using. Hardware tokens are appropriate in certain on the web. These fingerprints allow browsing sessions to be settings, but their cost and inconvenience form a barrier to linked together and possibly even tied to a user's identity. ubiquitous use. For such sites, browser and user fingerprints They can be used constructively by sites to supplement tra- form an attractive alternative to hardware tokens. On the ditional means of user authentication such as passwords; and (reasonable) assumption that typical users will usually log they can be used destructively to counter attempts to stay in from a single machine (or, at most, a handful), JavaScript anonymous online. fingerprinting allows sites to harden passwords against ac- In this paper, we identify two new avenues for browser fin- count hijacking with a minimum of user inconvenience. gerprinting. The new fingerprints arise from the browser's Destructively, fingerprints can also be used to identify JavaScript execution characteristics, making them difficult and track users, even when those users wish to avoid be- to simulate or mitigate in practice. The first uses the innate ing tracked. The most familiar browser fingerprinting tech- performance signature of each browser's JavaScript engine, nology is the cookie, a client-side datastore partitioned by allowing the detection of browser version, operating system domain and path.
    [Show full text]