Titus Fortner - Crafting a Test Framework

Titus Fortner: Hello, everyone. My name is Titus Fortner and I'm excited to get to talk to you today about some things that I'm especially passionate about. Just a quick background about me: writing software is my third career. I started out in the Navy as a nuclear engineer on a ballistic missile submarine. I do not recommend that career path to anyone who enjoys sunlight or regular sleeping hours. When I left the military, I worked at a semiconductor manufacturing company for a few years. I could at least see the sky, but it still wasn't something I was especially excited about.

I decided to pursue something that I thought I could be especially excited about and looked into a career as a software engineer. As with many of us, I think I just happened into the testing focus. On my first day at my first job as a software developer at a small startup, our manager came to us and said, "Hey, I need someone to figure out this thing." I said, "Sure, I'll figure it out," and I've been pretty happy with that decision ever since. I've since worked at five different companies as a software engineer in test.

About a year ago, I joined Sauce Labs as a solution architect, which means I get to help various clients with their testing. One of my primary tasks is to provide automation framework assessments for our customers. This is all to say that I've seen many different approaches to frameworks from many different problem sets. I've gotten to try many different approaches and see what works, what doesn't work, and why. I should also say that in addition to the things that I've gotten paid for, I'm very active in the open source community. I've been a core contributor to the Selenium project for two and a half years now, primarily as the maintainer of the Ruby bindings. I've also been one of the lead developers on the project for the last three years, and I've written or contributed to at least a dozen other open source projects in the testing space.

In this presentation I'm going to talk about the important considerations in crafting your test framework. In the course of working with various colleagues and just being loud and opinionated in general, I found myself in many different disagreements on what constitutes a best practice in this industry. As part of discussing framework components, I'm going to bring up 10 issues that smart people disagree on and give you insight into the things you need to know to make the right decisions for your company.

Let me start by clarifying what I mean by test framework. I have a blog article on my website discussing overloaded or ambiguous terms that annoy me. Oftentimes, people use test framework to refer to just the test runner. I'm using it to mean the entire collection of things that are important for maintaining a large, reliable test suite. Speaking of large, maintainable test suites, let's get one thing clear up front with our first issue: record and playback tools, or specifically the Selenium IDE. Selenium IDE is billed as a way to easily record and playback tests using a Firefox extension. If you are listening to this talk, do not be fooled. This is not the tool you're looking for.

In the past year, the IDE has stopped working with the latest versions of Firefox. An effort is underway to create a new implementation of the Selenium IDE to provide a record and playback solution specifically supported by the Selenium Project. I'm frustrated by the time and energy this is garnering. The devs pushing for this argue that it is useful for quickly automating repetitive tasks or for quickly reproducing bugs to attach to bug reports. They further argue that it's the responsibility of the user to not abuse a given tool. I see this as somewhat disingenuous.

Essentially, the Selenium project is putting its name on only two products. The first is WebDriver, where we expect users to be skilled developers in order to use it. The second option is the IDE, where anyone who can install a plugin and manually interact with a site can use it. So which one are manual testers going to turn to when a manager tells them to go learn Selenium to automate their tests?

The IDE has been referred to as training wheels, with the idea that users can start using the IDE and then learn the skills that they need in order to write WebDriver tests. Unfortunately, I too often see these training wheels welded on pretty quickly. Further, the kinds of bicycles that you would use training wheels with are never going to be the ones that you use to compete. If you want an effective test suite, starting out with the Selenium IDE is almost never going to be a good idea.

My big desire in focusing on talks like these is to make it easier for people to get started writing maintainable test suites. If we consider writing straight WebDriver code as a difficulty of 10 and the IDE as a difficulty of one, how do we create a solution at a level four? How do we help people learn the basic tools that they need in order to be able to quickly start using something that works?

These are the seven components that make up a successful test framework. Let's walk through each of these and describe what they are and what their issues are. Wrappers and helpers: this is the important code. This is what makes writing tests easier, by providing a higher level interface, grouping actions that are always taken together, and handling potential before and after conditions. In Ruby, this is handled by open source add-on libraries like Watir or Capybara. As the maintainer of the Watir code, I could do an entire talk on Watir, but for this presentation I'll stick to just the highlights.

The most important thing for this, and really for any framework, is automatically waiting for elements to become visible or interactable, and automatically re-looking up stale elements. Most maintainable frameworks do this in some fashion, but often this code is repeated inconsistently throughout the tests instead of being abstracted to just one place. Watir provides higher level handling for frames, so users don't need to explicitly switch into or out of browsing contexts, and for interacting with windows by URL, title, or index. Elements get lazy loaded, so they can be defined before they are used. Elements are initialized by a specific element type, so their attributes and actions are encapsulated to be directly applicable. Watir also provides a bunch of extra ways to locate elements. You want the parent of a div with attribute x present, and with class name y but not class name z? It's easy to put together a hash that makes it obvious what is being located. Also, you can match anything by either string or regular expression. Finding unique elements when you don't control your application code, without resorting to hard to read CSS or XPath, is much easier.
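As a hedged sketch of the hash-locator idea (the attribute names here are invented for illustration, and the commented lines show how such hashes would be passed to Watir):

```ruby
# With Watir you describe an element with a hash instead of building a
# CSS or XPath string, and values can be strings or regular expressions:
#
#   browser.div(class: 'result', index: 2).parent
#   browser.link(href: /report\.pdf/)
#
# The hash itself makes the intent of the locator readable:
locator = { class: 'result', data_type: 'address', visible: true }

# Strings match exactly; regular expressions match partially:
title_matcher = /Crafting a Test/
puts locator[:class]                                    # "result"
puts title_matcher.match?('Crafting a Test Framework')  # true
```

The point is that the locator reads as a description of the element rather than as a query language expression.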

One of the main objections I've heard often from other Selenium core contributors is that the kind of abstractions found in Watir, Capybara, Selenide, and WebdriverIO, especially things like automatic waits and automatic re-looking up of elements, remove too much insight from what you get using Selenium commands directly. There is this notion that it is harder to write maintainable tests without understanding the mechanics of Selenium itself. I could not disagree more.

A common complaint I hear from experienced developers is that UI tests take too long to write, and from less experienced developers that it is too difficult. I think these are valid complaints, and we as a community need to work to address them. When testing a website, it is more important to know that the feature is working as intended than it is to understand how or why the site was implemented the way it was. I shouldn't need to know what the last element to load on a page is or what CSS transition needs to happen before an element gets displayed. The feature needs me to click an element. It doesn't make sense for it to be clicked before it's displayed. It doesn't make sense to click it if it isn't enabled yet.

Selenium takes the correct approach of being a "do what I say" tool. A good framework builds on conventions and expectations to allow your test code to do what you mean. It is more important that you can quickly and easily ensure your site's functionality is working than that the implementation details of the app are happening in some specific manner.

Data modeling is handling the data that represents objects in the UI. There are three approaches to using data. The first is Grab and Hope. The site has existing users with existing reservations. Grab the first one, hope it's in the state you need it to be in to test your feature, then run your test with it. Obviously, there are a lot of unknowns that can cause a test to fail. Whenever your test finds itself in an unrepeatable condition because you don't control the state it's in, you will have a hard time reproducing false failures.

Second is Fixtures. This is the one that many people push for. Typically, this refers to pre-populating a database with a bunch of seeded values. To use this for black box testing, users will maintain an ever growing spreadsheet with hard coded values of user information and data that corresponds to this hopefully pre-populated data. This is a losing proposition. It doesn't scale. It's too easy for one test to be written that changes the state of something else when run in the wrong order or at the same time. Again, false failures are really hard to reproduce. Oftentimes databases will get reset and the data doesn't get copied over in a way that it can be used.

The best option is what I call Just in Time. This is creating the data you need, when you need it and using it for exactly what you need. As much as possible, spend the extra time in a test to create the user on the spot. Create the reservation that you know will be in the status it needs to be in. Create the right address object. Maintenance time is much more costly than creation time or run time so find the optimum balance. It makes sense to put in a little extra time at the beginning to make sure that you have your tests as rock solid as possible.
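As a minimal sketch of the Just in Time idea (the `User` struct and `create_user` helper are invented for illustration, not from any particular gem):

```ruby
require 'securerandom'

# Hypothetical "just in time" helper: create exactly the object the test
# needs, at the moment it needs it, in the state it needs to be in.
User = Struct.new(:email, :password, :state, keyword_init: true)

def create_user(state: 'active')
  User.new(
    email: "user-#{SecureRandom.hex(4)}@example.com",  # unique every run
    password: SecureRandom.hex(8),
    state: state
  )
end

# The test controls the state instead of hoping existing data is right.
user = create_user(state: 'pending')
puts user.state  # "pending"
```

Because every run creates fresh, unique data, tests can't collide with each other over shared records.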

I'm going to start showing off some code examples. I have written a collection of Ruby gems to perform the various pieces of functionality that I'm talking about here. Here are the links to the repos. The Watir Install at the bottom is what pulls together all of the various bits of code that allow you to get the full use out of the Watir ecosystem. It's going to have a scaffolded setup to establish conventions for what pieces belong in what places. This is still a work in progress, and if anyone is interested in helping out with this, please let me know.

Many people specify all of their data every time they use it, making no contextual distinction for what it means. Usually, you don't care what the address is so long as you know that it will save properly. Here we model the data that comprises an address. The Faker class, a Ruby port of an old Perl library, will generate random default data for each of these types.

What if we do care about specific data? We need to make sure that the website prevents the user from shipping hand soap to Alaska, for instance, and there needs to be a way to specify what is desired in a contextually meaningful way. In the case of the Watir model here, you can create a YAML file with serialized information to store just the parts that are cared about. This code will look into that file, pull out the data stored there, and then create an object that includes it, but also random data for everything that isn't specified. For instance, you care that it's an Alaska address. You do not care any more than you did before about what the first name or the last name are.
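Here's a hedged sketch of that merge behavior using plain Ruby (the key and field names are hypothetical, and this stands in for what the model gem does, not its actual API):

```ruby
require 'yaml'
require 'securerandom'

# Values the test cares about come from serialized YAML;
# everything else gets a random default.
yaml_data = YAML.safe_load(<<~YML)
  alaska_address:
    state: AK
YML

random_defaults = {
  'first_name' => "First#{SecureRandom.hex(2)}",
  'last_name'  => "Last#{SecureRandom.hex(2)}",
  'state'      => 'TX'
}

# Specified values win; unspecified values stay random.
address = random_defaults.merge(yaml_data['alaska_address'])
puts address['state']  # "AK" -- the one thing this test cares about
```

The test reads as "an Alaska address," and nothing else about the address is pinned down.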

Configuration data is a special subset of data. I use Watir model for all data so that it lives in the same place. Unlike with other model types, though, configuration values often need to be set by an environment variable in your CI. For a config model class, you can overwrite the default data that's specified in the code with environment variables of the same name. Your defaults can be varied based on what setup your CI has or if you're running on your local desktop. It can also look into YAML files and provide context that way.
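A minimal sketch of that override behavior, with hypothetical keys (this is the general pattern, not the config model gem's actual API):

```ruby
# Defaults live in code and can be overridden by environment variables
# of the same name, e.g. set by CI.
class TestConfig
  DEFAULTS = {
    'browser'  => 'chrome',
    'base_url' => 'http://localhost:3000'
  }.freeze

  def self.[](key)
    # Look for an uppercased environment variable first, then fall back.
    ENV.fetch(key.upcase) { DEFAULTS.fetch(key) }
  end
end

ENV['BROWSER'] = 'firefox'   # simulate a CI override
puts TestConfig['browser']   # "firefox" -- the environment wins
puts TestConfig['base_url']  # "http://localhost:3000" -- default used
```

The same test code then runs unchanged on a laptop and in CI; only the environment differs.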


Initialization and cleanup: these are the things that load configurations, initialize the session, close the session, and report results. This is basic Minitest, JUnit, RSpec, TestNG, Karma tooling. There are a lot of different ways to manage these or abstract out the data from these, and I only want to focus on one issue here.

I see way too many test teams get excited about behavior driven development tools like Cucumber for all of the wrong reasons. Aslak, the author of Cucumber, wrote a post several years ago about how his tool is being misused. These tools are not testing tools. They're specifically intended to facilitate a process of collaboration. I like to say that Cucumber is for consultants. If your company is going through a process overhaul and wants to get everyone on the same page with a brand new process, become more agile, and improve how everything is working, then by all means bring in a consultant to set up a true BDD practice and get everyone invested in using a tool like Cucumber; that's an excellent approach. But if you're a group of testers hoping that by adopting this tool, all of a sudden your developers are going to pay attention to you, or your manager is now going to be able to start writing tests, you're going to find yourself sorely mistaken.

I've also heard it argued that Cucumber is good to ease people into becoming automation engineers. Start with writing out just the Given-When-Then feature files for other people to implement, and then slowly figure out how to write the step files that translate the feature files into code. I'm very against anything that sets up a system that is designed to have multiple classes of competence in the same group, and that's all that this approach leads to.

I've seen presentations this past year encouraging people to check the state of their browser to determine which path of a conditional to follow. I've had clients whose page objects are incredibly long because every single method has the same conditional in it. The scenario is that you use Selenium and Appium to run the same test on your browser that you do on your mobile app, except the locators are different. You define two different elements and have a method that looks for one and clicks the other if it can't find the first. But what happens when the page just hasn't finished loading the right thing and it can't find it? Your test is now trying to find the wrong element. Your test knows whether it is testing desktop or mobile at the beginning. There's no need to ask the browser each time. The Ruby solution here is subclassing. In Java, because of its strict typing, I solve this with factories. Essentially, your configuration is set for which one you're using, and then you use that to deterministically choose the correct option.
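The subclassing approach can be sketched like this (class names and locators are hypothetical):

```ruby
# Pick desktop vs mobile locators deterministically at configuration
# time via subclassing, rather than asking the browser at runtime
# which element it happens to be able to find.
class LoginPage
  def username_locator
    { id: 'username' }
  end
end

class MobileLoginPage < LoginPage
  def username_locator
    { id: 'mobile-username' }
  end
end

# Configuration decides once, up front, which class the test uses.
platform = ENV.fetch('PLATFORM', 'desktop')
page = platform == 'mobile' ? MobileLoginPage.new : LoginPage.new
puts page.username_locator
```

Every page object method then just uses `username_locator`, and the conditional exists in exactly one place: the configuration.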

Retries. I get into heated arguments about this one all of the time, and I've taken a very hard line stance on it. It's bad, so don't do it unless you have to, and you might have to. It's not best practice. There are real bugs that only pop up intermittently. There's a bad deploy which leads to a load balancer ultimately using the wrong thing. Making a reservation on the last day of the month doesn't work, and you miss it because when you rerun it, it randomizes a different date. It can be really hard to reproduce some errors that are real, but minimizing false failures is the most important thing.

It depends on the state of your current test suite. If you can't currently trust your results at all without using retries, you might need to retry, but don't be complacent about it. Realize that you're running your car with the check engine light on, and you need to have a plan in place to deal with it. You need to be logging your failures to evaluate which ones are consistently flaky, then flag them, pull them out of your test suite, and make it a priority to fix them instead of retrying.
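If you do retry, a hedged sketch of retrying with a failure log (the helper name and log shape are invented for illustration):

```ruby
# Retry, but record every failure so consistently flaky tests can be
# identified, pulled out of the suite, and fixed -- not silently hidden.
def with_retry(name, attempts: 2, failure_log: [])
  tries = 0
  begin
    tries += 1
    yield
  rescue StandardError => e
    failure_log << { test: name, attempt: tries, error: e.message }
    retry if tries < attempts
    raise  # out of attempts: the failure is real as far as we know
  end
end

log = []
calls = 0
with_retry('flaky spec', failure_log: log) do
  calls += 1
  raise 'intermittent failure' if calls == 1  # fails once, then passes
end
puts log.length  # 1 -- the first failure was recorded, not discarded
```

The log is what lets you distinguish "failed once ever" from "fails on every third run," which is the data you need to prioritize fixes.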

Another big issue is what your tests should look like. Imperative code describes step by step how to accomplish a task. Declarative code expresses the logic without describing its control flow. It's the difference between focusing on the what versus the how.

Here's an example of imperative code. It uses the browser directly. It specifies every action. It specifies every bit of data. Compare that to a declarative statement, which creates the exact data with contextual information about it. We discussed this earlier when talking about making sure that your data has meaning. In this case, we don't know that bobsmith123 with the password fido is an invalid credential until we read down and see that we're expecting an error from it. It makes more sense to know at the beginning what data we're using and what expectation we have based on that data. We create the data, we visit the page, and then we fill out the form with that data. Then we're expecting it to have an error. That's the business logic behind what it is.
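A hedged sketch of the contrast (the page and class names are invented, and the browser-driving lines are shown as comments since this is illustration, not the talk's actual slide code):

```ruby
# Imperative: every browser step and every value spelled out in the test:
#
#   browser.text_field(id: 'username').set 'bobsmith123'
#   browser.text_field(id: 'password').set 'fido'
#   browser.button(id: 'submit').click
#
# Declarative: the data announces its meaning up front, and the page
# object hides the how.
InvalidUser = Struct.new(:username, :password)

class LoginPage
  attr_reader :error

  def log_in(_user)
    # Stand-in for driving the browser and reading the error banner.
    @error = 'Invalid username or password'
  end
end

user = InvalidUser.new('bobsmith123', 'fido')  # intent: invalid credentials
page = LoginPage.new
page.log_in(user)
puts page.error
```

Naming the data `InvalidUser` means the expectation of an error is clear from the first line of the test, not the last.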

The whole reason we have page objects is to abstract away these implementation details from the business requirements. The goal here is that even without a BDD Gherkin keyword driven suite, your test files should be mostly understandable to non-developers, because all of the complicated code implementations and variables are stored somewhere else.

Another big issue I see with teams coming from a manual testing background is that they perceive testing as a series of user flows. It's one of the reasons I don't like the term end-to-end testing: to developers it means testing the full stack at the same time, and to testers it means starting at the beginning and going through everything the user is going to do until they complete the flow. I like to make a distinction between these two approaches as a user flow versus a DOM-to-database test, because that more accurately describes what an end-to-end test actually means.

The power of automation here isn't just that it can do the exact same thing a human can do, faster and more often. Automation frees you up to reimagine your entire approach. The goal is to make your tests atomic, autonomous, and short. Now imagine you are the person who gets fired if it's determined that the quality of a release is too poor. You're going to try to figure out your risk points and put together some strategy with a process in place to enforce it. There's no clear cut metric here. There's no percent coverage or number of bugs caught or hours of tests that ensures that no significant problems are going to happen. Whatever process or metrics you put out there, it still comes down to whether or not you have enough warm fuzzies about what you've done to ensure the quality of the releases.

When we're talking about changing the approach and changing the viewpoint on things, these approaches can be scary, so I try to present this in a way that decreases fear while still maintaining your warm fuzzies. To manually test a basic address book site, the most efficient process is going to step through these 19 actions one after another. I've seen test suites with a single session that runs for hours, essentially hundreds of tests all strung together. This is completely unmaintainable, because any failure at any point in the process is going to cause the test to fail, and you'll have to spend time looking into where in that huge long process it failed, figure out how to reproduce it, and see if it's actually a problem. In the meantime, you have no insight into whether or not the rest of the functionality is working. Say the log out button isn't working: you have no insight into the status of anything else in your application until it's fixed. You don't need to be able to log out in order to test the rest of these things in an automated fashion.

If each session were atomic and tested exactly one thing, you would know exactly what was broken just based on the test name. If we split these tests up into atomic chunks and make sure that they are autonomous, we can start to run them in parallel and decrease the execution time. Additionally, if we break them up into smaller pieces, we can now treat the pieces with different priority levels. Do you need to test your links the same way you do logging in or editing an object? A unit test of your routes is probably sufficient for that, or maybe you can use an HTTP client to get the link information without needing to load the browser in the first place.

These are the six features that are the most important. Verifying these things needs to be the first priority. As for testing the crucial features without increasing risk, you can leverage what I call transitive testing. Essentially, if you can verify that you can somehow get to the same state you are in at the end of one action, then there's no difference between going through the UI for everything and starting at that point for the next test. For instance, you don't need to log out and back in again if you can verify that when you've signed up you are effectively logged in, and you don't need to edit an object to get it into the correct state to delete it. This is a simple example of one basic set of CRUD actions.

Okay, page objects. I've referred to these several times in this presentation because they're super important. I'm only going to cover one aspect that I have disagreements with people about more often than anything else, and that's how they're coupled. Some people like to try to make English sentences from a combination of objects and methods: initialize the login page, log in the user, go to the new address, submit the form of that address, all in one line. For this to work, each method in the page object needs to return an instance of another page object. Even leaving aside null pointer concerns in some languages, this is just unnecessary. It's just as descriptive to split these into new lines and initialize the next object, rather than losing insight into what page you're on, assuming you care about what page you're on. Logging in from a shopping cart page will send you to a different page than logging in from a home page. It makes it harder to debug and maintain if you're using a pattern that only sometimes works.

Now, I don't actually use this particular pattern most of the time. I don't have a problem with front loading that coupling if you don't care how you get to a page, so long as you get to the page. You can set your goto method, your visit method, to know what page objects to use or what flow to use to get there. This has the added advantage that if you're going to leverage alternate means of logging in or inputting data, you can drastically speed up many tests at the same time by just changing the code in one place in one page object.
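A minimal sketch of that visit-method idea (class names are hypothetical, and the navigation step is stubbed since this is illustration only):

```ruby
# Instead of every method returning the next page object, each page
# object knows how to reach itself, and the test initializes the page
# it expects to be on.
class Page
  def self.visit
    new.tap(&:goto)
  end

  def goto
    # In a real suite: log in through an API, then browser.goto(url).
    @loaded = true
  end

  def loaded?
    !!@loaded
  end
end

class AddressListPage < Page; end

page = AddressListPage.visit  # one place to change how we get here
puts page.loaded?  # true
```

If you later switch from UI login to API login, only `goto` changes; every test that visits the page speeds up for free.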

The last thing I want to talk about is API usage. I just released a gem called UI to API that leverages using APIs specifically for UI testing. This code used to be a lot more complicated when I would write it; essentially, with this gem it's really easy. You set a base URL, you define an endpoint in a subclass, and the gem will either use a provided Watir model or generate a new one and send the appropriate JSON payload with a POST command to the endpoint. There's lots of flexibility to be able to set headers and add methods, and all of this allows you to avoid using the UI, or loading the browser in the first place, to initialize state and verify that your tests were successful. This is essential to being able to speed up your tests. With the example test suite for this address book CRUD, I was able to speed up the time on Sauce Labs to a fourth of what it was in serial once I was done implementing it with APIs.

Finally, we have leveraging APIs. Browsers are hard. Lots of time booting and parsing, lots of network things that could go wrong that have nothing to do with the code that you're actually testing. A lot of the time, especially when you have a JavaScript framework like React or Angular, the site's UI code and its service code are already independent and connected by an API. There doesn't have to be any difference between what the browser is sending to the API and what an HTTP or REST client sends to the API. This is exactly what I'm talking about as far as leveraging transitive testing. What the UI sends to the app can be validated by the API. What the API sends to the app can be validated in the UI, and both of those tests will be faster than going through the UI for everything.

I just released a UI to API gem to specifically leverage using these APIs with your tests. The happy path of this code used to be a lot more complicated. Now, you just set a base class and define an endpoint in a subclass. This gem will use or generate a new Watir model object and send the appropriate JSON with a POST command to the right endpoint. There's a lot of flexibility in this code for setting headers and adding methods.
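The shape of that pattern can be sketched like this (this is a hedged illustration of the idea, not the gem's actual API; the base URL and endpoint are assumptions, and the request is built but not sent):

```ruby
require 'json'
require 'uri'

# A base class holds the base URL, subclasses declare an endpoint, and
# model data is serialized to JSON for a POST.
class ApiBase
  class << self
    def base_url(url = nil)
      @base_url = url if url
      @base_url
    end

    def endpoint(path = nil)
      @endpoint = path if path
      @endpoint
    end

    def uri
      URI.join(base_url, endpoint)
    end

    def payload_for(model)
      JSON.generate(model)  # what would be POSTed to the endpoint
    end
  end
end

class Addresses < ApiBase
  base_url 'http://localhost:3000'  # assumed local app under test
  endpoint '/addresses'
end

puts Addresses.uri                                # the POST target
puts Addresses.payload_for(street: '123 Main St') # the JSON body
```

Creating an address this way takes milliseconds instead of the seconds a browser form submission costs, which is where the speedup comes from.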

To recap, a lot of these principles end up working together. You can't easily leverage an API if you're using data from fixtures, and switching to using an API for your tests makes it easier to transition to a just in time data approach. The most important thing that I discussed today is having the right helpers and wrappers in the right place to make it easier to write maintainable code. Figuring out how to get your site synchronized makes the biggest difference in flaky tests.

Data modeling: this is another one that I think is vastly underrated as far as making a difference in your test suite, being able to compare data that's going into and out of your application, whether that's through the UI or an API.

Configuration data is a special type of data that too often gets stored in the code itself. It really should live in its own place, so the group of code that makes up the configuration is really easy to change in one spot.

With page objects, remember this is where all of the implementation details go and be careful how you couple them.

APIs now, this is probably the most advanced item in the talk and I didn't spend much time on it but this one can really be a game changer for the reliability and performance of your test suite.

This is what I've been thinking about, working with clients on, and writing code for this past year. Please feel free to reach out to me with questions at any time. I'm on Twitter and most social media as Titus Fortner. My website is watirtight.com, and I'm also including here a link to get an automatic invitation to the Selenium Slack channel. The Slack channel is mirrored with the IRC channel, so you can participate in the communication either way. The Selenium community is very welcoming. Most of the core contributors spend time in that channel helping everyone from new users to advanced users with all kinds of questions. Thank you very much for your time, and I look forward to answering any questions you have.
