17th Int. Conf. on Acc. and Large Exp. Physics Control Systems ICALEPCS2019, New York, NY, USA JACoW Publishing ISBN: 978-3-95450-209-7 ISSN: 2226-0358 doi:10.18429/JACoW-ICALEPCS2019-WEPHA090

TESTING TOOLS FOR THE IBEX CONTROL SYSTEM T. Löhnert, F. A. Akeroyd, K. Baker, D. Keymer, A. J. Long, C. Moreton-Smith, D. Oram, ISIS Neutron and Muon Source, Didcot, UK J. R. Holt, T. A. Willemsen, K. J. Woods, Tessella, Abingdon, UK

Abstract behaviour of the component being tested, which is espe- cially helpful if the original author has since moved on. At the ISIS Neutron and Muon Source [1], we are cur- Finally, having a well-designed test framework with rently in the process of replacing the legacy SECI control strong simulation capabilities goes a long way to ensure system with the next generation IBEX control system [2] the surprises when testing with a real beamline are kept to based on EPICS [3]. Since IBEX replaces a fully func- a minimum. tional control program, users and developers need to have Robust testing practices raise the confidence of the de- confidence that the new system works as well as the old velopers that new code works as intended, and that exist- one, ensuring that the migration does not cause any undue ing code continues to work when subjected to changes. It disruption to beamline operations. Automated testing is an also helps raise the confidence of users, who are generally indispensable tool to continuously ensure our software is risk-averse that the new control system they are being up to the highest standard of quality. This paper gives an given performs its function satisfactorily. overview of the various types of testing tools we utilize at In the following sections, we will explore a selection of ISIS, with a particular focus on our IOC test framework relevant IBEX components, and the tools in place to en- [4]. This framework has been developed in-house for sure their continued functionality. testing drivers with simulated devices in order to circum- vent testing limitations due to beamline/device availabil- IBEX COMPONENTS ity. INTRODUCTION For the last 5 years, we in the Experiment Control team have been developing the next-generation instrument control system called IBEX for use on beamlines in the ISIS facility. The migration to IBEX is an ongoing pro- cess, with around half of our beamlines currently running on the new system. Being such a large, complex and long- running project make IBEX prone to failure if proper care

is not taken. Consider the following facts: 2019). Any distribution of this work must maintain attribution to the author(s), title of the work, publisher, and DOI.

• IBEX is a complex ecosystem consisting of many © different services. It is not always obvious what im- pact changes to one component will have on another Figure 1: Overview of (selected) IBEX components. • Over the course of its lifetime, over 30 developers IBEX covers a wide range of responsibilities related to have contributed to the IBEX project, many only for experiment control, from interacting with individual bits a limited duration and sometimes early in their soft- of beamline equipment such as temperature or pressure ware engineering careers controllers, choppers, jaws etc. to managing data collec- • The IBEX system is too complex even for permanent tion and writing the final experiment data file. To this staff to be familiar with every part of it at any one end, it comprises a number of technologies and services. time, let alone have in-depth knowledge of it Figure 1 shows a schematic of the core system. On the • Devices are often fixed on the beamline they are left side, we have the server comprised of various used on. Since beam time is highly valuable, it is of- backend components: ten difficult to get sufficient time to test a new device • A number of devices which are driven through EPICS driver before it is needed in production. This can Input-Output Controllers (IOCs), which expose rele- similarly apply to re-testing if changes/updates are vant values to be read/written to over the network using later made the EPICS Channel Access protocol. Automated testing helps meet all of the above challeng- • The Instrument Control Program (ICP), which deals es. A robust set of tests ensures any unforeseen changes in with the neutron data (collection, processing and writ- behaviour are caught before they are deployed to produc- ing to file) obtained from the Data Acquisition Elec- tion beamlines. Once written, robust tests benchmark any tronics (DAE) newly developed code to forever protect it from regres- • The Block Server, which is a python process that man- sion. Tests can also act as a form of documentation for ages the beamline configuration – i.e. which devices to unknown parts of the system (in addition to other forms control, which sample environment values to log etc. of documentation), as they demonstrate the expected

WEPHA090 Content from this work may be used under the terms of the CC BY 3.0 licence ( Software Technology Evolution 1295 17th Int. Conf. on Acc. and Large Exp. Physics Control Systems ICALEPCS2019, New York, NY, USA JACoW Publishing ISBN: 978-3-95450-209-7 ISSN: 2226-0358 doi:10.18429/JACoW-ICALEPCS2019-WEPHA090

On the right hand side, we have various frontend clients set of tests. As such, out of the components above, the that communicate with the server through which users can GUI client, Genie_Python, web dashboard and blockserv- interact with the system: er components all come with substantial unit test suites. Some older parts of the system (such as the ICP) are missing unit tests because they are rarely touched and we generally consider them stable, which means implement- ing unit tests has not become a sufficiently high priority. In the meantime, we use static code analysis tools to keep track of test coverage so that we remain aware of such gaps. For IOCs, we do not have what would strictly be con- sidered unit tests, since they require at least a device to test against. Devices can be arbitrarily complicated, thus mocking their behaviour is a bit more involved.

Figure 2: The IBEX GUI. DEVICE TESTING • The IBEX GUI – this is an /RCP desktop appli- A large part of the work we do is writing IOC drivers cation based on the CS Studio framework [5], and rep- for specific devices used on the beamlines. EPICS pro- resents the primary form of interaction (shown in Fig. vides meta-languages which make it easy to implement 2) device drivers, but these are bespoke and do not come • Genie_Python [6], a library of python commands used with any testing frameworks. Thus, we have developed an to interact with the control system via a command line in-house IOC testing framework which allows you to interface or through running user-written scripts write tests in python that test the behaviour of IOCs against the devices with which they communicate. • The web dashboard, a basic web interface, which pro- Devices are often fixed equipment, meaning we cannot vides a read-only view of the beamline status. confirm our drivers behave correctly without having ac- Note that there are many other services and features that cess to the beamline, which is not always easy or indeed are not included in this selection, for example those that possible. With smaller devices, we are sometimes able to manage device alarms or logging. However, this repre- test against the device in the office, but even so, we usual- sentative set of core components is useful for illustrating ly only have these available for a limited time and there- our use of various different forms of automated testing. fore they are unsuitable for use in a continually running UNIT TESTING automated test suite. Because of this, we simulate the behaviour on the de- Unit tests are a standard software engineering tool most vice side, which we can do at two different levels: The 2019).developers Any distribution of this work must maintain attribution to the author(s), title of the work, publisher, and DOI. should already be familiar with. They are used © EPICS record level, and the device level. to test the system “bottom up” by testing the behaviour of small pieces of code (e.g. individual methods) in isola- Record Simulation tion, and should complete in a matter of seconds or less. Record simulation is a concept that is built into the EP- Each unit test should be strongly focused and mock (sim- ICS framework [9]. Every one of our IOCs has a “simu- ulate) functionality outside of that which is being tested, late” field, which, when enabled, will use a set of aliased e.g. a test for a method parsing the content of a text file dummy simulation database records instead of the real should never fail on account of a file system error. ones. These simulated records bypass communication Most of the languages used in our code base come with with a real device and instead use purely virtual values unit testing frameworks (e.g. JUnit in Java, unittest in held in the IOC itself. While it is possible to link simula- Python). We are conscientious that the code we write is tion records logically (e.g. when setting a setpoint on a structured in a way that permits comprehensive testing. parameter, automatically update its readback value, too), We do this for example by using test driven development the EPICS database language is too restrictive to effec- [7], a coding approach where one first writes tests, and tively simulate complex device behaviour. Still, record then the code to make them pass, or by conventionally simulation is useful for confirming basic IOC behaviour using a Model-View-Viewmodel pattern [8] for GUI code, without the need to communicate with a device. which makes display logic easier to test by separating it from pure views. Device Simulation Furthermore, all our unit tests follow certain conven- Device emulators allow us to simulate arbitrarily com- tions in terms of test naming and structure. This is helpful plex devices. Most of our emulators are written using the for our developers when trying to understand a previously LeWIS package [10]. LeWIS lets you build complex unknown part of the system, and thus the tests themselves stateful device simulations in python, emulates a variety serve as a form of documentation. of protocols and provides backdoor access to these emula- It has been our policy for several years now to not de- tors to simulate external events such as moving into an ploy any new code that does not come with a reasonable error state after a sensor has been disconnected. Note that

Content fromWEPHA090 this work may be used under the terms of the CC BY 3.0 licence ( 1296 Software Technology Evolution 17th Int. Conf. on Acc. and Large Exp. Physics Control Systems ICALEPCS2019, New York, NY, USA JACoW Publishing ISBN: 978-3-95450-209-7 ISSN: 2226-0358 doi:10.18429/JACoW-ICALEPCS2019-WEPHA090

LeWIS does not depend on the IOC test framework, and This is useful as we sometimes have drivers for old neither does the test framework require emulators to be equipment under SECI for which no manuals exist, or written in LeWIS – for instance, we also test drivers for whose manufacturers do not operate any more. our new Beckhoff [11] motion controllers against the It should be said that there is no replacement for testing vendor-provided emulator. an IOC with the real device as unforeseen things can always happen (e.g. mistakes in a device manual on which we base emulator behaviour). However, the IOC test framework still lets us dramatically reduce the risk associated with using a new or changed device driver in a production environment. GUI/SYSTEM TESTING Another system component that requires testing is the of the main IBEX client. A large portion of the features in the GUI provide access to and Figure 3: Schematic of the interaction between emulators, rely on functionality found in the IBEX backend. There- IOCs and the test framework. fore, testing the GUI behaves in the correct way in most Test Case Example cases means testing the entire IBEX system behaves in the correct way, and thus GUI tests often double as sys- Following is an example of a test, which checks that tem tests. disconnecting the sensor of a temperature controller stops Testing the user interface requires input by a user (i.e. it ramping. Figure 3 shows the relationship between the mouse clicking, entering text/values). Thus, originally, components involved. much of our system testing was performed manually by 1. We start the IOC test suite for a temperature control- developers. Running through this set of 100+ tests for a ler. It automatically runs up the relevant device emu- new release proved to be by far the slowest part of de- lator and IOC as defined in the test class. In this case ploying a new version of IBEX. Because of this, today we we only run a single device but test classes may con- use an automated GUI testing tool called Squish [12]. It is tain multiple IOCs / emulators if necessary, e.g. if we worth noting that Squish is not free software, however want to test the interaction between them after surveying various available tools, we found it to be 2. Once it has confirmed the emulator and IOC have the best option for a number of reasons: successfully started up, the framework begins run- • Squish is easy to use in terms of its API and by provid- ning the test case. ing functionality to generate test cases from recorded 3. As part of the test case, it instructs the IOC to start user interaction ramping up the temperature via channel access • 4. It checks that the emulated device has gone into a Test cases can be written in python, a language most of 2019). Any distribution of this work must maintain attribution to the author(s), title of the work, publisher, and DOI. “ramping state” our developers are already familiar with © • 5. The test procedure now tells the emulator via the Squish is highly extensible as tests interact with the backdoor to act as if the sensor has been disconnect- application on the OS level, and are therefore inde- ed pendent of the language in which the UI is written 6. Finally, we assert with both LeWIS and the IOC that Automating these tests mean they can run frequently the device has stopped ramping. without requiring any intervention by the developers. This 7. Once completed, the test framework runs the test gives us confidence that functionality across the entire suite again in record simulation mode (if applicable). IBEX stack is working correctly at any given moment. We can specify for each test whether we want to run In addition to speeding up releases and saving develop- it in record or device simulation mode, or both. er time, this tool has helped us find bugs which require At the time of writing, we have emulators for around 60 testing that would not be feasible or at least extremely different devices. Depending on the device, some have tedious to perform manually, such as repeatedly going comprehensive test suites with dozens of test cases, while through a specific workflow to trigger a rare race condi- some may simply check that an IOC successfully boots tion or to investigate the source of a memory leak. and can talk to the emulator. One may think that once a We also have another set of system tests, which test the driver has been written and confirmed to work, there is no IBEX stack by executing Genie_Python commands in- need to repeatedly run these tests, however in reality they stead of interacting with it through the GUI. Both cases have frequently flagged up broken functionality, e.g. are again aided by simulation – since we do not have real when a library an IOC depends on has been changed. live neutron data available when running these tests, they Another use of the IOC test framework is to reverse en- are run with a simulated version of the data acquisition gineer drivers from the old SECI experiment control sys- component, which provides dummy data. tem. Starting with the old driver, we write an emulator that behaves as that driver expects. We can then write and validate an IOC against the emulator created in this way.

WEPHA090 Content from this work may be used under the terms of the CC BY 3.0 licence ( Software Technology Evolution 1297 17th Int. Conf. on Acc. and Large Exp. Physics Control Systems ICALEPCS2019, New York, NY, USA JACoW Publishing ISBN: 978-3-95450-209-7 ISSN: 2226-0358 doi:10.18429/JACoW-ICALEPCS2019-WEPHA090

CONTINUOUS INTEGRATION Further, we still have some system tests that are yet to be automated. This is due to limitations on available effort For all the virtues of automated tests, they are only use- and having to balance this work with other tasks, taking ful if they are run frequently. We have set up a continuous into account the risk and effort associated with running integration system using Jenkins [13], which runs our these tests manually only. However, we do occasionally various test suites on one of our dedicated build servers dedicate a day of development to reducing technical debt either stand-alone or in the case of unit tests, as part of the in a specific area of the project – automating the remain- build of the component they are testing. ing manual system tests is one potential subject for this

activity. Another arena in which we would like to expand testing capabilities is user scripts. These are scripts written using the Genie_Python library, not by the development team, but by facility users in order to automate their experi-

ments. Inevitably, mistakes sometimes creep into these Figure 4: The IBEX build status board. scripts, which can on occasion lead to lost beam time, e.g. The build for a component is run every time it is modi- an experiment is being run over night with the wrong fied, and additionally once every night. We have a status parameters. The avenue we are exploring to prevent such board for all the builds in the IBEX project (see Fig. 4), occurrences is by providing a “dry run” option for user which is displayed on a screen in our office so that we as scripts, which simulates the actions in the script (e.g. just a team may monitor it constantly. We also explicitly look printing them out instead of performing them), which at the board as part of our daily stand-up routine, to dis- only takes a few moments after which users can confirm cover when and why builds are breaking, so that we may their correctness before committing to running the actual fix the issues as soon as possible. script.

CONCLUSION Other Utilities To summarize, automated testing at various levels has Continuous integration is not limited to running con- held a multitude of benefits to the long-term health of the ventional tests on a codebase somewhere. For example, IBEX project: we use Jenkins to run a number of sanity checks on other • It helps reduce lost beam time as drivers are tested parts of our system, such as: before being used in production, letting developers re- • Configuration checks: The configuration checker runs a solve issues before they arise set of validity tests on beamline configurations, and has • It speeds up the release cycle as time spent on manual helped us find issues such as invalid IOC settings system testing is kept to a minimum. meaning the driver would not have run properly, or du- • 2019). Any distribution of this workplicate must maintain attribution to the author(s), title of the work, publisher, and DOI. items in configurations which could have result- Regularly running the full test suite increases confi- © ed in unpredictable behaviour dence in the system • • Repository checks: These make sure that our git reposi- It prevents regression as broken functionality becomes tories are properly sanitised, and have caught errors in visible immediately • the past where submodules were not properly linked, It helps document the code as tests demonstrate what meaning new code would have failed to be deployed behaviour is expected • • Wiki Checks: These tests help ensure our online docu- Testing methodology can be extended to sanitize other mentation is correct, e.g. by checking words are spelled parts of the project beyond purely code correctly or URLs contained within are valid REFERENCES FUTURE WORK [1] ISIS Pulsed Neutron and Muon Source, Developing robust tests is equally as important as de- http://www.isis.stfc.ac.uk veloping the IBEX codebase itself, and as such we are [2] K. V. L. Baker et al., “IBEX: Beamline Control at ISIS continuously striving to improve our testing infrastruc- Pulsed Neutron and Muon Source”, presented at the 17th ture. A lot of the processes and tools described in this Int. Conf. on Accelerator and Large Experimental Control Systems (ICALEPCS'19), New York, NY, USA, Oct. paper have been introduced over the course of the project 2019, paper MOCPL01, this conference. as a way to manage arising issues by improving maintain- [3] EPICS Control System Framework, ability and robustness. While new or changed code writ- ten for the project today is generally supported by solid http://www.epics-controls.org testing, coverage is still far from perfect, particularly in [4] IOC Test Framework, older parts of the system. We use tools to analyse code http://www.github.com/ISISComputingGroup/EPI coverage in order to identify parts of the system that re- CS-IOC_Test_Framework quire work in this regard. Improving our unit test cover- [5] Control System Studio, age is one of the most straight-forward ways we have to http://www.controlsystemstudio.org increase robustness of the whole system.

Content fromWEPHA090 this work may be used under the terms of the CC BY 3.0 licence ( 1298 Software Technology Evolution 17th Int. Conf. on Acc. and Large Exp. Physics Control Systems ICALEPCS2019, New York, NY, USA JACoW Publishing ISBN: 978-3-95450-209-7 ISSN: 2226-0358 doi:10.18429/JACoW-ICALEPCS2019-WEPHA090

[6] Genie_Python, http://www.github.com/ISISComputingGroup/gen ie_python [7] Beck, Kent. Test-driven development: by example. Addi- son-Wesley Professional, 2003. [8] Garofalo, Raffaele. “Building enterprise applications with Windows Presentation Foundation and the model view ViewModel Pattern”. Microsoft Press, 2011. [9] Wright, R. M., D. M. Kerstiens, G. D. Vaughn, and R. E. Weiss. “EPICS simulation tools for control system devel- opment.” No. LA-UR-94-2781; CONF-9408125-18. Los Alamos National Lab., NM (United States), 1994 [10] LeWIS – Let’s Write Intricate Simulators, http://www.github.com/ess-dmsc/lewis [11] Beckhoff, http://www.beckhoff.co.uk [12] Squish, http://www.froglogic.com/squish [13] Jenkins, http://www.jenkins.io

2019). Any distribution of this work must maintain attribution to the author(s), title of the work, publisher, and DOI. ©

WEPHA090 Content from this work may be used under the terms of the CC BY 3.0 licence ( Software Technology Evolution 1299