Issue August 2015 | presented by www.jaxenter.com #46

The digital magazine for enterprise developers

Microservices: Are they for everyone?

The future of the cloud: where it's at

Performance in an API-driven world
And why REST APIs are changing the game

The speed of Java 8 lambdas – and way more tips, tricks and tutorials

Editorial

Looking beyond the hype

The hype shows no sign of subsiding. Microservices, DevOps, Continuous Delivery – the latest trends in IT are truly changing how businesses innovate. And most of all, they are putting the programmer at the heart of good business strategies. But there's a flipside. The gravity of hype can pull many organisations towards concepts they simply aren't ready for.

So as you're watching the herd of enterprises flocking towards DevOpsian IT, it's good to stand back and have a think. Are microservices really for everyone? How is it that Etsy is making such strides in continuous delivery with a monolithic system? And are there really no bumps on the road to continuous delivery?

These are exactly the kinds of questions we'll be finding answers to at the JAX London conference in October. And with JAX London season just around the corner, we've asked a selection of our conference speakers to give us a sneak preview of what we'll be learning at the JAX London. From testing and smart benchmarking to microservices reality checks and continuous delivery tips, this is a special issue for anyone looking to update their IT approach. We're even going to learn what Parisian history can teach us about software architecture!

Coman Hamilton, Editor

Index

Microservices: Storm in a teacup, or teacups in a storm? 4
More hype, anyone?
Holly Cummins

A tale of two teams 6
Smoothing the continuous delivery path
Lyndsay Prewer

Fielding, Fowler and Haussmann 7
Network-based architectures: learning from Paris
Eric Horesnyi

Let's talk speed 10
Java performance tutorial – How fast are the Java 8 streams?
Angelika Langer

A world beyond Java 13
Coding for desktop and mobile with HTML5 and Java EE 7
Geertjan Wielenga

JEP 222 14
JShell, the Java 9 REPL: What does it do?
Werner Keil

MySQL is a great NoSQL 16
Making the right database decisions
Aviran Mordo

Business intelligence must evolve 17
Rethinking how we think about self-service cloud BI
Chris Neumann

The future of cloud computing 22
Database solutions
Zigmars Raascevskis

Testing the Database Layer 24
Dos and Don'ts
Colin Vipurs

Private cloud trends 27
Financial services PaaS and private clouds: Managing and monitoring disparate environments
Patricia Hines

Intelligent traffic management in the modern application ecosystem 29
The future of traffic management technology
Kris Beevers

Trade-offs in benchmarking 31
Cost, scope and focus
Aysylu Greenberg

Common threats to your VoIP system 32
Five tips to stay secure
Sheldon Smith

Why reusable REST APIs are changing the game 34
No more custom API mazes
Ben Busse

Considering the performance factor in an API-driven world 38
Milliseconds matter
Per Buer

www.JAXenter.com | August 2015 2

Hot or Not

Programming in Schwarzenegger quotes

Programming in C can be fun and all. But wouldn't you rather make your commands in the voice of the Terminator? Who wouldn't want to return a string with "I'll be back"? "Listen to me very carefully", "Hasta la vista, baby" and "Do it now" are clearly way more effective than boring old DeclareMethod, EndMethodDeclaration and CallMethod. And frankly, this language is simply far cooler than esoteric alternatives like brainfuck, Hodor-lang and Deadfish. It may not be the youngest language on the scene anymore, but old age certainly hasn't stopped Arnie from being a bad-ass Terminator.

Java without the Unsafe class

It's being referred to as a "disaster", "misery" and even the "Javapocalypse". Oracle has announced the projected removal of the private API used by almost every tool, infrastructure software and high-performance library built using Java. Java 9 is showing the door to the sun.misc.Unsafe class. As far as ideas go, this one doesn't sound so good – at least on paper. It's particularly the library developers that are annoyed by this change. Numerous libraries like Netty will collapse when Oracle pulls out this Jenga block from the Java release. But then again, as the name of the class suggests, it is "unsafe". Change is a bitch.
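For readers who have never touched the class in question: because Unsafe.getUnsafe() rejects callers outside the JDK, libraries like Netty typically grab the singleton reflectively and then use it for raw off-heap memory access. A minimal sketch of that widely used pattern (illustrative only, and exactly the kind of usage Oracle wants to retire – the likely replacement is VarHandles, JEP 193):

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

public class UnsafeDemo {
    // Reflectively read the private "theUnsafe" field, since the official
    // factory method is off-limits to application code.
    static Unsafe getUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        Unsafe unsafe = getUnsafe();
        // Raw off-heap allocation: no bounds checks, no GC, manual free.
        long address = unsafe.allocateMemory(8);
        unsafe.putLong(address, 42L);
        System.out.println(unsafe.getLong(address)); // prints 42
        unsafe.freeMemory(address);
    }
}
```

Forget to call freeMemory and the bytes leak; read past the allocation and the JVM may simply crash – which is why the class earned its name.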

The Stack Overflow trolls

Earlier this year, Google was forced to shut down Google Code, its version of GitHub, because the product had become overrun with trolls. Meanwhile, the popular IT Q&A site Stack Overflow has been losing dedicated members as a result of its toxic moderator behaviour, over-embellished points system and systemic hatred for newbies. However, some users are debating whether the Stack Overflow problem extends beyond a minority of trolls, and encompasses a fundamental hostility at the heart of the website's culture of asking questions. Should novice programmers be afraid of being laughed at when asking beginner questions?

Developer grief

We don't want developers to be sad. Or angry. Or decaffeinated. But conference speaker and developer Derick Bailey has shown us the rocky road that many devs go down when it comes to the human side of software development. The five stages of developer grief is the account of the journey that many programmers experience when writing code and marketing their product. Bailey explains how denial, anger, bargaining, depression and acceptance all leave their mark on initially energetic devs, which results in them feeling pretty e-motion sick from the ride. Feelings suck.

Oracle vs. Google

We know it's Summer. But that doesn't stop there from being quite a lot of not-so-hot things happening in IT right now, like the final decision in the Oracle vs. Google debacle. Most of the sensible parts of the software industry have been hoping for a reduction in the destructive copyright crackdown on Java usage. But with President Obama himself coming down on Oracle's side coupled with the US Supreme Court decision against a review of their ruling, there's little hope of a happy ending to the Android API saga.

Microservices

More hype, anyone?
Microservices: Storm in a teacup, or teacups in a storm?

Somehow, the buzz surrounding microservices has us believing that every single employee and enterprise must break up their monolith empires and follow the microservices trend. But it’s not everyone’s cup of tea, says JAX London speaker Holly Cummins.

by Holly Cummins

Folks, we have reached a new phase on the Microservices Hype Cycle. Discussion of the microservices hype has overtaken discussion of the actual microservices technology. We're all talking about microservices, and we're all talking about how we're all talking about microservices. This article is, of course, contributing to that cycle. Shall we call it the checkpoint of chatter?

Let's step back. I think we're all now agreed on some basic principles. Distributing congealed tea across lots of teacups doesn't make it any more drinkable; microservices are not a substitute for getting your codebase in order. Microservices aren't the right fit for everyone. On the other hand, microservices do encourage many good engineering practices, such as clean interfaces, loose coupling, and high cohesion. They also encourage practices that are a bit newer, but seem pretty sensible, such as scalability through statelessness and development quality through accountability ("you write it, you make it work in the field").

Many of these architectural practices are just good software engineering. You'll get a benefit from adopting them – but if you haven't already adopted them, will you be able to do that along with a shift to microservices? A big part of the microservices debate now centres on the best way to transition to microservices. Should it be a big bang, or a gradual peeling of services off the edge, or are microservices something which should be reserved for greenfield projects?

I'm part of the team that writes WebSphere Liberty. As a super-lightweight application server, we're more of an enabler of microservices than a direct consumer. However, we have experience of a similar internal transformation. We had a legacy codebase that was awesome in many ways, but it was pretty monolithic and pretty big. We needed to break it up, without breaking it. We knew we could become more modular by rebasing on OSGi services, a technology which shares many characteristics (and sometimes even a name) with microservices. OSGi services allow radical decoupling, but their dynamism can cause headaches for the unwary. What worked for us was writing a brand new kernel, and adapting our existing libraries to the new kernel one by one.

Thinking about failure is critical. Imagine a little microservice teacup, bobbing along in those rough network waters, with occasional hardware lightning strikes. Not only is failure a possibility, it's practically a certainty. Tolerance for failure needs to be built in at every level, and it needs to be exercised at every stage of testing. Don't get too attached to any particular service instance. This was one of our biggest lessons along the way. We made sure our design ensured that code had the framework support to consume services in a robust way, even though they were liable to appear and disappear. Along the way, we discovered that many of our tests, and some of our code, made assumptions about the order in which things happened, or the timing of services becoming available. These assumptions were inevitably proved wrong, usually at 1 am when we were trying to get a green build.
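The defensive habit described here – never getting attached to a service instance – can be sketched in plain Java. This is an illustrative toy, not IBM's kernel or the OSGi API; the ServiceRegistry and all names below are invented for the example:

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// A toy dynamic service registry: services may be registered and
// unregistered at any moment, just as OSGi services may come and go.
class ServiceRegistry {
    private final Map<String, Object> services = new ConcurrentHashMap<>();

    void register(String name, Object service) { services.put(name, service); }
    void unregister(String name) { services.remove(name); }

    // Consumers look the service up on every use instead of caching a
    // reference, and must cope with it being absent.
    <T> Optional<T> lookup(String name, Class<T> type) {
        Object s = services.get(name);
        return type.isInstance(s) ? Optional.of(type.cast(s)) : Optional.empty();
    }
}

class GreetingClient {
    private final ServiceRegistry registry;
    GreetingClient(ServiceRegistry registry) { this.registry = registry; }

    // Robust consumption: a graceful fallback instead of a
    // NullPointerException when the service has (temporarily) disappeared.
    String greet() {
        return registry.lookup("greeter", Supplier.class)
                       .map(s -> (String) s.get())
                       .orElse("service unavailable, degrading gracefully");
    }
}
```

Tests written against such a client can then exercise both worlds – service present and service gone – which is exactly the kind of ordering assumption that otherwise surfaces at 1 am.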


The human implications of microservices

Although we tend to talk about the technological implications of microservices, it's important to think about the human implications, too. Not everyone is comfortable coding to cope with services dropping in and out of existence, so you may find you end up jettisoning some people along with the monolith. Build in a period of adjustment and education, and remember to take time to develop new skills as well as new code. By the time most of our team shifted to service-oriented development, we had a beta which clearly demonstrated how well the new model worked. The coolness of where we were going was a good compensation for occasional 1 am head-scratching over unexpected behaviour.

What I find exciting about the microservices discussion is how it's making us think about architectural patterns, team organisation, fault tolerance, and the best way to write code and deliver services. That's got to be a good thing, even if microservices themselves don't end up being everyone's cup of tea.

"A big part of the microservices debate now centres on the best way to transition to microservices."

Microservices: From dream to reality in an hour
Hear Holly Cummins speak at the JAX London: Are microservices a wonder-pattern for rescuing intractably complex applications? Or are they just a restatement of the software engineering best practices we all should be following anyway? Or something in between? How do they work? How should they be written? What are the pitfalls? What are the underpinning technologies?

Holly Cummins is a senior software engineer developing enterprise middleware with IBM WebSphere, and a committer on the Apache Aries project. She is a co-author of Enterprise OSGi in Action and has spoken at Devoxx, JavaZone, The ServerSide Java Symposium, JAX London, GeeCon, and the Great Indian Developer Summit, as well as a number of user groups.

CD and CI

Smoothing the continuous delivery path
A tale of two teams

Continuous Delivery is gaining recognition as a best practice, but adopting it and iteratively improving it is challenging.

by Lyndsay Prewer

To paraphrase Wikipedia, Continuous Delivery is a software engineering approach that produces valuable software in short cycles and enables production releases to be made at any time. Continuous Delivery is gaining recognition as a best practice, but adopting it and iteratively improving it is challenging. Given the diversity of teams and architectures that do Continuous Delivery well, it's clear that there is no single, golden path.

This article explores how two very different teams successfully practiced and improved Continuous Delivery. Both teams were sizeable and mature in their use of agile and lean practices. One team chose microservices, Scala, MongoDB and Docker on a greenfield project. The other faced the constraints of a monolithic architecture, legacy code, .NET, MySQL and Windows.

Patterns for successful practice

From observing both teams, some common patterns were visible that contributed to their successful Continuous Delivery.

Continuous Integration that works: Continuous Integration (CI) is the foundation that enables Continuous Delivery. To be a truly solid foundation though, the CI system must maintain good health, which only happens if the team exercise it and care for it. Team members need to be integrating their changes regularly (multiple times per day) and responding promptly to red builds. The team should also be eliminating warnings and addressing long-running CI steps. These important behaviours ensure that release candidates can be created regularly, efficiently and quickly. Once this process starts taking hours instead of minutes, Continuous Delivery becomes a burden instead of an enabler.

Automated tests: Managing the complexity of software is extremely challenging. The right mix of automated tests helps address the risk present when changing a complex system, by identifying areas of high risk (e.g. lack of test coverage or broken tests) that need further investigation. When practicing automated testing, it's important to get the right distribution of unit, integration and end-to-end tests (the well documented "test pyramid"). Both teams I worked with moved towards a tear-drop distribution: a very small number of end-to-end tests, sitting on top of a high number of integration tests, with a moderate number of unit tests at the base. This provided the best balance between behavioural coverage and cost of change, which, in turn, allowed the risk present in a software increment to be more easily identified.

Low-cost deployment (and rollback): Once a release candidate has been produced by the CI system, and the team is happy with its level of risk, one or more deployments will take place, to a variety of environments (normally QA, Staging/Pre-Production, Production). When practicing Continuous Delivery, it's typical for these deployments to happen multiple times per week, if not per day. A key success factor is thus to minimise the time and effort of these deployments. The microservice team were able to reduce this overhead down to minutes, which enabled multiple deployments per day. The monolith team reduced it to hours, in order to achieve weekly deployments. Regardless of how frequently production deployments happen, the cost and impact of rolling back must be tiny (seconds), to minimise service downtime. This makes rolling back pain-free and not a "bad thing" to do.

Monitoring and alerting: No matter how much testing (manual or automated) a release candidate has, there is always a risk that something will break when it goes into Production. Both teams were able to monitor the impact of a release in near real-time using tools such as Elasticsearch, Kibana, Papertrail, Splunk and New Relic. Having such tools easily available is great, but they're next to useless unless people look at them, and they are coupled to automated alerting (such as PagerDuty). This required a culture of "caring about Production", so that the whole team (not just Operations, QA or Development) knew what "normal" looked like, and noticed when Production's vital signs took a turn for the worse.

Conclusion

This article has highlighted how different teams, with very different architectures, both successfully practiced Continuous Delivery. It's touched on some of the shared patterns that have enabled this. If you'd like to hear more about their Continuous Delivery journey, including the different blockers and accelerators they faced, and the ever-present impact of Conway's Law, I'll be speaking on this topic at JAX London on 13–14th October 2015.

Lyndsay Prewer is an Agile Delivery Lead, currently consulting for Equal Experts. He focuses on helping people, teams and products become even more awesome, through the application of agile, lean and systemic practices. A former rocket-scientist and software engineer, over the last two decades he's helped ten companies in two hemispheres improve their delivery.
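The coupling of monitoring to automated alerting that the article describes can be reduced to a small piece of decision logic. Neither team's tooling is public, so the HealthCheck interface, class name and thresholds below are invented for illustration:

```java
import java.util.List;

// A release is kept only while Production's vital signs stay normal;
// after too many consecutive failed checks, the previous version is
// restored immediately (the "rollback in seconds" goal).
class RollbackGuard {
    interface HealthCheck { boolean healthy(); }

    private final List<HealthCheck> checks;
    private final int maxFailures;        // tolerate transient blips
    private int consecutiveFailures = 0;

    RollbackGuard(List<HealthCheck> checks, int maxFailures) {
        this.checks = checks;
        this.maxFailures = maxFailures;
    }

    // Called by the monitoring loop after each scrape of the dashboards.
    // Returns true when the deployment should be rolled back.
    boolean shouldRollBack() {
        boolean allHealthy = checks.stream().allMatch(HealthCheck::healthy);
        consecutiveFailures = allHealthy ? 0 : consecutiveFailures + 1;
        return consecutiveFailures > maxFailures;
    }
}
```

The interesting design choice is the consecutive-failure counter: it encodes the "know what normal looks like" culture, so a single noisy metric scrape does not trigger a rollback, but a sustained deviation does.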

Architecture

Network-based architectures: learning from Paris
Fielding, Fowler and Haussmann

For all the great strides that IT is taking to bring us to better futures faster, it turns out that everything we need to know about the development of the web can be learned from the history of urban Paris.

by Eric Horesnyi

As developer or architect, we often have to communicate fine concepts of network or system architecture to "decision-makers". In my case, I have been using the smart city analogy for twenty years. And to celebrate the 25th birthday of the Web, I propose to draw an in-depth analogy between a designed city, Paris, and the Web. Going through Fielding's thesis, we will compare Paris to the Web in terms of constraints, inherited features and architectural style choices, and finally assess whether these choices meet the objectives. All this with a focus on a transformational period of Paris, 1853–1869 under Haussmann as Architect, with an approach worth applying to the many large corporate information systems looking to adopt a microservice style, as proposed by Fowler and Newman. Here are the first two episodes out of seven that we will cover during the session at the upcoming JAX London. Our audience is, by design, either software experts interested in city architecture to illustrate the beauty of HTTP REST and Continuous Delivery, or anybody wanting to understand the success of the Web, and get acquainted with key web lingo and concepts.

EPISODE I: Who was Haussmann, and what was his challenge for Paris?

Eugène Haussmann was a civil servant chosen by Napoleon III to accelerate the modernisation of Paris in 1853. He reigned over Paris architecture for sixteen years. When he took office, Paris UX was awful, probably worth less than a star on the Play or Apple store: high dropout rate (cholera struck every five years with 10,000 dead each time), servers (houses) were congested (up to 1 person per square metre – US translation: 100 people per townhouse floor), and datacentres would work only intermittently: no power (gas, water), no cooling (sewage). No cache (cellar), no proxy (concierge), no streaming (subway) … a UX nightmare.

[Image: Eugène Haussmann]
[Image: Congested servers – crowded apartments in Paris]
[Image: 19,000 deaths from cholera in Paris, 1832]
[Image: Pre-Haussmannian street in Paris – unsafe, ridden with viruses, no cooling]
[Image: "Liberty Leading the People, July 28th, 1830" by Delacroix]
[Image: First page of the Code Civil, 1804]

Outside was even worse: ridden with cybercrime (you get it) in obscure streets, slow with narrow access lines (streets) and non-existent backbones (boulevards), without a shared protocol for polling traffic (sidewalk for pedestrians) or streaming (subway), and no garbage collection (gotcha). Worse, when a user would go from one page to another, TLP was terrible because of these congested un-protocoled lines, but they would come home with viruses for lack of continuous delivery of patches to the servers (air circulation and sun down these narrow streets hidden by overly high buildings).

To top it off, service was fully interrupted and redesigned regularly (Revolutions in 1789, 1815, the three Glorieuses in 1830, 1848), without backward compatibility. Although users would benefit from

these changes in the long run, they definitely did not appreciate the long period of adaptation to the new UI, not to mention calls and escalations to a non-existent service desk (votes for the poor and women). Well, actually, these small access lines made it easy to build DDoS attacks (barricades), a feature the business people (Napoleon III) did not like.

EPISODE II: What properties did Haussmann select for his system?

Some elements of style that Haussmann ultimately selected were actually imposed upon him by his boss, Napoleon III, nephew of Napoleon. My intention here is definitely not to write an apology for Napoleon, who spread war for years across Europe, just to point to one feature he introduced in the process: the Code Civil. The Code Civil details the way people can interact together, the things they can and can't do. It came out of heated debates during the French Revolution. Its publication had an impact similar to the initial CERN paper on HTTP in 1990. The Code Civil rules citizens' lives the same way HTTP is the protocol governing our exchanges on the Web.

A fundamental principle in the Code Civil is the separation of concerns: it defines what a citizen is (Client), and separates its interests and interactions from companies, associations and the state (servers: .com, .org or .gov). Any mixing of interests is unlawful (misuse of social assets), pretty much like HTTP is by design Client-Server and Stateless. This also means that a server cannot treat two clients differently (equality), or two servers unequally (this links to the current debate on Net Neutrality):

• Client-Server: the most fundamental concept allowing for a network-based architecture: the system is considered as a set of services provided by servers (shops, public infrastructure services, associations) to clients (citizens). Once roles and possible interactions are defined, each can evolve independently: a citizen can grow from student to shop-owner, a grocery store to a juice provider. People living in a building are not forced to buy their meat from the store in the same building, and a store-owner may practice whatever price he wants. This principle of separation of concerns allows citizens the freedom to choose, and entrepreneurial spirit (the ability to create, invest, adapt) to service citizens (with food, entertainment, and social bonds through associations or culture).

• Stateless: no session state; a request from client to server must contain all necessary information for the server to process it. Whoever the citizen, the server must serve him without knowing more about him than about other citizens. No mixing of genres is allowed between client and server: all citizens are treated equal, and must be serviced the same way for the same request. Citizens cannot be bound to a single service by services creating dependencies or a local monopoly over their services. This separation of concerns from one building to another, and from one person to a company, is a foundation of citizen life even today (usually).

Another feature Haussmann had to take into consideration was the addressing scheme of Paris, defined in 1805, similar to our DNS scheme used for HTTP requests:

[Image: Napoleon III describing his mission to Haussmann, 1853]
[Image: Low entry-barrier for citizens – Montmartre, a popular neighbourhood in Paris]
[Image: Paris numbering scheme, example on a Haussmannian building]


• The main backbone, la Seine, defines the start of every street (our root)
• Each association (.org), state organization (.gov) and company (.com) can define its scheme within its own domain

Wanting to build on Code Civil/HTTP, Napoleon III's ego did not tolerate his capital city being less civilized than London or emerging New York. In terms of performance, his network-based architecture needed to do the job on:

• Network (streets) performance: throughput (fast pedestrians and carriages), small overhead (no need for a citizen to walk the street with a policeman to protect himself), bandwidth (wide streets)
• User-perceived performance: latency (the ability to quickly reach the ground floor of buildings), and completion (getting business done)
• Network efficiency: the best way to be efficient is to avoid using the street too much. Examples are homeworking (differential data) and news or music kiosks, avoiding music only in the Opera or getting news only from Le Monde headquarters (cache)

For Fielding, he would also select his architectural style against the following metrics:

1. Scalable: make it possible for Paris to grow
2. Simple: citizens, civil servants and visitors would need to understand the way the city worked without a user manual
3. Modifiable: the ability to evolve in the future through change
4. Extensible: add a new neighbourhood without impacting the system
5. Customizable: specialize a building without impacting others
6. Configurable: easily modify a building (component) after construction (post-deployment)
7. Reusable: a building hosting an accounting firm one day can serve as a creamery the next
8. Visible: to provide the best security and auditability of the system, interactions between components needed to be visible (people should see each other in the street)
9. Portable: the style should work well in other regions, with other materials and weather conditions
10. Reliable: resistant to failure (no single event could stop water, gas or circulation for citizens)

Looking into the challenges he wanted to address in Paris through his architectural style, Haussmann weighted each of these properties for his evaluation criteria. The main objectives appeared to be:

• Low entry-barrier: citizens are not forced to live in Paris, and Haussmann wanted to provide them the best possible UX to increase adoption. A citizen needed to be able to simply find an address, and a builder to publish a new reference, allowing for the creation of an initial system very quickly.
• Extensibility: a low entry-barrier would help create the modern community Haussmann wanted, and in the long term, Paris needed to be ready for changes in its style to adapt to new technologies.
• Distributed hypermedia: Paris needed to provide citizens with life experiences ranging from music (Opera and kiosks), films (actual theatres), ecommerce (food from Les Halles) and health (parks). All these experiences were rich in content and would attract many citizens, so much so that they needed to be distributed across the city.
• Anarchic scalability: once the first set of new neighbourhoods was in place, the city could grow in one direction or another, at a very large scale, without the need for centralized control (anarchy) to ensure the integrity of the entire system. This required each building to ensure its own authentication, and to be able to inspect incoming traffic through firewalls (double door, concierge).
• Independent deployment: each server (building) or application (neighbourhood) could be deployed independently from the others, without compromising the system. Legacy systems (older neighbourhoods that could/should not be changed, e.g. Notre Dame de Paris) needed to be easily encapsulated to interact with and be part of the entire system.

[Image: Paris supporting hypermedia, e.g. Pigalle, distributed in various neighbourhoods]
[Image: Paris architecture extended to new technologies, e.g. the Metro streaming network]

Fowler, Fielding, and Haussmann – Network-based Architectures
Hear Eric Horesnyi speak at the JAX London (Oct. 12–14, 2015). Why is Paris so beautiful, Netflix so scalable and REST now a standard? This is about analyzing the constraints leading to architectural styles in network-based software as well as buildings. Haussmann invented a scalable model for the city, Fielding established the principles of an internet-scale software architecture (REST), and Fowler described in detail how microservices can get an application to massively scale.

Eric Horesnyi was a founding team member at Internet Way (French B2B ISP, sold to UUNET) then Radianz (Global Finance Cloud, sold to BT). He is a High Frequency Trading infrastructure expert, passionate about Fintech and Cleantech. Eric looks after 3 bozons and has worked in San Francisco, NYC, Mexico and now Paris.
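The Client-Server and Stateless constraints that the episodes map onto the Code Civil can be shown in a few lines of code. This is an illustrative toy, not a real HTTP stack; the "bakery" and all names below are invented for the analogy:

```java
import java.util.HashMap;
import java.util.Map;

// A stateless "city service": the server keeps no per-client session.
// Every request carries all the information needed to process it, so the
// same request always gets the same answer, whoever the citizen is
// (equality), and any server instance could handle it (scalability).
class BakeryServer {
    private static final Map<String, Integer> PRICES = new HashMap<>();
    static {
        PRICES.put("baguette", 1);
        PRICES.put("croissant", 2);
    }

    // The request (item + quantity) is self-contained: no lookup of stored
    // client state, no session to expire or replicate.
    static int quote(String item, int quantity) {
        Integer unitPrice = PRICES.get(item);
        if (unitPrice == null) throw new IllegalArgumentException("404: " + item);
        return unitPrice * quantity;
    }
}
```

Because the handler holds no session, requests can be load-balanced across any number of such servers without sticky sessions – the property REST relies on for anarchic scalability.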

Java

Java performance tutorial – How fast are the Java 8 streams?
Let's talk speed

Java 8 brought with it a major change to the collection framework in the form of streams. But how well do they really perform?

by Angelika Langer

Java 8 came with a major addition to the JDK collection framework, namely the Stream API. Similar to collections, streams represent sequences of elements. Collections support operations such as add(), remove(), and contains() that work on a single element. Streams, in contrast, have bulk operations such as forEach(), filter(), map(), and reduce() that access all elements in a sequence. The notion of a Java stream is inspired by functional programming languages, where the corresponding abstraction is typically called a sequence, which also has filter-map-reduce operations. Due to this similarity, Java 8 – at least to some extent – permits a functional programming style in addition to the object-oriented paradigm that it supported all along.

Perhaps contrary to widespread belief, the designers of the Java programming language did not extend Java and its JDK to allow functional programming in Java or to turn Java into a hybrid "object-oriented and functional" programming language. The actual motivation for inventing streams for Java was performance or – more precisely – making parallelism more accessible to software developers (see Brian Goetz, State of the Lambda). This goal makes a lot of sense to me, considering the way in which hardware evolves. Our hardware has dozens of CPU cores today and will probably have hundreds some time in the future. In order to effectively utilize the hardware capabilities and thereby achieve state-of-the-art execution performance we must parallelize. After all – what is the point of running a single thread on a multicore platform? At the same time, multithreaded programming is considered hard and error-prone, and rightly so. Streams, which come in two flavours (as sequential and parallel streams), are designed to hide the complexity of running multiple threads. Parallel streams make it extremely easy to execute bulk operations in parallel – magically, effortlessly, and in a way that is accessible to every Java developer.

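The two flavours really do differ by a single method call. A quick self-contained illustration (the numbers and names are our own, not from Langer's benchmark):

```java
import java.util.stream.IntStream;

public class StreamFlavours {
    public static void main(String[] args) {
        // Sequential: one thread walks the range.
        int seqSum = IntStream.rangeClosed(1, 1_000).sum();

        // Parallel: the same pipeline, split across the common fork/join pool
        // by adding one call to parallel().
        int parSum = IntStream.rangeClosed(1, 1_000).parallel().sum();

        // Same result either way: parallelism changes the execution
        // strategy, not the semantics of the bulk operation.
        System.out.println(seqSum + " " + parSum); // 1000*1001/2 twice
    }
}
```

Whether the parallel version is actually faster is exactly the question the measurements that follow set out to answer.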

So, let's talk about performance. How fast are the Java 8 streams? A common expectation is that parallel execution of stream operations is faster than sequential execution with only a single thread. Is it true? Do streams improve performance?

In order to answer questions regarding performance we must measure, that is, run a micro-benchmark. Benchmarking is hard and error-prone, too. You need to perform a proper warm-up, watch out for all kinds of distorting effects from optimizations applied by the virtual machine's JIT compiler (dead code elimination being a notorious one) up to hardware optimizations (such as increasing one core's CPU frequency if the other cores are idle). In general, benchmark results must be taken with a grain of salt. Every benchmark is an experiment. Its results are context-dependent. Never trust benchmark figures that you haven't produced yourself in your context on your hardware. This said, let us experiment.

"Our hardware has dozens of CPU cores today and will probably have hundreds some time in the future."

Comparing streams to loops
First, we want to find out how a stream's bulk operation compares to a regular, traditional for-loop. Is it worth using streams in the first place (for performance reasons)? The sequence which we will use for the benchmark is an int-array filled with 500,000 random integral values. In this array we will search for the maximum value. Here is the traditional solution with a for-loop:

int[] a = ints;
int e = ints.length;
int m = Integer.MIN_VALUE;
for (int i = 0; i < e; i++)
  if (a[i] > m) m = a[i];

Here is the solution with a sequential IntStream:

int m = Arrays.stream(ints)
              .reduce(Integer.MIN_VALUE, Math::max);

We measured on outdated hardware (dual core, no dynamic overclocking) with proper warm-up and all it takes to produce halfway reliable benchmark figures. This was the result in that particular context:

int-array, for-loop   : 0.36 ms
int-array, seq. stream: 5.35 ms

The result is sobering: the good old for-loop is 15 times faster than the sequential stream. How disappointing! Years of development effort spent on building streams for Java 8 and then this? But, wait! Before we conclude that streams are abysmally slow let us see what happens if we replace the int-array by an ArrayList. Here is the for-loop:

int m = Integer.MIN_VALUE;
for (int i : myList)
  if (i > m) m = i;

Here is the stream-based solution:

int m = myList.stream()
              .reduce(Integer.MIN_VALUE, Math::max);

These are the results:

ArrayList, for-loop   : 6.55 ms
ArrayList, seq. stream: 8.33 ms

Again, the for-loop is faster than the sequential stream operation, but the difference on an ArrayList is not nearly as significant as it was on an array. Let's think about it. Why do the results differ that much? There are several aspects to consider.

First, access to array elements is very fast. It is an index-based memory access with no overhead whatsoever. In other words, it is plain down-to-the-metal memory access. Elements in a collection such as ArrayList on the other hand are accessed via an iterator, and the iterator inevitably adds overhead. Plus, there is the overhead of boxing and unboxing collection elements, whereas int-arrays use plain primitive type ints. Essentially, the measurements for the ArrayList are dominated by the iteration and boxing overhead, whereas the figures for the int-array illustrate the advantage of for-loops.

Secondly, had we seriously expected that streams would be faster than plain for-loops? Compilers have 40+ years of experience optimizing loops, and the virtual machine's JIT compiler is especially apt to optimize for-loops over arrays with an equal stride like the one in our benchmark. Streams on the other hand are a very recent addition to Java and the JIT compiler does not (yet) perform any particularly sophisticated optimizations to them.

Thirdly, we must keep in mind that we are not doing much with the sequence elements once we have got hold of them. We spend a lot of effort trying to get access to an element and then we don't do much with it. We just compare two integers, which after JIT compilation is barely more than one assembly instruction.

Workshop: Lambdas and Streams in Java 8
This JAX London workshop led by Angelika Langer is devoted to the stream framework, which is an extension to the JDK collection framework. Streams offer an easy way to parallelize bulk operations on sequences of elements. The Stream API differs from the classic collection API in many ways: It supports a fluent programming style and borrows elements from functional languages.

www.JAXenter.com | August 2015 11 Java

For this reason, our benchmarks illustrate the cost of element access – which need not necessarily be a typical situation. The performance figures change substantially if the functionality applied to each element in the sequence is CPU intensive. You will find that there is no measurable difference any more between for-loop and sequential stream if the functionality is heavily CPU bound.

The ultimate conclusion to draw from this benchmark experiment is NOT that streams are always slower than loops. Yes, streams are sometimes slower than loops, but they can also be equally fast; it depends on the circumstances. The point to take home is that sequential streams are no faster than loops. If you use sequential streams then you don't do it for performance reasons; you do it because you like the functional programming style.

"The point to take home is that sequential streams are no faster than loops."

So, where is the performance improvement streams were invented for? So far we have only compared loops to streams. How about parallelization? The point of streams is easy parallelization for better performance.

Comparing sequential streams to parallel streams
As a second experiment, we want to figure out how a sequential stream compares to a parallel stream performance-wise. Are parallel stream operations faster than sequential ones? We use the same int-array filled with 500,000 integral values. Here is the sequential stream operation:

int m = Arrays.stream(ints)
              .reduce(Integer.MIN_VALUE, Math::max);

This is the parallel stream operation:

int m = Arrays.stream(ints).parallel()
              .reduce(Integer.MIN_VALUE, Math::max);

Our expectation is that parallel execution should be faster than sequential execution. Since the measurements were made on a dual-core platform, parallel execution can be at most twice as fast as sequential execution. Ideally, the ratio of sequential to parallel performance should be 2.0. Naturally, parallel execution does introduce some overhead for splitting the problem, creating subtasks, running them in multiple threads, gathering their partial results, and producing the overall result. The ratio will be less than 2.0, but it should come close. These are the actual benchmark results:

           sequential  parallel  seq./par.
int-array  5.35 ms     3.35 ms   1.60

The reality check via our benchmark yields a ratio (sequential/parallel) of only 1.6 instead of 2.0, which illustrates the amount of overhead that is involved in going parallel and how (well or poorly) it is overcompensated (on this particular platform).

You might be tempted to generalise these figures and conclude that parallel streams are always faster than sequential streams, perhaps not twice as fast (on dual-core hardware), as one might hope for, but at least faster. However, this is not true. Again, there are numerous aspects that contribute to the performance of a parallel stream operation.

One of them is the splittability of the stream source. An array splits nicely; it just takes an index calculation to figure out the mid element and split the array into halves. There is no overhead and thus barely any cost of splitting. How easily do collections split compared to an array? What does it take to split a binary tree or a linked list? In certain situations you will observe vastly different performance results for different types of collections.

Another aspect is statefulness. Some stream operations maintain state. An example is the distinct() operation. It is an intermediate operation that eliminates duplicates from the input sequence, i.e., it returns an output sequence with distinct elements. In order to decide whether the next element is a duplicate or not, the operation must compare it to all elements it has already encountered. For this purpose it maintains some sort of data structure as its state. If you call distinct() on a parallel stream, its state will be accessed concurrently by multiple worker threads, which requires some form of coordination or synchronisation, which adds overhead, which slows down parallel execution, up to the extent that parallel execution may be significantly slower than sequential execution.

With this in mind it is fair to say that the performance model of streams is not a trivial one. Expecting that parallel stream operations are always faster than sequential stream operations is naive. The performance gain, if any, depends on numerous factors, some of which I briefly mentioned above. If you are familiar with the inner workings of streams you will be capable of coming up with an informed guess regarding the performance of a parallel stream operation. Yet, you need to benchmark a lot in order to find out for a given context whether going parallel is worth doing or not. There are indeed situations in which parallel execution is slower than sequential execution, and blindly using parallel streams in all cases can be downright counter-productive.

The realisation is: Yes, parallel stream operations are easy to use and often they run faster than sequential operations, but don't expect miracles. Also, don't guess; instead, benchmark a lot.

Angelika Langer works as a trainer and consultant with a course curriculum of Java and C++ seminars. She enjoys speaking at conferences, among them JavaOne, JAX, JFokus, JavaZone and many more. She is author of the online "Java Generics FAQs" and a "Lambda Tutorial & Reference" at www.AngelikaLanger.com.
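To recap the second experiment in runnable form — a sketch whose timings will differ on every machine, which is exactly why the article insists on benchmarking yourself:

```java
import java.util.Arrays;
import java.util.Random;

public class ParallelComparison {
    public static void main(String[] args) {
        int[] ints = new Random(42).ints(500_000).toArray();

        // sequential reduction over the int-array
        int seq = Arrays.stream(ints)
                        .reduce(Integer.MIN_VALUE, Math::max);

        // the same reduction, split across the common fork/join pool
        int par = Arrays.stream(ints).parallel()
                        .reduce(Integer.MIN_VALUE, Math::max);

        // parallelism changes the schedule, never the result:
        // max is associative, so partial results combine cleanly
        System.out.println(seq == par); // prints "true"
    }
}
```

Note that reduce only parallelizes safely because Math::max is associative and Integer.MIN_VALUE is its identity; an operation without those properties would give wrong answers in parallel, not just different timings.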


Coding for desktop and mobile with HTML5 and Java EE 7 A world beyond Java

Full-blown applications programmed for your browser – that’s where it’s at, right now, says JAX Lon- don speaker Geertjan Wielenga. And this should be of some concern to Java developers out there.

by Geertjan Wielenga

We can no longer make assumptions about where and how the applications we develop will be used. Where originally HTML, CSS and JavaScript were primarily focused on presenting documents in a nice and friendly way, the utility of the browser has exploded beyond what could ever have been imagined. And, no, it's not all about multimedia – i.e., no, it's not all about video and audio and the like. It's all about full-blown applications that can now be programmed for the browser. Why the browser? Because the browser is everywhere: on your mobile device, on your tablet, on your laptop, and on your desktop computer.

Seen from the perspective of the Java ecosystem, this development is a bit of a blow. All along, we thought the JVM would be victorious, i.e., we thought the "write once, run anywhere" mantra would be exclusively something that we as Java developers could claim to be our terrain. To various extents, of course, that's still true, especially if you see Android as Java for mobile. Then you could make the argument that on all devices, some semblance of Java is present. The arguments you'd need to make would be slightly complicated by the fact that most of your users don't actually have Java installed – i.e., they physically need to do so, or your application needs to somehow physically install Java on your user's device. Whether you're a Java enthusiast or not, you need to admit that the reach of the browser is far broader and more intuitively present than Java, at this point.

So, how do we deal with this reality? How can you make sure that your next application supports all these different devices, which each have their own specificities and eccentricities? On the simplest level, each device has its own screen size. On a more complex level, not every device needs to enable interaction with your application in the same way. Some of those devices have more problems with battery life than others. Responsive design via CSS may not be enough, simply because CSS hides DOM elements. It does not prevent the loading of resources, meaning that the heavy map technology that you intend for the tablet user is going to be downloaded all the same for the mobile user, even though it will not be shown, thanks to CSS.

Did you know?
Did you know there's something called "responsive JavaScript", which is much more powerful than "responsive CSS"? Did you know that there are a number of techniques you can use when creating enterprise-scale JavaScript applications, including modularity via RequireJS? Did you know that AngularJS is not the only answer when it comes to JavaScript application frameworks?

And finally, are you aware of the meaningful roles that Java, especially Java EE, can continue to play in the brave new old world of JavaScript? These questions and concerns will be addressed during my session at JAX London, via a range of small code snippets and examples, i.e., you will certainly see as much code and technical tips and tricks as you will see slides. Undoubtedly, you will leave the session with a lot of new insights and questions to consider when starting your next enterprise-scale applications, whether in Java or in JavaScript!

Geertjan Wielenga is Developer and author at Sun Microsystems and Oracle, working on NetBeans IDE and the NetBeans Platform, speaker at JavaOne, Devoxx, JAX London and other international software development conferences, Java and JavaScript enthusiast, JavaOne Rock Star.


JShell, the Java 9 REPL: What does it do? JEP 222

Among the few truly new features coming in Java 9 (alongside Project Jigsaw’s modular- ity) is a Java Shell that has recently been confirmed. Java Executive Committee member Werner Keil explains how Java’s new REPL got started and what it’s good for.

by Werner Keil

As proposed in OpenJDK JEP 222 [1], the JShell offers a REPL (Read-Eval-Print Loop) to evaluate declarations, statements and expressions of the Java language, together with an API allowing other applications to leverage its functionality. The idea is not exactly new. BeanShell [2] has existed for over 15 years now, nearly as long as Java itself, not to mention many scripting languages on the JVM – Scala and Groovy among them – already featuring similar shells.

BeanShell (just like Groovy, by the way) made an attempt at standardisation by the Java Community Process [3] in JSR 274 – a JSR that did not produce any notable output, in spite of the fact that (or perhaps because?) two major companies, Sun and Google, had joined the expert group. Under the JCP.next initiative this JSR was declared "Dormant".

An eyebrow-raising approach
Adding a new Java feature like this via JEP, rather than waking up the "Dormant" JSR (which anyone could, including Oracle, who now owns former EG member Sun), raised some eyebrows among JCP EC members. One concern was that after the JCP had just merged its ME and SE/EE parts into a single body, developing more and more platform features not as JSRs but JEPs under the OpenJDK would create another rift between ME/SE (JDK) and EE, where most remaining JSRs then resided.

Device I/O [4], derived from an Oracle proprietary predecessor under Java ME, was already developed as an OpenJDK project. Without a JEP, it seems Oracle at least can also ratify such projects without prior proposal. The farce around JSR 310, which neither produced an actual Spec document (mandatory for pretty much all JSRs), nor (according to Co-Spec Lead Stephen Colebourne) comes with a real API similar to other SE platform JSRs like Collections, was another example of where the JSR should have been withdrawn or declared dormant when the JEP started.

Figure 1: JShell arithmetic


It was just meant to rubber-stamp some JDK part by the EC, without the actual result of the JSR outside of the OpenJDK. Every class has some Javadoc, so that doesn't really count. Given Oracle's strong involvement we are likely to see more JEPs under the OpenJDK. And having a transparent open-source effort behind these parts of the Java ecosystem is still better than a closed environment, so even if it may disenfranchise and weaken some of the JCP (and EC), it is better than no open development at all.

"The value for Java ME remains to be seen, especially if down-scaling like Device I/O is even possible."

Potential uses of the JShell
Having such a shell in Java is certainly not a bad idea. Regardless of its development under Java SE, future versions of Java EE may find a standard shell even more appealing than Java SE. The value for Java ME remains to be seen, especially if down-scaling like Device I/O is even possible. But at the very least, IoT devices running Java SE Embedded should clearly benefit.

Windows PowerShell [5] has become a strong preference for system administration or DevOps, at least on Windows and .NET. This is used by its Play Framework for administrative tasks, while Groovy is used for similar purposes by the Spring Framework, or under the hood of the JBoss Admin Shell [6]. Meanwhile, WebLogic Scripting Tool (WLST) emerged from Jython, a Python shell on the JVM. Java EE Reference Implementation GlassFish has an admin shell called asadmin. Being able to tap into a unified Java shell in future versions could certainly make life easier for many Java-based projects, as well as products, developers and ops using them.

Other interesting fields of use are domain-specific extensions. Groovy, Scala or other shell-enabled languages (both on the JVM and outside of it) are very popular for business or scientific calculations. Based on early impressions with JShell [7], messages like "assigned to temporary variable $3 of type int" can be quite misleading (Figure 1). In particular the financial domain tends to think of US dollars when they read "$", so that still has room for improvement. But almost natural-language queries – such as Google answering questions like "what is 2 plus 2", or a pretty NoSQL DB of its time like Q&A [8] offering such features ten years before the Java language even started – have great potential. Instead of simply asking "2+2" questions, users may ask what the temperature in their living room is, when backed by a Smart Home solution. Or, using JSRs like 354, the recently finished Money API [9], questions like "2$ in CHF" or similar would make great sense too. That's where temporary variables quoting $ amounts would be a bit confusing, but maybe the JDK team behind JShell finds other ways to phrase that.

Another great example of a Java-powered REPL and expression language for scientific and other arithmetic challenges is Frink [10], named after the weird scientist character in The Simpsons TV series. It answers all sorts of questions, starting from date/time or time zone conversions (which java.time aka JSR 310 could certainly be used for, too) or currency conversions like:

"600 baht -> USD"

Frink provides many more mathematical and physical formulas, including unit conversion. Based on JSR 363, the upcoming Java Units of Measurement standard [11], this will be possible in a similar way. With Groovy, co-founder Guillaume Laforge documented a DSL/REPL for Units of Measurements using JSR 275 a while back [12]. Their solution was used in real-life medical research for Malaria treatments. Of course, being written in Java, someone might also simply expose the actual Frink language and system via JShell under Java 9!

Werner Keil is an Agile Coach, Java EE and IoT/Embedded/Real Time expert. Helping Global 500 enterprises across industries and leading IT vendors, he has worked for over 25 years as Program Manager, Coach, SW architect and consultant for Finance, Mobile, Media, Transport and Public sectors. Werner is an Eclipse and Apache Committer and JCP member in JSRs like 333 (JCR), 342 (Java EE 7), 354 (Money), 358/364 (JCP.next), Java ME 8, 362 (Portlet 3), 363 (Units, also Spec Lead), 365 (CDI 2), 375 (Java EE Security) and in the Executive Committee.

References
[1] http://openjdk.java.net/jeps/222
[2] http://www.beanshell.org/
[3] http://jcp.org
[4] http://openjdk.java.net/projects/dio/
[5] https://en.wikipedia.org/wiki/Windows_PowerShell
[6] http://teiid.jboss.org/tools/adminshell/
[7] http://blog.takipi.com/java-9-early-access-a-hands-on-session-with-jshell-the-java-repl/
[8] https://en.wikipedia.org/wiki/Q%26A_(Symantec)
[9] http://www.javamoney.org
[10] https://futureboy.us/frinkdocs/
[11] http://unitsofmeasurement.github.io/
[12] https://dzone.com/articles/domain-specific-language-unit-
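The API mentioned at the top of the article – the one allowing other applications to leverage JShell's functionality – eventually shipped as the jdk.jshell package. A minimal embedding might look like the following sketch; it requires a JDK 9 or later runtime, and the snippet text is our own example:

```java
import java.util.List;
import jdk.jshell.JShell;
import jdk.jshell.SnippetEvent;

public class EmbeddedJShell {
    public static void main(String[] args) {
        // JShell is AutoCloseable; create() spins up an evaluation engine
        try (JShell shell = JShell.create()) {
            // evaluate a snippet exactly as a user would type it at the prompt
            List<SnippetEvent> events = shell.eval("2 + 2");
            for (SnippetEvent event : events) {
                // the value is reported as a String, bound to a
                // temporary variable such as $1 inside the shell
                System.out.println(event.value());
            }
        }
    }
}
```

This is the same machinery the interactive jshell tool sits on, which is what makes the tool embeddable in IDEs, admin consoles and the domain-specific shells discussed above.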


Making the right database decisions

MySQL is a great NoSQL

Nowhere else are business decisions as hype-oriented as in IT. And while NoSQL is all well and good, MySQL is often the sensible choice in terms of operational cost and scalability.

by Aviran Mordo

NoSQL is a set of database technologies built to handle massive amounts of data or specific data structures foreign to relational databases. However, the choice to use a NoSQL database is often based on hype, or a wrong assumption that relational databases cannot perform as well as a NoSQL database. Operational cost is often overlooked by engineers when it comes to selecting a database.

When building a scalable system, we found that an important factor is using proven technology so that we know how to recover fast if there's a failure. Pre-existing knowledge and experience with the system and its workings – as well as being able to Google for answers – is critical for swift mitigation. Relational databases have been around for over 40 years, and there is vast industry knowledge of how to use and maintain them. This is one reason we usually default to using a MySQL database instead of a NoSQL database, unless NoSQL is a significantly better solution to the problem.

However, using MySQL in a large-scale system may have performance challenges. To get great performance from MySQL, we employ a few usage patterns. One of these is avoiding database-level transactions. Transactions require that the database maintains locks, which has an adverse effect on performance. Instead, we use logical application-level transactions, thus reducing the load and extracting high performance from the database.

For example, let's think about an invoicing schema. If there's an invoice with multiple line items, instead of writing all the line items in a single transaction, we simply write line by line without any transaction. Once all the lines are written to the database, we write a header record, which has pointers to the line items' IDs. This way, if something fails while writing the individual lines to the database, and the header record was not written, then the whole transaction fails. A possible downside is that there may be orphan rows in the database. We don't see it as a significant issue though, as storage is cheap and these rows can be purged later if more space is needed.

High-performance MySQL usage patterns
Here are some of our other usage patterns to get great performance from MySQL:

• Do not have queries with joins; only query by primary key or index.
• Do not use sequential primary keys (auto-increment) because they introduce locks. Instead, use client-generated keys, such as GUIDs. Also, when you have master-master replication, auto-increment causes conflicts, so you will have to create key ranges for each instance.
• Any field that is not indexed has no right to exist. Instead, we fold such fields into a single text field (JSON is a good choice).

We often use MySQL simply as a key-value store. We store a JSON object in one of the columns, which allows us to extend the schema without making database schema changes. Accessing MySQL by primary key is extremely fast, and we get submillisecond read time by primary key, which is excellent for most use cases. So we found that MySQL is a great NoSQL that's ACID-compliant.

In terms of database size, we found that a single MySQL instance can work perfectly well with hundreds of millions of records. Most of our use cases do not have more than several hundred million records in a single instance. One big advantage to using relational databases as opposed to NoSQL is that you don't need to deal with the eventually consistent nature displayed by most NoSQL databases. Our developers all know relational databases very well, and it makes their lives easy.

Don't get me wrong, there is a place for NoSQL; relational databases have their limits – single host size and strict data structures. Operational cost is often overlooked by engineers in favour of the cool new thing. If the two options are viable, we believe we need to really consider what it takes to maintain it in production and decide accordingly.

Aviran Mordo is the head of back-end engineering at Wix. He has over twenty years of experience in the software industry and has filled many engineering roles and leading positions, from designing and building the US national Electronic Records Archives prototype to building search engine infrastructures.

From 0 to 60 Million Users: Scaling with Microservices and Multi-Cloud Architecture
Hear Aviran Mordo speak at the JAX London: Many small startups build their systems on top of a traditional toolset. These systems are used because they facilitate easy development and fast progress, but many of them are monolithic and have limited scalability. As a startup grows, the team is confronted with the problem of how to evolve and scale the system.
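The application-level "transaction" described in the invoicing example – write the line items first, write the header last – can be sketched as follows. This is an illustration only: the in-memory maps stand in for the two MySQL tables, and all names are ours, not an actual production schema:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

public class InvoiceStore {
    // stand-ins for the line-item and header tables
    final Map<String, String> lineItems = new HashMap<>();
    final Map<String, List<String>> headers = new HashMap<>();

    void writeInvoice(String invoiceId, List<String> lines) {
        List<String> lineIds = new ArrayList<>();
        // 1. write each line individually - no database-level transaction
        for (String line : lines) {
            // client-generated key (GUID), avoiding auto-increment locks
            String lineId = UUID.randomUUID().toString();
            lineItems.put(lineId, line);
            lineIds.add(lineId);
        }
        // 2. write the header last, pointing at the line-item IDs;
        // until the header exists, the invoice is logically absent
        headers.put(invoiceId, lineIds);
    }

    boolean exists(String invoiceId) {
        // lines without a header are harmless orphans, purged later
        return headers.containsKey(invoiceId);
    }
}
```

If a crash happens between steps 1 and 2, the header is missing, so the invoice logically never happened; the stray line rows are the cheap-to-keep orphans the article mentions.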


Rethinking how we think about self-service cloud BI Business intelligence must evolve Every employee and every end user should have the right to find answers using data analytics. But the current reliance on IT for key information is creating an unnecessary bottleneck, says DataHero’s Chris Neumann. by Chris Neumann normalization and categorization, self-service cloud BI isn’t possible. “Self-service” is a term that gets used a lot in the business in- telligence (BI) space these days. In reality, data analytics has The whole is greater than the sum of its parts largely ignored the group of users that really need self service, While self-service cloud BI is already possible, the users are even as that user base has grown. More than ever people realize often new to the world of data analytics. That means that the the value of data, but non-technical users are still left out of the tools too must evolve as the users become more sophisticated conversation. While everything from storage to collaboration and new possibilities emerge. tools have become simple enough for anyone to download and For example, without data analytics, a marketer might log begin using, BI and data analytics tools still require end users into a Google Analytics dashboard, then MailChimp, then to be experts or seek the help of experts. That needs to change. Salesforce to take the pulse of a marketing campaign. Each ser- Users should be able to get up and running on data analyt- vice provides its own value, but when combined the marketer ics and connect to the services they use most, easily. More em- can use a common attribute, like email address, and create a ployees in every department are expected to make decisions third dataset. What comes out of that is a much more pure based on their data, but that doesn’t mean everyone needs answer to the marketers question: “how successful is my cam- to be a data analyst or data scientist. 
Business users want to paign?” analyse data that lives in the services they use everyday, like Google Analytics, MailChimp and Salesforce are a common Google Analytics, HubSpot, Marketo, and Shopify – and even combination but there are many combinations that may be Excel, and know the questions they need answered. What just as valuable but have yet to be explored. With the prolifer- they need are truly self-service tools to get those answers. ation of cloud applications, the possibilities are nearly endless. The new users of BI and data analytics have also never had Calls for change the opportunity to work with one another. To continue with While vendor jargon and the obsession with may be the example, a marketer may have created the charts needed clouding the self-service cloud BI conversation, experts and to monitor KPIs and put them into a dashboard, but these enterprises are recognizing that things need to change. Leading KPIs need to be shared with internal teams, clients and execu- analyst firms like Forrester and Gartner are recognizing that tives. Reporting is normally a one-way process when it should BI must evolve. When business users depend on IT teams to be iterative and collaborative and allow clients and executives get answers, a bottleneck is created. End users are demanding to provide real feedback on the most up-to-date numbers. tools they can use on their own without having to go to IT. There are a number of vendors connecting to cloud services. The consumerization of BI But, connecting in a way that facilitates effective data analysis BI and data analytics have largely missed the consumerization presents a myriad of additional challenges from navigating of IT trend, despite industry-wide use of the term self service. the sheer variety of formats to categorizing unspecified data. That doesn’t mean that change isn’t coming. 
The shift to the At DataHero, we’ve built the requisite connectors for ac- cloud is continuing to accelerate and the emerging self-service cessing the data within cloud services. We’ve also taken the cloud BI space is quickly heating up, driven by user demand next steps with a data classification engine that automates and a need to decouple analytics from IT. ETL and recognizes that what a cloud service might call “text” is actually an important field. In order to successfully integrate Chris Neumann is the founder and Chief Product Officer of DataHero, these connections, solutions must automatically normalize the where he aims to help everyone unmask the clues in their data. Previ- ously he was the first employee at Aster Data Systems and describes data from disparate services, matching attributes and allow- himself as a data-analytics junkie, a bona fide techie and a self-proclaimed ing the data to be combined and analysed. Without automatic foodie.
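Mechanically, the "third dataset" described above is a join on a shared attribute. A toy sketch of that idea follows — all field names and records are invented for illustration, and a real BI tool would first normalize formats across services, as discussed above:

```java
import java.util.HashMap;
import java.util.Map;

public class CampaignJoin {
    // join two per-service datasets on their common key: email address
    static Map<String, String> combine(Map<String, String> emailOpens,
                                       Map<String, String> purchases) {
        Map<String, String> joined = new HashMap<>();
        for (Map.Entry<String, String> open : emailOpens.entrySet()) {
            String purchase = purchases.get(open.getKey());
            if (purchase != null) {
                // keep only contacts present in both services
                joined.put(open.getKey(), open.getValue() + " -> " + purchase);
            }
        }
        return joined;
    }
}
```

The joined map answers the campaign question neither source can answer alone: which contacts who opened the mailing actually converted.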

In London since 2010

The Conference for Java & Software Innovation

Group Discount – save 30%

October 12 – 14th, 2015 Business Design Centre, London

The Enterprise Conference on Java, Web & Mobile, Developer Practices, Agility and Big Data!

Follow us: @JAXLondon

www.jaxlondon.com

October 12th – 14th, 2015 | Business Design Centre, London

Join us for JAX London 2015

JAX London provides a 3-day conference experience for cutting-edge software engineers and enterprise-level professionals, with attendees from across the globe. JAX brings together the world's leading Java and JVM experts as well as many innovators in the fields of Microservices, Continuous Delivery and DevOps to share their knowledge and experience. In the spirit of agile methodology and lean business, JAX London is the place to define the next level of ultra-efficient and super-adaptive technology for your organization.

Learn how to increase your productivity, identify which technologies and practices suit your specific requirements and learn about new approaches. Monday is a pre-conference workshop and tutorial day; the half-day and full-day workshops are overseen by experts. On Tuesday and Wednesday the conference proper takes place – with more than 60 technical sessions, keynotes, the JAX Expo, community events and more. For more information and the latest news about the conference and our speakers, check out www.jaxlondon.com.

Keynotes

Jeff Sussna (Ingineering.IT)
Jeff Sussna is Founder and Principal of Ingineering.IT, a Minneapolis technology consulting firm that helps enterprises and Software-as-a-Service companies adopt 21st-century IT tools and practices. Jeff has nearly 25 years of IT experience. He has led high-performance teams across the Development/QA/Operations spectrum and has a track record of driving quality improvements through practical innovation. Jeff has done work for a diverse range of companies, including Fortune 500 enterprises, major technology companies, software product and service startups, and media conglomerates. Jeff combines engineering expertise with the ability to bridge business, creative, and technical perspectives. He has the insight and experience to uncover problems and solutions others miss. He is a highly sought-after speaker and writer respected for his insights on topics such as Agile, DevOps, Service Design, and cloud computing. Jeff's interests focus on the intersection of development, operations, design, and business. He is the author of "Designing Delivery: Rethinking IT in the Digital Service Economy". Designing Delivery explores the relationship between IT and business in the 21st century, and presents a unified approach to designing and operating responsive digital services.

KEYNOTE: From Design Thinking to DevOps and Back Again: Unifying Design and Operations
The era of digital service is shifting customers' brand expectations from stability to responsiveness. Optimizing delivery speed is only half of this new equation. Companies also need to optimize their ability to listen and to act on what they hear. In order to maximize both velocity and responsiveness, companies need to transform up-front design into a continuous, circular design-operations loop that unifies marketing, design, development, operations, and support.

Adrian Colyer (Accel Partners)
Adrian is a Venture Partner with Accel Partners in London, and the author of "The Morning Paper", where he reviews an interesting CS-related paper every weekday. He's also an advisor to ClusterHQ, Skipjaq, and Weaveworks. Previously Adrian served in CTO roles at Pivotal, VMware, and SpringSource. Adrian's extensive open source experience includes working with the teams that created the Spring Framework and related Spring projects, Cloud Foundry, RabbitMQ, Redis, Groovy, Grails, and AspectJ, as well as with team members making significant contributions to Apache Tomcat and Apache HTTP Server.

KEYNOTE: VC from the inside – a techie's perspective
After many years in CTO roles with SpringSource, VMware, and Pivotal, and having experienced what it is like to work in a VC-backed company, in June of 2014 Adrian switched sides and joined the venture capital firm Accel Partners in London. So what exactly does a technologist do inside a venture capital firm? And having been part of the process from the inside, how do investment decisions get made? In this talk Adrian will share some of the lessons he's learned since embedding in the world of venture capital, and how you can maximise your chances of investment and a successful company-building partnership.

Rachel Davies (Unruly)
Rachel Davies coaches product development teams at Unruly (tech.unruly.co) in London. She is the author of "Agile Coaching" and an invited speaker at industry events around the globe. Her mission is to create workplaces where developers enjoy delivering valuable software. Rachel is a strong advocate of XP approaches and an organiser of the Extreme Programmers London meet-up.

KEYNOTE: The Art of Shifting Perspectives
Developers love writing code, but to build resilient industry-scale systems we often need to persuade others to make changes to both code and working practices. As a coach, my job is to help developers spot areas for improvement and act on their ideas. Core to this work is opening up different "ways of seeing" the work that lies ahead. In this new talk, I will share some stories of changes the teams I work with have made and explain some mechanisms that we applied to make those changes. Teams I work with at Unruly use eXtreme Programming (XP) techniques to build our systems. Modern XP has many counter-intuitive practices, such as mob and pair programming. How did new ways of seeing old problems help us resolve them? Come along to this talk to hear about some practical techniques you can use to help solve tricky problems and get others on board with your idea by shifting perspective.

jaxlondon.com October 12th – 14th, 2015 Business Design Centre, London

Timetable

Monday – October 12th

09:00 – 17:00
Design & Implementation of Microservices – James Lewis
Designing and Operating User-Centered Digital Services – Jeff Sussna
Workshop: Lambdas and Streams in Java 8 – Angelika Langer, Klaus Kreft
Workshop: Crafting Code – Sandro Mancuso
Workshop on Low Latency Logging and Replay – Peter Lawrey

Tuesday – October 13th

09:00 – 10:00
KEYNOTE: From Design Thinking to DevOps and Back Again: Unifying Design and Operations – Jeff Sussna

10:15 – 11:05
Benchmarking: You're Doing It Wrong – Aysylu Greenberg
The Performance Model of Streams in Java 8 – Angelika Langer
Open Source workflows with BPMN 2.0, Java and Camunda BPM – Niall Deehan
DevOps, what should you decide, when, why & how – Vinita Rathi

11:40 – 12:10
Java Generics: Past, Present and Future – Richard Warburton, Raoul-Gabriel Urma

11:40 – 12:30
Smoothing the continuous delivery path – a tale of two teams – Lyndsay Prewer

14:30 – 15:20
2000 Lines of Java or 50 Lines of SQL? The Choice is Yours – Lukas Eder
From 0 to 60 Million Users: Scaling with Microservices and Multi-Cloud Architecture – Aviran Mordo
How to defeat feature gluttony? – Kasia Mrowca

15:50 – 16:40
Costs of the Cult of Expertise – Jessica Rose
Cluster your Application using CDI and JCache – Jonathan Gallimore
Distributed Systems in one Lesson – Tim Berglund
Garbage Collection Pause Times – Angelika Langer
Technology Innovation Diffusion – Jeremy Deane

17:10 – 18:00
Continuous delivery – the missing parts – Paul Stack
Pragmatic Functional Refactoring with Java 8 – Richard Warburton, Raoul-Gabriel Urma
Preparing your API Strategy for IoT – Per Buer
Use your type system; write less code – Samir Talwar
A pattern language for microservices – Chris Richardson

18:15 – 18:45
All Change! How the new Economics of Cloud will make you think differently about Java – Steve Poole, Chris Bailey
Le Mort du Product Management – Nigel Runnels-Moss

20:00 – 21:00
KEYNOTE: VC from the inside – a techie's perspective – Adrian Colyer

Wednesday – October 14th

09:00 – 09:45
KEYNOTE: The Art of Shifting Perspectives – Rachel Davies

10:00 – 10:50
Advanced A/B Testing – Aviran Mordo
Architectural Resiliency – Jeremy Deane
Cassandra and Spark – Tim Berglund
Lambdas Puzzler – Peter Lawrey

11:20 – 12:10
Coding for Desktop and Mobile with HTML5 and Java EE 7 – Geertjan Wielenga
Intuitions for Scaling Data-Centric Architectures – Benjamin Stopford
Microservices: From dream to reality in an hour – Dr. Holly Cummins

12:20 – 13:10
Does TDD really lead to good design? – Sandro Mancuso
DevOps and the Cloud: All Hail the (Developer) King! – Daniel Bryant, Steve Poole
Fowler, Fielding, and Haussmann – Network-based Architectures – Eric Horesnyi
Java vs. JavaScript for Enterprise Web Applications – Chris Bailey

15:30 – 16:20
The Dark Side of Software Metrics – Nigel Runnels-Moss
The Unit Test is dead. Long live the Unit Test! – Colin Vipurs
Events on the outside, on the inside and at the core – Chris Richardson
Architecting for a Scalable Enterprise – John Davies


JAX London Workshop Day

Workshop on Low Latency Logging and Replay
Peter Lawrey (Higher Frequency Trading Ltd)
A workshop for beginners to advanced developers on how to write and read data efficiently in Java. The workshop will cover the following: an advanced review of how the JVM really uses memory; what references are; what compressed OOPS is; how the fields in an object are laid out; using Maven to build a project using Chronicle; setting up a simple Maven project; using modules from Maven Central; assembling a Maven build; how memory mapped files work on Windows and …; storing data in a memory mapped file; sharing data between JVMs via memory mapped files; what Unsafe is and how it works; using Unsafe to see the contents of an object in memory; using Unsafe to access native memory; writing and reading data to a Chronicle Queue; using raw bytes; using a wire format; designing a system with low latency persisted IPC; a simple order matching system example. Advanced content will be added into the early sessions to keep advanced users interested, and the later topics will have pre-built working examples to build on.

Designing and Operating User-Centered Digital Services
Jeff Sussna (Ingineering.IT)
With software eating the world, 21st-century business increasingly depends on IT, not just for operational efficiency, but for its very existence. In a highly disruptive service economy, IT-driven businesses must continually adapt to ever-changing customer needs and market demands. To power the adaptive organization, IT needs to become a medium for continuous, empathic customer conversations. This workshop teaches participants how to design and operate systems and organizations that help businesses create value through customer empathy. It introduces them to the theory and practice of Continuous Design, a cross-functional practice that interconnects marketing, design, development, and operations into a circular design/operations loop. Participants learn how to: align software designs with operational, business, and customer needs; maximize quality throughout the design, development, and operations lifecycle; create highly resilient and adaptable systems, practices, and organizations. The workshop takes place in two sessions. Morning – Introduction to Continuous Design: this session introduces the principles of Continuous Design. It grounds those principles in the historical, philosophical, and economic underpinnings that link methodologies such as Design Thinking, Agile, DevOps, and Lean. By providing a strong theoretical grounding in new ways of knowing, this session gives participants the ability to evaluate the effectiveness of specific tools and practices, and to continually adapt them to meet their own needs and constraints. Afternoon – Applying Continuous Design: this session introduces a concrete methodology for applying Continuous Design to real-world problems.

Workshop: Crafting Code
Sandro Mancuso (Codurance)
This course is designed to help developers write well-crafted code – code that is clean, testable, maintainable, and an expression of the business domain. The course is entirely hands-on, designed to teach developers practical techniques they can immediately apply to real-world projects. Software Craftsmanship is at the heart of this course. Throughout, you will learn about the Software Craftsmanship attitude to development and how to apply it to your workplace. Writing Clean Code is difficult. Cleaning existing code, even more so. You should attend if you want to: write clean code that is easy to understand and maintain; become more proficient in Test-Driven Development (TDD), using tests to design and build your code base; focus your tests and production code according to business requirements using Outside-In TDD (a.k.a. the London School of TDD). Clean code necessitates good design. In the process of driving your code through tests, you will learn how to understand design principles that lead to clean code, and how to avoid over-engineering and large rewrites by incrementally evolving your design using tests. Once you have an understanding of the principles at work, we will apply them to Legacy Code to help you gain confidence in improving legacy projects through testing, refactoring and redesigning. The content will be: the TDD lifecycle and the Outside-In style of TDD; writing unit tests that express intent, not implementation; using unit tests as a tool to drive good design; expressive code; testing and refactoring Legacy Code.

Design & Implementation of Microservices
James Lewis (ThoughtWorks)
Microservices Architecture is a concept that aims to decouple a solution by decomposing functionality into discrete services. Microservice architectures can lead to easier-to-change, more maintainable systems which can be more secure, performant and stable. In this workshop you will discover a consistent and reinforcing set of tools and practices rooted in the philosophy of small and simple that can help you move towards a microservice architecture in your own organisation: small services, communicating via the web's uniform interface, with single responsibilities and installed as well-behaved services. However, with these finer-grained systems come new sources of complexity. What you will learn: during this workshop you will understand in more depth what the benefits are of finer-grained architectures, how to break apart your existing monolithic applications, and what the practical concerns are of managing these systems. We will discuss how to ensure your systems can be made more stable, how to handle security, and how to handle the additional complexity of monitoring and deployment. We will cover the following topics: principle-driven evolutionary architecture; capability modelling and the town planning metaphor; REST, web integration and event-driven systems of systems; microservices, versioning, consumer-driven contracts and Postel's law. Who should attend: Developers, Architects, Technical Leaders, Operations Engineers and anybody interested in the design and architecture of services and components.

Workshop: Lambdas and Streams in Java 8
Angelika Langer (Angelika Langer Training/Consulting), Klaus Kreft
This workshop is devoted to the stream framework, which is an extension to the JDK collection framework. Streams offer an easy way to parallelize bulk operations on sequences of elements. The stream API differs from the classic collection API in many ways: it supports a fluent programming style and borrows elements from functional languages. For instance, streams have operations such as filter, map, and reduce. The new language features of lambda expressions and method references have been added to Java for effective and convenient use of the stream API. In this workshop we will introduce lambda expressions and method/constructor references, give an overview of the stream operations and discuss the performance characteristics of sequential vs. parallel stream operations. Attendants are encouraged to bring their notebooks. We will not only explore the novelties in theory, but intend to provide enough information to allow for hands-on experiments with lambdas and streams.

Cloud

Database solutions
The future of cloud computing

The cloud has changed everything. And yet the cloud revolution at the heart of IT is only getting started. As data becomes more and more important, we're beginning to realise how central a role the database will play in the future.

by Zigmars Raascevskis

Cloud computing engines today allow businesses to easily extend their IT infrastructure at any time. This means that you can rent servers with only a few clicks, and various software stacks including web servers, middleware and databases can be installed and run on these server instances with little-to-no effort. With data continuing to aggregate at a rapid speed, the database is becoming a large part of this infrastructure. By leveraging conventional cloud computing, every business can run its own database stack in the cloud the same way as if it were on-premise.

There's still a huge amount of potential to accelerate speed and efficiency by using a multi-tenant database. For multi-tenant distributed databases, a certain amount of servers in a cloud footprint are set aside for managing databases, but these resources are shared by many users. This opens up the possibility of improving the speed and efficiency of the IT infrastructure within organizations. A combined database footprint has massive resources and the ability to parallelize a much wider range of requests than users with their own dedicated servers. Such a setup allows faster run time and avoids the painful sizing and provisioning process associated with on-premise infrastructure and traditional cloud computing. So what should businesses look for when selecting a database solution? A multi-tenant database solution is worth considering given it can help overcome the following challenges.

www.JAXenter.com | August 2015 22 Cloud

“Distributed databases can serve as a solid foundation for distributed computing that is massively parallel and instantly scalable.”

I – Failure tolerance of distributed systems
By design, distributed systems with state replication are resistant against most forms of single machine failures. Guarding against single machine hardware failures is relatively straightforward. With the distributed database design, every database is hosted on multiple machines that replicate each partition several times. Therefore, in the case of server failure, each system routes traffic to healthy replicas to make sure that data is replicated elsewhere – ensuring higher availability. However, making distributed systems tolerant against software failures is much more difficult, due to common cause, and presents a difficult challenge. The ultimate power of distributed systems comes from parallelism, but this also means that the same code is executed on every server participating in fulfilling the request. If working on a particular request causes a fatal failure that has a negative impact on the operation of a system or even crashes it, the entire cluster is immediately affected.

Sophisticated methods are necessary to avoid such correlated failures, which might be rare but have devastating effects. One method involves trying each query on a few isolated computational nodes before sending it down to the entire cluster with massive parallelism. Once failures are observed in the sandbox, suspicious requests are immediately quarantined and isolated from the rest of the system.

II – Performance guarantees in a multi-tenant environment
Another common problem that often manifests itself in public clouds is the "noisy neighbour" issue. When many users share computational resources, it is important to ensure that they are prioritized and isolated properly, so that sudden changes in behaviour of one user do not have an adverse impact on another. A common approach for computing engines has been isolation of resources into containers. This requires giving each user a certain sized box that it cannot break out from – providing a level of isolation – however, it's not flexible in terms of giving users enough resources exactly when they need them. Effective workload scheduling, low-level resource prioritization and isolation are key techniques to achieving predictable performance.

A multi-tenant database software stack actually provides more opportunities to share and prioritize resources dynamically while providing performance guarantees. This is possible because the database software can manage access to critical resources like a CPU core or a spinning disk through a queue of requests that are accessing the resource. The provisioning process ensures that there are enough aggregated resources in the cluster. However, in the case that some user behaves unpredictably, the software stack is able to control the queues and can make sure that only the offender is affected, and other users whose resource usage patterns are unchanged remain unaffected. Additionally, management of request queues can ensure, through prioritization, that the end user's latency metrics are optimised by picking the next request from the queue.

III – ACID-compliant transactions: a NoSQL challenge
Another obstacle for massively parallel distributed systems has been consistency guarantees. For NoSQL distributed databases, ensuring transactional consistency and ACID properties has been a real problem. This is due to the fact that with a distributed database, many nodes have to be involved in processing the transaction and it is not obvious how to act in cases of failure. Plus, the state of the cluster has to be synchronized to ensure consistency, which presents high overheads in a highly distributed environment.

Instead of compromising performance or consistency, investment needs to be made to make database software scale while preserving consistency. For example, transactional consistency can be managed through the use of a transaction log, which can in turn be distributed and replicated for high throughput and durability.

Distributed databases can serve as a solid foundation for distributed computing that is massively parallel and instantly scalable. In this respect NoSQL technologies and their community can leverage this trend to contribute to the architecture of a "future computer". By understanding the benefits of a multi-tenant system and adopting the appropriate solutions, organizations can experience instant scalability and massive parallelism within their own data infrastructures.

Zigmars Raascevskis left a senior engineering position at Google to join Clusterpoint as the company CEO, foreseeing that document-oriented databases would take the market by storm. Prior to joining Clusterpoint, Zigmars worked for 8 years at Google, where among other projects he managed the web search backend software engineering team in Zurich. Before his Google career, Zigmars worked for Exigen, a leading regional IT company, and Lursoft, a leading regional information subscription service company.
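The per-tenant request queueing described in section II can be sketched in a few lines of JDK-only Java. Everything here is illustrative – the tenant names, the simple round-robin "fair drain" policy, the string requests – it is one minimal way to realise the idea, not Clusterpoint's implementation.

```java
import java.util.*;

// Illustrative sketch: a queue per tenant in front of a shared resource.
// The scheduler rotates across tenants, so a "noisy" tenant that floods
// its queue cannot starve the others.
public class TenantScheduler {
    private final Map<String, Deque<String>> queues = new LinkedHashMap<>();

    public void submit(String tenant, String request) {
        queues.computeIfAbsent(tenant, t -> new ArrayDeque<>()).add(request);
    }

    /** Drain fairly: at most one request per tenant per rotation. */
    public List<String> drainFairly() {
        List<String> order = new ArrayList<>();
        boolean any = true;
        while (any) {
            any = false;
            for (Deque<String> q : queues.values()) {
                String r = q.poll();
                if (r != null) { order.add(r); any = true; }
            }
        }
        return order;
    }

    public static void main(String[] args) {
        TenantScheduler s = new TenantScheduler();
        // Tenant A is "noisy" and submits three requests at once.
        s.submit("A", "A1"); s.submit("A", "A2"); s.submit("A", "A3");
        s.submit("B", "B1");
        // B1 is served right after A1, not after all of A's backlog.
        System.out.println(s.drainFairly()); // [A1, B1, A2, A3]
    }
}
```

A production scheduler would of course weight tenants, track latency targets and apply the quarantining described in section I, but the core idea is the same: the software stack, not the hardware partition, decides who runs next.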

Tests

Dos and Don’ts
Testing the Database Layer

There’s one thing we can agree on when it comes to database tests: they ain’t easy. Testing guru and JAX London speaker Colin Vipurs runs through the strengths and weaknesses of common approaches to testing databases.

by Colin Vipurs

Over my many years of software development I've had to perform various levels of testing against many different database instances and types, including RDBMS and NoSQL, and one thing remains constant – it's hard. There are a few approaches that can be taken when testing the database layer of your code and I'd like to go over a few of them, pointing out the strengths and weaknesses of each.

Mocking
This is a technique that I have used in the past but I would highly recommend against doing now. In my book "Tests Need Love Too" I discuss why you should never mock any third-party interface, but just in case you haven't read it (you really should!) I'll go over it again. As with mocking any code you don't own, what you're validating is that you're calling the third-party code in the way you think you should, but, and here's the important part – this might be incorrect. Unless you have higher level tests covering your code, you're not going to know until it hits production.

In addition to this, mocking raw JDBC is hard, like really hard. Take for example the test code snippet in Listing 1. Within this test, not only is there a huge amount of expectations to set up, but in order to verify that all the calls happen in the correct order, jMock "states" are used extensively. Because of the way JDBC works, this test also violates the guideline of never having mocks returning mocks and in fact goes several levels deep! Even if you manage to get all of this working, something as simple as a typo in your SQL can mean that although your tests are green, this will still fail when your code goes to production.

A final note on mocking – no sane developer these days would be using raw JDBC, but one of the higher-level abstractions available, and the same rules apply for these. Imagine a suite of tests set up to mock against JDBC and your code switches to Spring JdbcTemplate, jOOQ or Hibernate. Your tests will now have to be rewritten to mock against those frameworks instead – not an ideal solution.

Listing 1

@Test
public void testJdbc() {
  final Connection connection = context.mock(Connection.class);
  final ResultSet resultSet = context.mock(ResultSet.class);
  final PreparedStatement preparedStatement = context.mock(PreparedStatement.class);
  final States query = context.states("query").startsAs("pre-prepare");

  context.checking(new Expectations() {{
    oneOf(connection).prepareStatement("SELECT firstname, lastname, occupation FROM users");
        then(query.is("prepared")); will(returnValue(preparedStatement));
    oneOf(preparedStatement).executeQuery();
        when(query.is("prepared")); then(query.is("executed")); will(returnValue(resultSet));
    oneOf(resultSet).next(); when(query.is("executed")); then(query.is("available"));
    oneOf(resultSet).getString(1); when(query.is("available")); will(returnValue("Hermes"));
    oneOf(resultSet).getString(2); when(query.is("available")); will(returnValue("Conrad"));
    oneOf(resultSet).getString(3); when(query.is("available")); will(returnValue("Bureaucrat"));
    oneOf(resultSet).close(); when(query.is("available"));
    oneOf(preparedStatement).close(); when(query.is("available"));
  }});
}

Testing Against a Real Database
It may sound silly, but the best way to verify that your database interaction code works as expected is to actually have


it interact with a database! As well as ensuring you're using your chosen API correctly, this technique can verify things that mocking never can – for example, that your SQL is syntactically correct and does what you hope.

In-Memory Databases: One of the easiest and quickest ways to get set up with a database to test against is to use one of the in-memory versions available, e.g. H2, HSQL or Derby. If you're happy introducing a Spring dependency into your code, then the test setup can be as easy as this (Listing 2). This code will create an instance of the H2 database, load the schema defined in schema.sql and any test data in test-data.sql. The returned object implements javax.sql.DataSource so can be injected directly into any class that requires it.

Listing 2

public class EmbeddedDatabaseTest {
  private DataSource dataSource;

  @Before
  public void createDatabase() {
    dataSource = new EmbeddedDatabaseBuilder().
        setType(EmbeddedDatabaseType.H2).
        addScript("schema.sql").
        addScript("test-data.sql").
        build();
  }

  @Test
  public void aTestRequiringADataSource() {
    // execute code using DataSource
  }
}

One of the great benefits of this approach is that it is fast. You can spin up a new database instance for each and every test requiring it, giving you a cast-iron guarantee that the data is clean. You also don't need any extra infrastructure on your development machine as it's all done within the JVM. This mechanism isn't without its drawbacks though.

Unless you're deploying against the same in-memory database that you're using in your test, inevitably you will run up against compatibility issues that won't surface until you hit higher-level testing or, god forbid, production. Because you're using a different DataSource to your production instance, it can be easy to miss configuration options required to make the Driver operate correctly. Recently I came across such a setup where H2 was configured to use a DATETIME column requiring millisecond precision. The same schema definition was used on a production MySQL instance, which not only required this to be DATETIME(3) but also needs the useFractionalSeconds=true parameter provided to the driver. This issue was only spotted after the tests were migrated from using H2 to a real MySQL instance.

Real Databases: Where possible I would highly recommend testing against a database that's as close as possible to the one being run in your production environment. A variety of factors can make this difficult or even impossible, such as commercial databases requiring a license fee, meaning that installing on each and every developer machine is prohibitively costly. A classic way to get around this problem is to have a single development database available for everyone to connect to. This in itself can cause a different set of problems, not least of which are performance (these always seem to get installed on the cheapest and oldest hardware) and test repeatability. The issue with sharing a database with other developers is that two or more people running the tests at the same time can lead to inconsistent results and data shifting in an unexpected way. As the number of people using the database grows, this problem gets worse – throw the CI server into the mix and you can waste a lot of time re-running tests and trying to find out if anyone else is running tests right now in order to get a clean build.

If you're running a "free" database such as MySQL or one of the many free NoSQL options, installing on your local development machine can still be problematic – issues such as needing to run multiple versions concurrently, or keeping everyone informed of exactly what infrastructure needs to be up and what ports it needs to be bound to. This model also requires the software to be up and running prior to performing a build, making onboarding staff onto a new project more time consuming than it needs to be.

The Unit Test is dead. Long live the Unit Test!
Hear Colin Vipurs speak at the JAX London: Unit tests are the lifeblood of any modern development practise, helping developers not only ensure the robustness of their code but to also speed up the development cycle by providing fast feedback on code changes. In reality this isn't always the case and even with the most diligent of refactorings applied, unit tests can actually become a hindrance to getting the job done effectively.

Thankfully over the last few years several tools have appeared to ease this, the most notable being Vagrant and Docker. As an example, spinning up a local version of MySQL in Docker can be as easy as issuing the following command:

$ docker run -p 3306:3306 -e MYSQL_ROOT_PASSWORD=bob mysql

This will start up a self-contained version of the latest MySQL mapped to the local port of 3306 using the root password provided. Even on my 4-year-old MacBook Pro, after the initial image download, this only takes 12 seconds. If you need Redis 2.8 running as well you can tell Docker to do that too:

$ docker run -p 6389:6389 redis:2.8

Or the latest version running on a different local port:

$ docker run -p 6390:6389 redis:latest

This can be easily plugged into your build system to make the whole process automated, meaning the only software your


developers need on the local machine is Docker (or Vagrant) event of a test failure, having the database still populated is an and the infrastructure required for the build can be packaged easy way to diagnose problems. Cleaning up state at the end into the build script! of the test leaves you no trace and as long as every test follows Testing Approach: Now you have your database up and this pattern you’re all good. running the question becomes “how should I test?”. Depend- A popular technique for seeding test data is to use a tool ing on what you’re doing the answer will vary. A greenfield like DbUnit which lets you express your data in files and have project might see a relational schema changing rapidly in the it easily loaded. I have two problems with this; the first is early stages whereas an established project will care more that if you’re using a relational database there is duplication about reading existing data. Is the data transient or long between the DB schema itself and the test data. Not only does lived? Most* applications making use of Redis would be do- a schema change require changing the dataset file(s) but the ing so with it acting like a cache so you need to worry less test data is no longer in the test class itself meaning a context about reading existing data. switch between tests and data. For an example of a of DbUnit * Most, not all. I’ve worked with a fair few systems where XML file see Listing 3. Redis is the primary data store. One question I usually hear from newcomers to DB test- The first thing to note is that for functional tests the best ing is whether they should round-trip the data or poke the thing to do is start with a clean, empty database. Repeatabil- database directly for verification. Round-tripping is an im- ity is key and an empty database is a surefire way to ensure portant part of the testing cycle as you really need to know this. 
My preference is for the test itself to take care of this, that the data you’re writing can be read back. An issue with purging all data at the beginning of the test, not the end. In the this though is that that you’re essentially testing two things at once, so if there is a failure on one side it can be hard to de- termine what that is. If you’re using TDD (of course you are) Listing 3 then tackling the problem will likely feel very uncomfortable as the time between red and green can be quite high and you won’t be getting the fast feedback you’re used to. drawbacks of each. The first test I write will be a pure read code will bypass any logic the write path might make. For ex- ample, an insert that has an “ON DUPLICATE KEY” clause will not do this and make the assumption this record does not exist as the test is in complete control of the state of the data. Listing 4 The test will then use the production code to read back what the test has inserted and presto, the read back is verified. An def "existing user can be read"() { example of a read test can be seen in Listing 4. given: Once the read path is green, the write tests will round-trip sql.execute('INSERT INTO users (id, name) VALUES (1234, "John Smith")') the data using production code for both writing and reading. when: Because the read path is known to be good, there is only the def actualUser = users.findById(1234) write path to worry about. A failure on the read path at some then: point in the future will cause both sets of tests to fail, but a actualUser.id == 1234 failure only on the write path helps isolate where the prob- actualUser.name == 'John Smith' lem is. In addition, if you’re using a test DSL for verifying } the read path, it can be reused here to save you time writing those pesky assertions! An example of a round-trip test can be seen in Listing 5. 
Listing 5

def "new user can be stored"() {
  given:
  def newUser = new User(1234, "John Smith")

  when:
  users.save(newUser)

  then:
  def actualUser = users.findById(1234)
  actualUser.id == 1234
  actualUser.name == 'John Smith'
}

Colin Vipurs started professional software development in 1998 and released his first production bug shortly after. He has spent his career working in a variety of industries using a wide range of technologies, always attempting to release bug-free code. He holds an MSc from Liverpool University and currently works at Shazam as a Developer/Evangelist. He has spoken at numerous conferences worldwide.
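For illustration, a flat-XML dataset of the kind DbUnit loads might look like the following sketch; the `users` table and its columns are assumed from Listing 4, and this is not the magazine's original Listing 3:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<dataset>
  <!-- each element names a table; each attribute maps to a column -->
  <users id="1234" name="John Smith"/>
  <users id="5678" name="Jane Doe"/>
</dataset>
```

Note how the table and column names are repeated outside the schema itself – exactly the duplication the article warns about when the schema changes.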

www.JAXenter.com | August 2015 26 Finance IT

Financial services PaaS and private clouds: Managing and monitoring disparate environments
Private cloud trends

Not all enterprises and IT teams can enjoy the luxuries of the public cloud. So let’s take a look at the limits and the risks of the alternative: the private cloud and PaaS.

by Patricia Hines

Financial Institutions (FIs) find that deploying PaaS and IaaS solutions within a private cloud environment is an attractive alternative to technology silos created by disparate server hardware, operating systems, applications and application programming interfaces (APIs). Private cloud deployments enable firms to take a software-defined approach to scaling and provisioning hardware and computing resources.

While other industries have long enjoyed the increased agility, improved business responsiveness and dramatic cost savings of shifting workloads to public clouds, many firms in highly regulated industries like financial services, healthcare and government are reluctant to adopt public cloud. As a result of increased regulatory and compliance scrutiny for these firms, the potential risks of moving workloads to public clouds outweigh any potential savings.

Private cloud and PaaS trends
The definition of what comprises a private cloud deployment varies, with some analysts and vendors equating private cloud with Infrastructure as a Service (IaaS) and others broadening the term to encompass both IaaS and Platform as a Service (PaaS). Whatever the definition, many financial services firms have already deployed private cloud, IaaS and PaaS technologies, often driven by platform simplification and consolidation initiatives.

Vendor platforms for private PaaS are gaining popularity with a wide range of available proprietary and open source solutions. Proprietary vendors include Apprenda and Pivotal (which is a commercial version built on Cloud Foundry). Open source platforms include Cloud Foundry, OpenShift, Apache Stratos and Cloudify. Many banks are choosing open source-based solutions as an insurance policy against vendor lock-in. Moreover, with the source code under the pressure of public scrutiny, the quality of these applications is often higher than their proprietary rivals.

Business drivers for private cloud and PaaS adoption
According to Forrester, the top two business drivers for private cloud adoption are improved IT manageability and flexibility, followed by a transformed IT environment with optimized systems of record and empowered developers. For those citing improved IT manageability and flexibility, there is a desire to collect, analyse and centralize error and event logs to manage and monitor performance against SLAs. For those adopting private cloud to empower developers, the choice is viewed as a foundational element to allow developer self-service for provisioning application environments and deploying code throughout the application lifecycle. PaaS promises to abstract applications from their underlying infrastructure, enabling faster deployment and time to market.

Limitations of private cloud and PaaS
Most large banks have thousands of systems in place to support millions of customers. They host these systems on a complex, heterogeneous mix of systems, many of which have been in place for a long time. For example, many core banking systems are still running on IBM mainframes and AS/400 platforms because of their security, reliability, scalability and resiliency. FIs continue to depend on third-party hosted applications for functions ranging from bill pay to credit checks, which, along with SaaS applications for CRM and HR management, will remain outside of the private cloud's domain.

As firms evaluate their private cloud architecture, they need to consider how they can achieve their business goals of improved IT manageability and empowered developers across a heterogeneous, hybrid environment. Although it is possible to re-host and re-architect core legacy systems onto modern platforms like Java and .Net, these projects will extend far into the future. As a result, financial institutions need to manage and monitor disparate environments, each with its own challenges and restrictions, for the foreseeable future.

When a FI adopts private cloud and PaaS technologies to simplify IT management for application deployment, they are adding another technology stack to the already complex mix. To make matters worse, some FIs have deployed (or are evaluating) multiple private cloud and PaaS platforms, often with disparate capabilities and restrictions, and proprietary APIs. With the mix of private cloud, IaaS, and PaaS environments that must coexist with legacy infrastructure, critical "health" managing and monitoring becomes more difficult.


Even if a firm decides to eventually re-architect legacy applications for private PaaS hosting or move workloads across multiple PaaS solutions, it is critical that organizations develop an overarching connectivity strategy to seamlessly tie together systems, data and workflow that accommodates a long-term migration journey. In order to achieve a "single pane of glass" for managing and monitoring, organizations need the ability to connect and integrate the various environments and enable service discovery, naming, routing, and rollback for SOAP web services, REST APIs, microservices and data sources.

Managing disparate environments
The combination of endpoints – data sources, applications, web services, APIs and processes – is ever growing and evolving. In order to orchestrate a well governed but agile application landscape, IT architects need to reconsider their integration approach. A unified integration platform can handle any type of integration scenario, particularly high-complexity requirements for high performance, throughput and security involving a combination of application, B2B, and SaaS integration needs, whether on-premises or in the cloud. Organizations facing the need to manage heterogeneous architectural environments have an opportunity to address a wide range of requirements by means of a unified, full stack for connectivity on one platform – connectivity, orchestration, services, and APIs.

As firms adopt multi-vendor solutions, they need a way to abstract the complexity of their private cloud vendor and architecture decisions. With a unified connectivity solution, you can beta test multiple PaaS environments using an independent orchestration layer with a single API layer to back-end systems and databases. The connectivity layer helps you to avoid PaaS vendor lock-in while increasing interoperability and data portability.

A unified integration layer enables organizations to take an API-led connectivity approach for xPaaS (Application Platform-as-a-Service, Database Platform-as-a-Service, Middleware Platform-as-a-Service, etc.) integration and management. API-led connectivity packages underlying connectivity and orchestration services into easily composable, discoverable and reusable building blocks. Reusable building blocks accelerate time to market for new products and services, whether packaged vs. custom, on-premise vs. off-premise. Rather than each developer needing to have a deep understanding of an external application's API intricacies, they can use the integration layer to compose their applications with connectivity as needed to easily automate tasks, access databases and call web services by leveraging APIs.

Private cloud, IaaS and PaaS technologies are on the IT agendas of many financial services firms. But those technologies are just one piece of the infrastructure puzzle. In order to simplify IT management and empower developers, you need a blending and bridging of environments that delivers agility across infrastructure silos. MuleSoft's Anypoint Platform is the only solution that enables end-to-end connectivity across API, service orchestration and application integration in a single platform.

Figure 1: MuleSoft

The single platform enables IT organizations to take a bimodal approach to private cloud management – driving speed to market and agility while enforcing a governance process to avoid fragmentation and duplication of services. MuleSoft, a proven on-premises, hybrid and cloud integration leader, provides a virtual agility layer, allowing new services on the PaaS to interact with legacy on-premise mainframes or SaaS environments in the cloud (Figure 1).

Each of the building blocks in Anypoint Platform delivers purposefully productized APIs, powerful Anypoint core and ubiquitous connectivity. Based on consistent and repeatable guiding principles, the Anypoint Platform delivers tools and services for runtime, design time, and engagement that enable successful delivery for each audience, whether internal or external. MuleSoft's Anypoint Platform is architecturally independent – it is agnostic in terms of private cloud, IaaS or PaaS solutions, whether custom-built or purchased from a third-party provider. Customers have the freedom and agility to abstract connectivity and integration from the underlying infrastructure, platform and application environments, maximizing efficiency and business value.

Part of simplifying your architecture and becoming more agile is having flexibility. MuleSoft's unique connectivity approach allows you to plan for the future. You may start with an established infrastructure provider and move to an emerging pure-play PaaS provider. You may build applications for on-premises deployment but later decide to host them in the cloud. Anypoint Platform has a single code base for on-premises, hybrid and cloud deployment, adapting to changing business and regulatory conditions. This single code base ensures integration and interoperability across the enterprise with transparent access to data, seamless monitoring and security, and the agility to respond to changing business needs.

Patricia Hines is the financial services industry marketing director at MuleSoft, a San Francisco-based company that makes it easy to connect applications, data and devices.

Web

The future of traffic management technology
Intelligent traffic management in the modern application ecosystem

As application architecture continues to undergo change, modern applications are now living in increas- ingly distributed and dynamic infrastructure. Meanwhile, DNS and traffic management markets are finally shifting to accommodate the changing reality.

by Kris Beevers

Internet-based applications are built markedly differently today than they were even just a few years ago. Application architectures are largely molded by the capabilities of the infrastructure and core services upon which the applications are built. In recent years we've seen tectonic shifts in the ways infrastructure is consumed, code is deployed and data is managed.

A decade ago, most online properties lived on physical infrastructure in co-location environments, with dedicated connectivity and big-iron database back ends, managed by swarms of down-in-the-muck systems administrators with arcane knowledge of config files, firewall rules and network topologies. Applications were deployed in monolithic models, usually in a single datacentre – load balancers fronting web heads backed by large SQL databases, maybe with a caching layer thrown in for good measure.

Since the early 2000s, we've seen a dramatic shift toward "cloudification" and infrastructure automation. This evolution has led to an increase in distributed application topologies, especially when combined with the explosion of database technologies that solve replication and consistency challenges, and configuration management tools that keep


track of dynamically evolving infrastructures. Today, most new applications are built to be deployed – at minimum – in more than one datacentre, for redundancy in disaster recovery scenarios. Increasingly, applications are deployed at the far-flung "edges" of the Internet to beat latency and provide lightning fast response times to users who've come to expect answers (or cat pictures) in milliseconds.

As applications become more distributed, the tools we use to get eyeballs to the "right place" and to provide the best service in a distributed environment have lagged behind. When an application is served from a single datacentre, the right service endpoint to select is obvious and there's no decision to be made, but the moment an application is in more than one datacentre, endpoint selection can have a dramatic impact on user experience.

Imagine someone in California interacting with an application served out of datacentres in New York and San Jose. If the user is told to connect to a server in New York, most times, they'll have a significantly worse experience with the application than if they'd connected to a server in San Jose. An additional 60–80 milliseconds in round trip time is tacked onto every request sent to New York, drastically decreasing the application's performance. Modern sites often have 60–70 assets embedded in a page and poor endpoint selection can impact the time to load every single one of them.

Solving endpoint selection
How have we solved endpoint selection problems in the past? The answer is, we haven't – at least, not very effectively. If you operate a large network and have access to deep pockets and a lot of low-level networking expertise, you might take advantage of IP anycasting, a technique for routing traffic to the same IP address across multiple datacentres. Anycasting has proven too costly and complex to be applied to most web applications.

Most of the time, endpoint selection is solved by DNS, the domain name system that translates hostnames to IP addresses. A handful of DNS providers support simple notions of endpoint selection for applications hosted in multiple datacentres. For example, the provider might ping your servers, and if a server stops responding, it is removed from the endpoint selection rotation. More interestingly, the provider may use a GeoIP database or other mechanism to take a guess at who's querying the domain and where they're located, and send the user to the geographically closest application endpoint. These two simple mechanisms form the basis of many large distributed infrastructures on the Internet today, including some of the largest content delivery networks (CDNs).

In today's modern Internet, applications live in increasingly distributed and dynamic infrastructure. The DNS and traffic management markets are finally shifting to accommodate these realities. Modern DNS and traffic management providers are beginning to incorporate real-time feedback from application infrastructures, network sensors, monitoring networks and other sources into endpoint selection decisions. While basic health checking and geographic routing remain tools of the trade, more complex and nuanced approaches for shifting traffic across datacentres are emerging. For example, some of the largest properties on the Internet, including major CDNs, are today making traffic management decisions based not only on whether a server is "up" or "down," but on how loaded it is, in order to utilize the datacentre to capacity, but not beyond.

Several traffic management providers have emerged that measure response times and other metrics between an application's end users and datacentres. These solutions leverage data in real time to route users to the application endpoint that's providing the best service, for the user's network, right now, ditching geographic routing altogether. Additional traffic management techniques, previously impossible in the context of DNS, are finding their way to market, such as endpoint stickiness, complex weighting and prioritizing of endpoints, ASN and IP prefix based endpoint selection and more.

The mechanisms and interfaces for managing DNS configuration are improving, as new tools mature for making traffic management decisions in the context of DNS queries. While legacy DNS providers restrict developers to a few proprietary DNS record types to enact simplistic traffic management behaviours, modern providers offer far more flexible toolkits, either enabling developers to write actual code to make endpoint selection decisions or offering flexible, easy-to-use rules engines to mix and match traffic routing algorithms into complex behaviours.

What's next for traffic management technology?
As with many industries, traffic management will be driven by data. Leading DNS and traffic management providers, such as NSONE, already leverage telemetry from application infrastructure and Internet sensors. The volume and granularity of this data will only increase, as will the sophistication of the algorithms that act on it to automate traffic management decisions.

DNS and traffic management providers have found additional uses for this data outside of making real-time endpoint selection decisions. DNS providers are already working with larger customers to leverage DNS and performance telemetry to identify opportunities for new datacentre deployments to maximize performance impact. DNS based traffic management will be an integral part of a larger application delivery puzzle that sees applications themselves shift dynamically across datacentres in response to traffic, congestion and other factors.

Applications and their underlying infrastructure have changed significantly in the last decade. Now, the tools and systems we rely on to get users to the applications are finally catching up.

Kris Beevers is an internet infrastructure geek and serial entrepreneur who's started two companies, built the tech for two others, and has a particular specialty in architecting high volume, globally distributed internet infrastructure. Before NSONE, Kris built CDN, cloud, bare metal, and other infrastructure products at Voxel, a NY based hosting company that sold to Internap (NASDAQ:INAP) in 2011.
https://www.crunchbase.com/person/kris-beevers#sthash.xmxgexW5.dpuf
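As a rough sketch of the endpoint selection logic described in this article – drop endpoints that fail health checks, then prefer the lowest measured round-trip time – the following Java fragment can be useful; the endpoint names and RTT figures are invented for illustration and do not come from any real provider:

```java
import java.util.*;

// Sketch of DNS-style endpoint selection: filter out endpoints failing
// health checks, then pick the one with the lowest measured RTT.
class Endpoint {
    final String name;
    final boolean healthy;   // result of the provider's health check
    final double rttMillis;  // measured RTT from the user's network

    Endpoint(String name, boolean healthy, double rttMillis) {
        this.name = name;
        this.healthy = healthy;
        this.rttMillis = rttMillis;
    }
}

public class EndpointSelector {
    // Return the healthy endpoint with the lowest RTT, if any exists.
    static Optional<Endpoint> select(List<Endpoint> endpoints) {
        return endpoints.stream()
                .filter(e -> e.healthy)
                .min(Comparator.comparingDouble(e -> e.rttMillis));
    }

    public static void main(String[] args) {
        List<Endpoint> endpoints = Arrays.asList(
                new Endpoint("nyc", true, 74.0),   // far from a Californian user
                new Endpoint("sjc", true, 12.0),   // close by
                new Endpoint("lon", false, 9.0));  // failing health checks
        System.out.println(select(endpoints).get().name); // prints "sjc"
    }
}
```

A real traffic manager would of course refresh health and RTT data continuously and add stickiness, weighting and prefix-based rules on top, as the article notes.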

Benchmarks

Cost, scope and focus
Trade-offs in benchmarking

Is it quality you're looking to improve? Or performance? Before you decide on what kind of a benchmark your system needs, you need to know the spectrum of cost and benefit trade-offs.

by Aysylu Greenberg

Benchmarking software is an important step in maturing a system. It is best to benchmark a system after correctness, usability, and reliability concerns have been addressed. In the typical lifetime of a system, emphasis is first placed on correctness of implementation, which is verified by unit, functional, and integration tests. Later, the emphasis is placed on the reliability and usability of the system, which is confirmed by the monitoring and alerting setup of a system running in production for an extended period of time. At this point, the system is fully functional, produces correct results, and has the necessary set of features to be useful to the end client. At this stage, benchmarking the software helps us to gain a better understanding of what improvement work is necessary to help the system gain a competitive edge.

There are two types of benchmarks one can create – performance and quality. Performance benchmarks generally measure latency and throughput. In other words, they answer the questions: "How fast can the system answer a query?", "How many queries per second can it handle?", and "How many concurrent queries can the system handle?" Quality benchmarks, on the other hand, address domain-specific concerns, and do not translate well from one system to another. For instance, on a news website, a quality benchmark could be the total number of clicks, comments, and shares on each article. In contrast, a different website may include not only those properties but also what the users clicked on. This might happen because the website's revenue is dependent on the number of referrals, rather than how engaging a particular article was.

Speaking of revenue, the goal of a benchmark is to guide optimizations in the system and to define the performance goal. A good benchmark should be able to answer the question "How fast is fast enough?" It allows the company to keep the users of the system happy and keep the infrastructure bills as low as possible, instead of wasting money on unneeded hardware.

There's a spectrum of cost and benefit trade-offs a benchmark designer should be aware of. Specialized benchmarks that utilize realistic workloads and model the production environment closely are expensive to set up. A common problem is that special infrastructure needs to exist to be able to duplicate the production workload. Aggregation and verification of results is also a very involved process, as it requires thorough analysis and application of moderately sophisticated statistical techniques. On the other hand, micro-benchmarks are quick and easy to set up, but they often produce misleading results, since they might not be measuring a representative workload or set of functionality.

To get started with designing a benchmark, it is helpful to pose a question for the system, e.g. "How fast does the page load for the user when they click to see contents of their cart?" By pairing that with the goal of the benchmark, e.g. "How fast does the page need to load for a pleasant user experience?", this gives the team guidance for their optimization work and helps to determine when a milestone is reached.

Benchmarking is both an engineering and a business problem. Clearly defining the question and the goal for the benchmark helps utilize compute and engineer hours effectively. When designing a benchmark, it's important to consider how much "bang for the buck" the system will receive from the benchmarking work. Benchmarks with wide coverage of the system's functionality and thorough analysis of the results are expensive to design and set up, but also provide more confidence in the behaviour of the system. On the other hand, smaller benchmarks might answer narrow questions very well and help get the system closer to the goal much faster.

Benchmarking: You're Doing It Wrong
Hear Aysylu Greenberg speak at the JAX London: Knowledge of how to set up good benchmarks is invaluable in understanding performance of the system. Writing correct and useful benchmarks is hard, and verification of the results is difficult and prone to errors. When done right, benchmarks guide teams to improve the performance of their systems. In this talk, we will discuss what you need to know to write better benchmarks.

Aysylu Greenberg works at Google on a distributed build system. In her spare time, she works on open source projects in Clojure, ponders the design of systems that deal with inaccuracies, paints and sculpts.
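The micro-benchmark end of that spectrum can be made concrete with a deliberately naive sketch: time one operation in a loop and report throughput. The workload and iteration counts below are invented, and a trustworthy benchmark would also need warm-up control, repeated runs and statistical analysis of the results – exactly the missing pieces that make micro-benchmarks easy to get wrong:

```java
// Deliberately simple micro-benchmark sketch: measure how many times a
// single operation runs per second. Cheap to set up, but note what is
// missing: warm-up policy, multiple runs, percentiles, representative load.
public class MicroBench {
    // Stand-in for the "system under test": any operation you want to time.
    static long workload(long n) {
        long sum = 0;
        for (long i = 0; i < n; i++) sum += i;
        return sum;
    }

    // Returns operations per second over a fixed number of iterations.
    static double opsPerSecond(int iterations) {
        long start = System.nanoTime();
        long sink = 0;
        for (int i = 0; i < iterations; i++) sink += workload(10_000);
        long elapsed = System.nanoTime() - start;
        if (sink == 42) System.out.println(sink); // keep the JIT from eliding the loop
        return iterations / (elapsed / 1e9);
    }

    public static void main(String[] args) {
        // A couple of throwaway rounds before the measured one.
        opsPerSecond(1_000);
        opsPerSecond(1_000);
        System.out.printf("~%.0f ops/sec%n", opsPerSecond(10_000));
    }
}
```

For anything beyond a sketch like this, a purpose-built harness such as JMH handles the warm-up, dead-code elimination and aggregation concerns described above.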

Security

Five tips to stay secure
Common threats to your VoIP system

VoIP remains a popular system for telephone communication in the enterprise. But have you ever considered the security holes this system is leaving you open to? And what company secrets are at risk of eavesdropping, denial of service and "Vishing" attacks?

by Sheldon Smith

Using a VoIP system to handle calls for your company? You're not alone. In 2014, the worldwide VoIP services market reached almost $70 billion and is on pace for another banner year in 2015. Despite the usability, flexibility and cost effectiveness of VoIP systems, companies need to be aware of several common threats that could dramatically increase costs or put company secrets at risk. Here are five of the most common VoIP threats and how your company can stay secure.

I – Transmission issues
Unlike plain old telephone service (POTS), VoIP systems rely on packet-switched telephony to send and receive messages. Instead of creating a dedicated channel between two endpoints for the duration of a call using copper wires and analog voice information, call data is transmitted using thousands of individual packets. By utilizing packets, it's possible to quickly send and receive voice data over an internet connection, and VoIP technologies are designed in such a way that packets are re-ordered at their destination so calls aren't out of sync or jittery.
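That re-ordering can be pictured with a toy jitter buffer that sorts arriving packets by sequence number before playback; the sequence numbers and payloads below are invented stand-ins, not real RTP traffic:

```java
import java.util.*;

// Toy illustration of the re-ordering VoIP stacks perform: packets arrive
// over the network in arbitrary order, and a jitter buffer sorts them by
// sequence number before handing them to playback.
public class JitterBuffer {
    static List<String> playbackOrder(Map<Integer, String> arrived) {
        // A TreeMap keeps packets sorted by their RTP-style sequence number.
        return new ArrayList<>(new TreeMap<>(arrived).values());
    }

    public static void main(String[] args) {
        Map<Integer, String> arrived = new LinkedHashMap<>();
        arrived.put(3, "!");    // arrived first, belongs last
        arrived.put(1, "He");
        arrived.put(2, "llo");
        System.out.println(String.join("", playbackOrder(arrived))); // prints "Hello!"
    }
}
```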


What's the risk? The transmission medium itself. POTS lines are inherently secure since a single, dedicated connection is the only point of contact between two telephones. But when voice data is transmitted over the internet at large, it becomes possible for malicious actors to sniff out traffic and either listen in on conversations or steal key pieces of data. The solution? Encrypt your data before it ever leaves local servers. You've got two choices here: set up your own encryption protocols in-house, or opt for a VoIP vendor that bundles a virtual private network (VPN), which effectively creates a secure "tunnel" between your employees and whoever they call.

II – Denial of service
The next security risk inherent to VoIP? Attacks intended to slow down or shut down your voice network for a period of time. As noted by a SANS Institute whitepaper, malicious attacks on VoIP systems can happen in a number of ways. First, your network may be targeted by a denial of service (DoS) flood, which overwhelms the system. Hackers may also choose buffer overflow attacks or infect the system with worms and viruses in an attempt to cause damage or prevent your VoIP service from being accessed. As noted by a recent CBR article, VoIP attacks are rapidly becoming a popular avenue for malicious actors – UK-based Nettitude said that within minutes of bringing a new VoIP server online, attack volumes increased dramatically.

Dealing with these threats means undertaking a security audit of your network before adding VoIP. Look for insecure endpoints, third-party applications and physical devices that may serve as jumping-off points for attackers to find their way into your system. This is also a good time to assess legacy apps and older hardware to determine if they're able to handle the security requirements of internet-based telephony. It's also worth taking a hard look at any network protection protocols and firewalls to determine if changes must be made. Best bet? Find an experienced VoIP provider who can help you assess existing security protocols.

III – Eavesdropping
Another issue for VoIP systems is eavesdropping. If your traffic is sent unencrypted, for example, it's possible for motivated attackers to "listen in" on any call made. The same goes for former employees who haven't been properly removed from the VoIP system or had their login privileges revoked. Eavesdropping allows malicious actors to steal classified information including phone numbers, account PINs and users' personal data. Impersonation is also possible – hackers can leverage your VoIP system to make calls and pose as a member of your company. Worst case scenario? Customers and partners are tricked into handing over confidential information.

Handling this security threat means developing policies and procedures that speak to the nature of the problem. IT departments must regularly review who has access to the VoIP system and how far this access extends. In addition, it's critical to log and review all incoming and outgoing calls.

IV – Vishing
According to the Government of Canada's "Get Cyber Safe" website, another emerging VoIP threat is voice phishing or "vishing". This occurs when malicious actors redirect legitimate calls to or from your VoIP network and instead connect them to online predators. From the perspective of an employee or customer the call seems legitimate and they may be convinced to provide credit card or other information. Spam over Internet Telephony (SPIT) is also a growing problem; here, hackers use your network to send thousands of voice messages to unsuspecting phone numbers, damaging your reputation and consuming your VoIP transmission capacity. To manage this issue, consider installing a separate, dedicated internet connection for your VoIP alone, allowing you to easily monitor traffic apart from other internet sources.

V – Call fraud
The last VoIP risk is call fraud, also called toll fraud. This occurs when hackers leverage your network to make large-volume and lengthy calls to long-distance or "premium" numbers, resulting in massive costs to your company. In cases of toll fraud, calls are placed to revenue-generating numbers – such as international toll numbers – which generate income for attackers and leave you with the bill. Call monitoring forms part of the solution here, but it's also critical to develop a plan that sees your VoIP network regularly patched with the latest security updates. Either create a recurring patch schedule or find a VoIP provider that automatically updates your network when new security updates become available.

VoIP systems remain popular thanks to their ease-of-use, agility and global reach. They're not immune to security issues – but awareness of common threats coupled with proactive IT efforts helps you stay safely connected.

Sheldon Smith is a Senior Product Manager at XO Communications. XO provides unified communications and cloud services. XO's solutions help companies become more efficient, agile, and secure. Sheldon has extensive product management and unified communications experience.
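The call-log review recommended for eavesdropping and toll fraud can start as simply as flagging records that match known fraud patterns. The premium-rate prefixes and duration threshold below are invented placeholders, not real dialling plans; a real deployment would use its carrier's numbering data:

```java
import java.util.*;

// Sketch of the "log and review all calls" advice: flag call records that
// match simple toll-fraud heuristics (premium-rate prefixes, unusual length).
public class CallLogReview {
    // Placeholder prefixes standing in for premium-rate number ranges.
    static final Set<String> PREMIUM_PREFIXES =
            new HashSet<>(Arrays.asList("+900", "+881"));
    static final int MAX_MINUTES = 120; // unusually long calls get flagged

    static boolean suspicious(String dialledNumber, int durationMinutes) {
        boolean premium = PREMIUM_PREFIXES.stream()
                .anyMatch(dialledNumber::startsWith);
        return premium || durationMinutes > MAX_MINUTES;
    }

    public static void main(String[] args) {
        System.out.println(suspicious("+900555123", 3));    // prints "true": premium prefix
        System.out.println(suspicious("+14155550100", 45)); // prints "false": ordinary call
    }
}
```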

REST

No more custom API mazes
Why reusable REST APIs are changing the game

REST APIs make our lives easier – but we're still in the dark ages when it comes to making our APIs general purpose, portable and reusable. DreamFactory evangelist Ben Busse describes some common pitfalls of hand-coding custom REST APIs and explores the architectural advantages and technical characteristics of reusable REST APIs.

by Ben Busse

Where I work at DreamFactory, we designed and built some of the very first applications that used web services on Salesforce.com, AWS and Azure. Over the course of ten years, we learned many painful lessons trying to create the perfect RESTful backend for our portfolio of enterprise applications.

When a company decides to start a new application project, the "business" team first defines the business requirements and then a development team builds the actual software. Usually there is a client-side team that designs the application and a server-side team that builds the backend infrastructure. These two teams must work together to develop a REST API that connects the backend data sources to the client application. One of the most laborious aspects of the development process is the "interface negotiation" that occurs between these two teams (Figure 1). Project scope and functional requirements often change throughout the project, affecting API and integration requirements. The required collaboration is complex and encumbers the project.

Figure 1: Interface negotiation

Dungeon master development: Complex mazes of custom, handcrafted APIs
You can get away with slow, tedious interface negotiation if you're just building one simple application. But what if you need to ship dozens, hundreds or even thousands of API-driven applications for employees, partners and customers? Each application requires a backend, APIs, user management and security, and you're on a deadline.

Building one-off APIs and a custom backend for each and every new application is untenable. Mobile is forcing companies to confront this reality (or ignore it at their own peril). With the acceptance of BYOD ("bring your own device") and the proliferation of mobile devices, the modern enterprise may need hundreds or even thousands of mobile applications. Backend integration, custom API development, backend security and testing comprise the lion's share of a typical enterprise mobile application project (more than half of the time on average).

Most enterprises today are woefully unable to address API complexity at its root cause. Mobile projects typically have new requirements that were not anticipated by the existing REST APIs that are now in production. You could expand the scope of your existing API services, but they are already in production. So the default option is to create a new REST API for each new project! The API building process continues for each

www.JAXenter.com | August 2015 34 REST

new app with various developers, consultants and contractors. The result is custom, one-off APIs that are highly fragmented, fragile, hard to centrally manage and often insecure. The API dungeon is an ugly maze of complexity (Figure 2):

• Custom, manually coded REST APIs for every new application project, written with different tools and developer frameworks.
• REST APIs are hardwired to different databases and file storage systems.
• REST APIs run on different servers or in the cloud.
• REST APIs have different security mechanisms, credential strategies, user management systems and API parameter names.
• Data access rights are confused, user management is complex and application deployment is cumbersome.
• The system is difficult to manage, impossible to scale and full of security holes.
• API documentation is often non-existent. Often, companies can't define what all the services do, or even where all of the endpoints are located.

Figure 2: The API dungeon

The future: reusable REST APIs
The core mistake with the API dungeon is that development activity starts with business requirements and application design, and then works its way back to server-side data sources and software development. This is the wrong direction.

The best approach is to identify the data sources that need to be API-enabled and then create a comprehensive and reusable REST API platform that supports general-purpose application development (Figure 3). There are huge benefits to adopting a reusable REST API strategy:

• APIs and documentation are programmatically generated and ready to use.
• There's no need to keep building server-side software for each new application project.
• Client-side application design is decoupled from security and administration.
• The "interface negotiation" is simplified.
• Development expenses and time to market are dramatically reduced.
• Developers don't have to learn a different API for each project.
• RESTful services are no longer tied to specific pieces of infrastructure.
• Companies can easily move applications between servers and from development to test to production.

Figure 3: Reusable REST APIs

Technical characteristics of a reusable API
This sounds good in theory, but what are the actual technical characteristics of reusable REST APIs? And how should reusable APIs be implemented in practice? The reality is that there's no obvious way to arrive at this development pattern until you've tried many times the wrong way, at which point it's usually too late.

DreamFactory tackled the API complexity challenge for over a decade, built a reusable API platform internally for our own projects and open sourced the platform for any developer to use. We had to start from scratch many times before hitting on the right design pattern that enables our developers to build applications out of general-purpose interfaces.

There are some basic characteristics that any reusable REST API should have:

• REST API endpoints should be simple and provide parameters to support a wide range of use cases.
• REST API endpoints should be consistently structured for SQL, NoSQL and file stores.
• REST APIs must be designed for high transaction volume, hence simply designed.
• REST APIs should be client-agnostic and work interchangeably well for native mobile, HTML5 mobile and web applications.

A reusable API should have the attributes below to support a wide range of client access patterns:


Figure 4: SQL API and subsets

• Noun-based endpoints and HTTP verbs are highly effective. Noun-based endpoints should be programmatically generated based on the database schema.
• Requests and responses should include JSON or XML with objects, arrays and sub-arrays.
• All HTTP verbs (GET, PUT, DELETE, etc.) need to be implemented for every use case.
• Support for web standards like OAuth, CORS, GZIP and SSL is also important.

It's crucially important to have a consistent URL structure for accessing any backend data source. The File Storage API should be a subset of the NoSQL API, which should be a subset of the SQL API (Figure 4). Parameter names should be reused across services where possible. This presents developers with a familiar interface for any data source. The API should include automatically generated, live interactive documentation that allows developers to quickly experiment with different parameters (Figure 5).

In general, the structure of the request URL and associated parameters needs to be very flexible and easy to use, but also comprehensive in scope. Looking at the example below, there is a base server, an API version, the backend database (the API name) and a particular table name in the request URL string. Then the parameters specify a filter with a field name, operator and value. Lastly, an additional order parameter sorts the returned JSON data array (Figure 6).

Figure 6: Request URL

A huge number of application development scenarios can be implemented just with the filter parameter. This allows any subset of data to be identified and operated on. For example, objects in a particular date range could be loaded into a calendar interface with a filter string (Figure 7). Complex logical operations should also be supported and the filter string interface needs to protect against SQL injection attacks. Other database-specific features include:

• Pagination and sorting
• Rollback and commit
• Role-based access controls on tables
• Role-based access controls on records
• Stored functions and procedures
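To make the URL anatomy described above concrete, here is a small Python sketch that assembles such a request URL from a base server, API version, service (API name), table name, and the filter and order parameters. The path layout and server name are illustrative assumptions for this example, not DreamFactory's actual URL scheme.

```python
from urllib.parse import urlencode

def build_table_url(base, version, service, table, filter_expr=None, order=None):
    """Assemble a REST request URL: base server, API version, service (API
    name) and table in the path; filter and order as query parameters."""
    url = f"{base}/api/{version}/{service}/{table}"  # hypothetical path layout
    params = {}
    if filter_expr:
        params["filter"] = filter_expr  # field name, operator and value
    if order:
        params["order"] = order         # field to sort the returned array by
    return url + ("?" + urlencode(params) if params else "")

# Find Task records in a date range, sorted by due date:
url = build_table_url("https://example.com", "v1", "db", "task",
                      filter_expr="due_date >= '2015-08-01'", order="due_date")
```

Note that `urlencode` percent-encodes the operator and quoted value, so the filter string travels safely as a single query parameter; the server side still has to parse and validate it to guard against SQL injection.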

Figure 5: Interactive, auto-generated API docs

A comprehensive reusable REST API should also support operations on arrays of objects, but you can also specify related objects as a URL parameter. This allows complex documents to be downloaded from a SQL database and used immediately as a JSON object. The data can be edited along with the objects (Figure 8). When committed back to the server, all of the changes are updated including parent, child and junction relationships between multiple tables. This flexibility supports a huge number of very efficient data access patterns.

The vast majority of application development use cases can be supported with a reusable REST API right out of the box. For special cases, a server-side scripting capability can be used to customize behavior at any API endpoint

Figure 7: Find Task records

(both request and response) or create brand new custom API calls. DreamFactory uses the V8 JavaScript engine for this purpose. Some of the special cases that you might want to implement with server-side scripting include:

• Custom business logic
• Workflow triggers
• Formula fields
• Field validation
• Web service orchestration
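DreamFactory runs such scripts on the V8 JavaScript engine; purely to illustrate the kind of logic a pre-write hook might hold (field validation plus a workflow trigger), here is a sketch in Python. The hook name, signature and validation rules are invented for this example and are not DreamFactory's API.

```python
def validate_task(record):
    """Field validation: collect violations of simple business rules."""
    errors = []
    if not record.get("name"):
        errors.append("name is required")
    if record.get("priority") not in (None, "low", "medium", "high"):
        errors.append("priority must be low, medium or high")
    return errors

def pre_write_hook(record, notify):
    """Hypothetical hook that runs before a record is written: reject bad
    input, then fire a workflow trigger (modelled here as a callback)."""
    errors = validate_task(record)
    if errors:
        raise ValueError("; ".join(errors))
    notify(f"task '{record['name']}' is about to be saved")
    return record

events = []
pre_write_hook({"name": "Ship API docs", "priority": "high"}, events.append)
```

The point of the pattern is that the endpoint itself stays generic; only the hook carries project-specific behavior, so the underlying API remains reusable.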

Conclusion
REST API complexity is an important problem for companies building API-driven applications. The tendency to build new APIs for each new project has negative consequences over time. Adopting a REST API platform strategy with reusable and general-purpose services addresses this problem and provides many benefits in terms of more agile development and quicker time to value.

Figure 8: Loading a project and all related tasks

Ben Busse is a developer evangelist with DreamFactory in San Francisco. He's passionate about open source software, mobile development and hunting wild mushrooms in northern California.

Figure 9: Field validation and workflow trigger


Milliseconds matter
Considering the performance factor in an API-driven world

With visitors demanding immediate response times, the fate of a website and the performance of APIs are becoming increasingly intertwined.

by Per Buer

In recent years, web APIs have exploded. Various tech industry watchers now see them as providing the impetus for a whole "API economy". As a result, and in order to create a fast track for business growth, more and more companies and organizations are opening up their platforms to third parties. While this can create a lot of opportunities, it can also have huge consequences and pose risks. These risks don't have to be unforeseen, however.

Companies' checklists for building or selecting API management tools can be very long. Most include the need to offer security (both communication security, i.e. TLS, and actual API security, i.e. keys), auditing, logging, monitoring, throttling, metering and caching. However, many overlook one critical factor: performance. This is where you can hedge your bets and plan for the potential risk.

Preparing your API Strategy for IoT
Hear Per Buer speak at the JAX London: Not that long ago, API calls were counted per hour. Evaluations for API management tools typically have long lists of criteria, but performance is usually left off. That might be fine in certain environments but not where IoT and mobile are concerned. For these environments the number of API calls has increased to the point that even the typical rate of 200 API calls per second is no longer enough.

There's an interesting analogy between APIs and the long path websites have travelled since the nineties. Back then, websites had few objects and not that many visitors so performance and scalability mattered less. This has changed


dramatically over the last decade. Today, increasingly impatient visitors penalise slow websites by leaving quickly, and in many cases never returning. Microsoft computer scientist Harry Shum says that sites that open just 250 milliseconds faster than competing sites – a fraction of an eye blink – will gain an advantage.

APIs have travelled a similar path. Ten to fifteen years ago most API management tools out there had very little to do and performance wasn't an issue. The number of API calls handled was often measured in calls per hour. Consequently, these tools were designed to deal with things other than thousands of API calls per second. What a difference a decade can make! According to Statista, worldwide mobile app downloads are expected to reach 268.69 billion by 2017. But API management tools haven't caught up. Even nowadays many of the products in the various vendors' top right quadrant will only handle rates of 200 API calls per second per server. Their focus has been on features, not performance.

"Back then, websites had few objects and not that many visitors so performance and scalability mattered less."

If you open up your API platform, you probably want a lot of developers to use it. However, most web services have introduced a rate limit for API calls. If set high enough, the limit is reasonable to ensure availability and quality of service. But what is high enough to provide a competitive advantage in our accelerated times? Take for example an industry like banking, where many players are opening up their platforms in a competitive bid to attract developers who create third-party apps and help monetise the data. The ones that set the API call limit too low create a bad developer experience, pushing developers towards friendlier environments.

A limited number of API calls in web services also affects the end-customer. Take for example online travel operators or online media. In these environments a lot of data needs to flow through the APIs. These services are becoming more dependent on fast and smooth communication between their backends and their various apps. If these services slow down due to API call limitations, customers will defect to faster sites.

I compared the situation of APIs with that of the web ten years ago, when performance started to matter. The situation that actually developed is much more serious than I initially predicted. Consumers increasingly demand instant gratification. This means that the window for companies to ensure the performance of their APIs is closing. Being able to deliver performance and set a higher limit of API calls can make a huge difference. Otherwise, developers will go elsewhere to help grow another company's business. If you want to future-proof for the API boom, it's time to consider the performance factor.

Per Buer is the CTO and founder of Varnish Software, the company behind the open source project Varnish Cache. Buer is a former programmer turned sysadmin, then manager turned entrepreneur. He runs, cross-country skis and tries to keep his two boys from tearing down the house.
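The per-second rate limits discussed in this article are commonly enforced with a token bucket. As a rough, vendor-neutral sketch (the class and numbers are illustrative, not any product's implementation), this shows how a 200-calls-per-second budget turns away traffic beyond it:

```python
class TokenBucket:
    """Minimal token-bucket rate limiter: holds up to `capacity` tokens,
    refilled at `rate` tokens per second; each allowed call spends one."""
    def __init__(self, rate, capacity, now=0.0):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.last = now

    def allow(self, now):
        # Refill in proportion to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(rate=200, capacity=200)
# A burst of 300 calls arriving in the same instant: only 200 fit the budget,
# the remaining 100 are rejected until the bucket refills.
allowed = sum(bucket.allow(now=0.0) for _ in range(300))
```

A limiter like this makes the trade-off in the article tangible: raising `rate` is what buys a better developer experience, and the gateway's job is to sustain that rate without itself becoming the bottleneck.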

Imprint

Publisher: Software & Support Media GmbH

Editorial Office Address:
Software & Support Media
Saarbrücker Straße 36
10405 Berlin, Germany
www.jaxenter.com

Editor in Chief: Sebastian Meyen
Editors: Coman Hamilton, Natali Vlatko
Authors: Kris Beevers, Per Buer, Ben Busse, Holly Cummins, Aysylu Greenberg, Patricia Hines, Eric Horesnyi, Werner Keil, Angelika Langer, Aviran Mordo, Chris Neumann, Lyndsay Prewer, Zigmars Raascevskis, Sheldon Smith, Colin Vipurs, Geertjan Wielenga
Copy Editor: Jennifer Diener
Creative Director: Jens Mainz
Layout: Flora Feher, Dominique Kalbassi
Sales Clerk: Anika Stock, +49 (0) 69 630089-22, [email protected]

Entire contents copyright © 2015 Software & Support Media GmbH. All rights reserved. No part of this publication may be reproduced, redistributed, posted online, or reused by any means in any form, including print, electronic, photocopy, internal network, Web or any other method, without prior written permission of Software & Support Media GmbH.

The views expressed are solely those of the authors and do not reflect the views or position of their firm, any of their clients, or Publisher. Regarding the information, Publisher disclaims all warranties as to the accuracy, completeness, or adequacy of any information, and is not responsible for any errors, omissions, inadequacies, misuse, or the consequences of using any information provided by Publisher. Rights of disposal of rewarded articles belong to Publisher. All mentioned trademarks and service marks are copyrighted by their respective owners.
