Issue April 2015 | presented by www.jaxenter.com #44

The digital magazine for enterprise developers Polyglots do it better

Speak ALL the languages! Shutterstock’s polyglot enterprise and Komodo’s polyglot IDE

RethinkDB and HBase A closer look at two database alternatives

Top performance fails Five challenges in performance management ©iStockphoto.com/epic11 Editorial

E pluribus unum

It’s an old and rather out-of-fashion motto of the United to the success of the company’s technology, and how out of States. E pluribus unum, one out of many. Early twentieth- one language became many specialist development areas, in century ideals of the cultural melting pot may have failed in turn all unifi ed by the enterprise. Meanwhile for web develop- western society. But they may still work in IT. Developers will ers looking for one tool to rule all their front-end languages, forever dream of one Holy Grail language that will rule them we’ve got a helpful guide to the polyglot IDE Komodo, which all. As nice as it sounds to be a full-stack developer, know- just unveiled its ninth release. ing your entire enterprise’s technology inside out, from the We’ve also got some useful introductions to HBase and ListView failure modes to the various components’ linguistic RethinkDB (both big salary-earners according the recent syntaxes – that’s a near superhuman talent. As much as all de- Dice. com survey), as well as web applications. And velopers would love to be fl uently multilingual, in practice it’s fi nally for anyone concerned with the speed of their website diffi cult to keep up. But instead it’s the unity of the enterprise during high traffi c, we have a couple of valuable lessons about itself that can create one whole out of many languages, not how to avoid performance failures. the individual developer. In this issue, Chris Becker explains how Shutterstock’s Coman Hamilton, gradual evolution from one to many languages was central Editor

Polyglot enterprises do it better 5 Shutterstock’s multilingual stack

Index Chris Becker An introduction to polyglot IDE Komodo 7 One IDE to rule all languages Nathan Rijksen HBase, the “Hadoop database” 12 A look under the hood Ghislain Mazars An introduction to building realtime apps with RethinkDB 16 First steps Ryan Paul Performance fails and how to avoid them 20 The fi ve biggest challenges in application performance management Klaus Enzenhofer Separating UI structure and logic 22 Architectural differences in Vaadin web applications Matti Tahvonen Model View ViewModel with JavaFX 24 A look at mvvmFX Alexander Casall

www.JAXenter.com | April 2015 2 featuring

October 12 – 14th, 2015 Business Design Centre, London MARK YOUR CALENDAR!

www.jaxlondon.com

Presented by Organized by Hot or Not

HTML6 If you’re a web developer that doesn’t check the news that often, make sure you’re sitting before reading this. The web community is furiously debating a radical new proposal for HTML6. The general idea is that the next version of HTML will be developed in a way that allows it to dynamically run single-page apps without ­ Script. Yes, a HTML that wants to compete with JavaScript. The community is torn, but proposal author Bobby Mozumder makes an interesting case, claiming that HTML needs to follow the “standard design pattern emerging via all the front-end JavaScript frameworks where content is loaded dynamically via JSON APIs.” He’s won over quite a few members of the community, but there’s no telling if the W3C will ever lower its eyebrows to this request.

Cassandra salaries It’s often considered impolite to ask another how much they earn. But that doesn’t mean colleagues, recruiters and employers aren’t picturing an estimated annual salary hovering over your head while they pretend to listen to you. It turns out that right now, Cassandra, PaaS and MapReduce pros are the ones with the biggest dollar signs above their heads (according to the latest Dice.com survey). Anyone lucky enough to be an expert in this area (and living in the US) will be making an average of at least $127k per year. America’s Java developers will be lucky to scrape anything above $100k, while poor JavaScript experts earn as little as $94k a year. We’ like to invite you to join us in performing the world’s smallest violin concerto for the JS com- munity – because seriously that’s still a mighty shedload of cash to the kinds of people that write the texts you’re reading [long sigh].

How banks treat Publishing a book technology in IT In a recent series of interviews on JAX- At some stage in their career, most enter, developers in finance have told developers will ponder the idea of us about the pros and cons of develop- publishing a book. And the appeal is ing for banks. And while the salary understandable. It’s a great milestone is very definitely a major pro (banks on your CV, developers in the pay 33 percent to 50 percent more, community may look up to you and says HFT developer Peter Lawrey), the you’ll be sure to make your parents management-level attitude to technol- proud (even if they don’t have a clue ogy is less than favourable. “Many what you’re writing about). But financial companies see their develop- don’t expect to make big money (like ment units as a necessary evil,” says those Cassandra developers, above) former NYSE programmer Dr. Jamie when you self-publish on Leanpub. Allsop. Not only does technology come Meanwhile you’ll need to be dedicating across as being second-class, but it’s much of your spare time to promoting often impossible to drive innovation the book. Before your eyes go lighting when working on oldschool banking up with dollar signs at the thought systems. “If it’s new then it must be of becoming a wealthy international risky,” dictates the typical financial ops superstar IT author, give yourself a attitude according to finance techno­ quick reality check with a couple of logy consultant Mashooq Badar. Google searches on IT publishing.

www.JAXenter.com | April 2015 4 Languages iStockphoto.com/jamtoons iStockphoto.com/jamtoons ©

Shutterstock’s multilingual stack Polyglot enterprises do it better

Being a multilingual, multicultural company doesn’t just bring benefits on a level of corporate culture. Shutterstock search engineer Chris Becker explains why enterprises need to stop speaking just one language.

by Chris Becker But in recent years, Shutterstock has grown more “multilin- gual.” Today, we have services written in Node.js, Ruby, and A technology company must consider the technology stack Java; data processing tools written in Python; a few of our and programming language that it’s built on. Everything else sites written in PHP; and apps written in Objective-C. flows from those decisions. It will define the kinds of engi- Developers specialize in each language, but we communi- neers who are hired and how they will fare. cate across different languages on a regular basis to debug, During Shutterstock’s early days, a team of develop- write new features, or build new apps and services. Getting ers built the framework for the site. They chose Perl for the from there to here was no easy task. Here are a few strate- benefits of CPAN and flexibility that language would bring. gic decisions and technology choices that have facilitated our However, it opened up a new problem – we could only hire evolution: those familiar with Perl. We hired some excellent and skilled engineers, but many others were left behind because of their Service-oriented Architectures inexperience with and lack of exposure to Perl. In a way, it First, we built out all our core functionality into services. limited our ability to grow as a company. Each service could be written in any language while provid-

www.JAXenter.com | April 2015 5 Languages

Developer Meetups We also want our developers to learn and evolve their skill- “The biggest obstacle blocking sets. We host internal meetups for Shutterstock’s Node de- velopers, PHP Developers, and Ruby developers where they new developers is often the can get fresh looks and feedback on code in progress. These meetups are a great way for them to continue professional technical bureaucracy needed development and also to meet others who are tackling similar projects. There’s no technology to replace face-to-face com- to manage each runtime.” munication, and what great ideas and methods can come from it.

ing a language-agnostic interface through REST frameworks. Openness It has allowed us to write separate pieces of functionality in We post all the code for every Shutterstock site and service the language most suited to it. For example, search makes use on our internal GitHub. Everyone can find what we’ve been of Lucene and Solr, so Java made sense there. For our trans- working on. If you have an idea for a feature, you can fork lation services, Unicode support is highly important, so Perl off a branch and submit a pull request to the shepherd of was the strongest option. that service. This openness goes beyond transparency; it en- courages people to try new things and to see what others are Common Frameworks implementing. Between languages there are numerous frameworks and Our strategies and tools our mission. We want engi- standards that have been inspired or replicated by one anoth- neers to think with a language-agnostic approach to better er. When possible, we try to use one of those common tech- work across multiple languages. It helps us to dream bigger nologies in our services. All of our services provide RESTful and not get sidelined by limitations. As the world of program- interfaces, and internally we use Sinatra-inspired frameworks ming languages becomes much more fragmented, it’s becom- for implementing them ( for Perl, Slim for PHP, Ex- ing more important than ever from a business perspective to press for Node etc.). For templating we use -inspired develop multilingual-friendly approaches. frameworks such as Template::Swig for Perl, Twig for PHP, We’ve come a long way since our early days, but there’s still and Liquid for Ruby. By using these frameworks we can help a lot more we can do. We’re always reassessing our process improve the learning curve when a developer jumps between and culture to make both the code and system more familiar languages. to those who want to come work with us.

Runtime Management This has been adapted from a post that originally ran on Often, the biggest obstacle blocking new developers is tech- the Shutterstock Tech blog. You can read the original here: nical bureaucracy needed to manage each runtime – man- bits.shutterstock.com/2014/07/21/stop-using-one-language/. aging library dependencies, environment paths, and all the command line settings and flags needed to do common tasks. Shutterstock simplifies all of that with Rockstack. Rockstack provides a standardized interface for building, running, and testing code in any of its supported runtimes (currently: Perl, PHP, Python, Ruby, and Java). Not only does Rockstack give our developers a standard interface for building, testing, and running code, but it also supplies our build and deployment system with one stand- ard set of commands for running those operations as well for any language. Rockstack is used by our Jenkins cluster for running builds and tests, and our home-grown deployment system makes use of it for launching applications in dev, QA, and production.

Testing Frameworks In order to create a standardized method for testing all the services we have running, we developed (and open sourced!) NTF (Network Testing Framework). NTF lets us write tests that hit special resources on our services’ APIs to provide sta- tus information that show the service is running in proper Chris Becker is the Principal Engineer of Search at Shutterstock where form. NTF supplements our collection of unit and integration he’s worked on numerous areas of the search stack including the search tests by constantly running in production and telling us if any platform, Solr, relevance algorithms, data processing, analytics, interna- functionality has been impaired in any of our services. tionalization, and customer experience.

www.JAXenter.com | April 2015 6 Languages

One IDE to rule all languages An introduction to polyglot IDE Komodo

The developer world is becoming “decidedly more polyglot”, Komodo CTO recently told JAXenter. To cater to this changing community, the multi-lingual IDE Komodo is steadily increasing its number of supported languages. Let’s take a closer look at what it does best and how to get started.

by Nathan Rijksen Type in your search query, and Commando will show you results in real-time. Select an entry and press “Enter” to open/ A few weeks ago, Komodo IDE 9 was released, featuring a activate it, or press tab/right arrow to “expand” the selection, host of improvements and features. Many of you have since allowing you to perform contextual actions on it. For exam- downloaded the 21-day trial and have likely spent some time ple, you could rename a file right from Commando (Figure 3) learning more about what this nimble beast can do. I thought using nothing but your keyboard. Commando doesn’t need now would be a good time to run you through some sim- to be accessed from the toolbar – you can use a shortcut to ple workflows and give you a general idea of how Komodo launch it so you can also be 100 percent keyboard driven works. This is just a short introduction, and much more in- (CTRL + Shift + O on Windows and , or CMD + Shift formation can be found on the Komodo website under screen- + O in Mac). casts, forums and of course the documentation.

User Interface Quick Start Figure 1 shows what you’ll see when you first launch Komodo. The first few icons on the toolbar are pretty selfexplanatory, ­ and the ones that aren’t immediately self-explanatory are ­easily discovered by hovering your mouse over them. You’ll find quick access to a lot of great Komodo features here: de- bugging, regex testing, source code control, macro recording/ playback, etc. Of course these aren’t all displayed by default; you would need a big screen to show all that. You can easily toggle individual buttons or button groups by right clicking the toolbar and selecting “Customize”. Three buttons you’ll likely be using a lot are the pane toggles (Figure 2). Clicking these allows you to toggle a bottom, left and right pane, each of which holds a variety of “widgets” that allow Figure 1: Komodo start screen you to do anything from managing your project files to unit testing your project. You can customize the layout of these widgets by right-clicking their icon in the panel tab bar. At the far right of the toolbar you’ll find a search field, dubbed Commando, which lets you easily search through your project files, tools, bookmarks, etc. Figure 2: Pane buttons

www.JAXenter.com | April 2015 7 Languages

Figure 3: Commando Figure 4: The “burger menu”

Starting a Project project folder that can hold your project specific tools. You The first thing you’ll want to do is start a new project. You can separate your project from your source code by creating don’t need to use projects, but it’s highly encouraged as it your project in the preferred location and afterwards modify- gives Komodo some context: “This is what I’m working with, ing the project source in the “Project Preferences”. and this is how I like working with this particular project.” Once your project is created you might want to do just that: To start a new project, simply click the “Project” menu adjust some of the project specific preferences. To do this, and then select “New Project”. On Windows and Linux the open your “Project” menu again and select “Project Prefer- menus are at the far right of the toolbar under the “burger ences” (Figure 5), or use Commando to search for “Project menu” (Figure 4), as people tend to lovingly call it. Properties”. There you can customize your project to your You can also start the project from the “Quick Launch” liking, changing it’s source, exclude files, using custom inden- page which is visible if no other files are opened, or you can tation, and more. It’s all there. use Commando by searching for “New Project” and selecting If you’re working with a lot of different projects, you’ll the relevant result (it should show up first). definitely want to check out the Projects widget at the bottom When creating a project, Komodo will ask you where to of the Places widget. Open the left side pane and select the save your project; this will be the project source, and Ko- “Places” widget, then look at the bottom of the opened pane modo will create a single project file in this folder and one for “Projects”.

Figure 5: Project properties Figure 6: Scope

www.JAXenter.com | April 2015 8 Languages

Figure 7: Open files Figure 8: Snippet

Opening (and Managing) Files with a custom JavaScript macro you can even create your You now have your project and are ready to start opening own groups and sorting options. and editing some files. The most basic way of doing that is by using the “Places” widget in the left pane, but where’s the Using Snippets fun in that? Again, you could use Commando to open your Many use snippets (also referred to as Abbrevi- files… Simply launch Commando from your toolbar or via ations in Komodo), and many editors facilitate this. Komodo the shortcut and type in your filename. Commando by de- again goes a step further by allowing very fine-tuned control fault searches across several “search scopes”, so you may get of snippets. First let’s just use a simple snippet though. results for tools, macros, etc. If you want to focus down on Open a new file for your preferred language. We’ll use PHP just your files you can hit the icon to the left of the search field in our example (because so many people do, I’ll refrain from to select a specific “search scope”. In this case you’ll want the naming my personal favourite). Komodo comes with some Files scope (Figure 6). pre-defined snippets you can use, but let’s define our own. You can define custom shortcuts to instantly launch a spe- Open your right pane and look for the “Samples” folder; cific Commando scope, so you don't need to be using this these are the samples Komodo provides for you to get started scope menu with your mouse each time if that’s not your pre- with. I would suggest you cut and paste the “Abbreviations” ferred method. folder residing in this folder to be at the root of your toolbox, Now that you’ve opened a couple of files, you may be start- as it provides a good starting structure. Once that’s done, ing to notice that your tab bar is getting a bit unwieldy. This head into Abbreviations | PHP; these are abbreviations that is nothing new to editors, and a lot of programmers either only get triggered when you are using a PHP file. Right-click deal with it or constantly close files that aren’t immediately the PHP folder and select Add | New Snippet. relevant anymore. Some editors get around this by giving you Here you can write your snippet code. Let’s write a “pri- another way of managing opened files; luckily Komodo is one vate function” snippet. Choose a name for your snippet: this of these editors, and even goes a step further. We’ve already will be the abbreviation that will trigger the snippet, and I’ll spoken about Commando a lot so I’ll skip past the details and use “prfunc” because I like to keep it short and simple. Then just say “There’s an Opened Files search scope.” A more UI- write in your code. You can use the little arrow menu to the driven method is available through the “Open Files” widget right of the editor to inject certain dynamic values. Most rel- (Figure 7), accessible under the left pane right next to the evant right now is the “tabstop” value, which tells Komodo “Places” widget (you’ll have to click the relevant tab icon at you want your cursor to stop and enter something there. Fig- the top of the pane). ure 8 shows what my final snippet looks like. The Open Files widget allows you to, well, manage your You’ll note I checked “Auto-Abbreviation”. This will allow open files. But more than that it allows you to manage how me to trigger the snippet simply by writing its abbreviation. you manage open files – talk about meta! When you press the The actual trigger for this is configurable, so you can instead “Cog” icon you will be presented with a variety of grouping have it trigger when you press Tab. Now we have our snippet and sorting options, allowing you to group and sort your files and are ready to start using it. Simply write “prfunc”. Next, the way you want. If you’re comfortable getting a bit dirty let’s talk about version control, wait ... What? You were

www.JAXenter.com | April 2015 9 Languages

create your own macros, which – aside from extending your editor – can be used to do anything from changing UI font size to writing your own syntax checker, or overhauling the entire UI. You have direct access to the full Komodo API that is used to develop Komodo. That’s a whole different topic though … One far too big to get into now, but be sure to have a look at all that the “Toolbox” offers you, because snippets are just the tip of the iceberg.

Previewing Changes So, you’ve created some files, edited them (using snippets I hope – that took a while to write down, you know!) and now want to see the results. Rather than leaving Komodo for your terminal or your browser, why not just do it from inside Ko- modo? Since it was a PHP file we were working on, you can simply Figure 9: Debug launch your code by pressing the “Start or continue debug- ging” button in your toolbar. This will of course run your code through a debugger but it’s also useful just to run your code and see the output. You could skip the debugger altogether by using the Debug | Run Without Debugging menu. As- suming your PHP file actually outputs anything, Komodo will open the bottom pane and show you your code output. Out- put not what you expected it to be? Set a few breakpoints and jump right into the middle of your code (Figure 9). You could take it even further and start editing variables while debugging, starting a REPL from the current break- point, etc, provided the language you are using supports this functionality. I did say browser back there though, so what about pre- viewing HTML files? Simply open your HTML file and hit the “Browser Preview” button in your toolbar. Select “In a Figure 10: Preview markdown Komodo Tab” for the “Preview Using” setting and customize the rest to your liking, then hit Preview. The preview will be rendered using Gecko. Komodo is built on Mozilla so basi- cally it’s rendering using Firefox, right from Komodo. With Komodo 9, you can now even preview Markdown files in real-time. Just open a Markdown file and hit the “Pre- view Markdown” button in your toolbar to start previewing (Figure 10).

Using Version Control If you’re a programmer, you probably use version control of some kind (and if not – you really should!) whether it be Git, Mercurial, SVN, Perforce or whatever else strikes your fancy. Komodo doesn’t leave you hanging there; you can easily access a variety of VCS tasks right from inside Komodo (Figure 11). Let’s assume you already have a repository checked out/ cloned. We’ll go straight to the most basic and most frequent- ly used VCS task – committing. Make your file edits, and sim- Figure 11: VCS ply hit the VCS toolbar button (you may need to enable this first by right clicking the toolbar and hitting “Customize”). expecting more? No that’s it, you type “prfunc” and it will Then select “Commit”, enter your commit message and fire auto-trigger your snippet. It’s that easy. it off. Komodo will show you the results (unless you disabled There’s many other ways of triggering and customizing this) in the bottom pane. snippets (ahem, Commando again); you can even create snip- Komodo IDE 9 even shows you the changes to files as you pets that use EJS so you can add your own black magic to the are editing them, meaning it will show what was added, ed- mix. And for use cases where even that isn’t enough, you can ited and deleted since your last commit of that file (Figure 12).

www.JAXenter.com | April 2015 10 Languages

“Appearance” and “Color Scheme” sections are probably of particular interest to you. Note that by default Komodo hides all the advanced preferences that you likely wouldn’t want to see unless you are a bit more familiar with Komodo. You can toggle these advanced preferences by checking the “Show Advanced” checkbox at the bottom left of your Preferences dialog (Figure 13). Inevitably some of you will find something that simply isn’t in Komodo though, because there isn’t a single IDE out there that has ALL the things for EVERYONE. When this happens, the community and customizability of Komodo is there for you, so be sure to check out the variety of addons, macros, color schemes etc. at the Komodo Resources website, and if you want to get creative, share ideas or request features from the community, then head on over to the Komodo forums.

Just a Glimpse Hopefully I’ve given you a glimpse of what Komodo has to Figure 12: Track changes offer, or at least given you an idea of how to get started with it. Komodo is a great IDE, even if you aren’t in the market for a huge IDE, and the best thing about it is you don’t need to buy/launch another IDE each time you want to work on another language. Komodo has you covered for a variety of languages and frameworks, including (but not limited to!) Python, PHP, Go, Ruby, Perl, Tcl, Node.js, Django, HTML, CSS and JavaScript. Enjoy!

Figure 13: Preferences

The left margin (with your line numbers) will become colored when edits are made. Clicking on one of these colored blocks will open a small pop-in showing what was changed and allowing you to revert the change or share the changeset via kopy.io. Kopy.io is a new tool created by the Komodo team, which serves as a modernized pastebin, implemeting nifty new features you probably havent seen elsewhere such as client-side encrypted “kopies” and auto-sizing text to fit to your browser window.

Customizing You’ve now gone through a basic workflow and are start- ing to really warm up to Komodo (hopefully), but you wish your color scheme was light instead of dark, or the icon set is too monotone and grey for your taste, or ... or ... Again Nathan Rijksen, Komodo developer, has web dev expertise and experi- Komodo’s got you covered. Head into the Preferences dialog ence as a backend architect, application developer and database engi- neer, and has also worked with third-party authentication and payment via Edit | Preferences or Komodo | Preferences on OSX. modules. Nathan is a long time Komodo user and wrote multiple macros Here you can customize Komodo to your heart’s content: the and extensions before joining the Komodo team.

www.JAXenter.com | April 2015 11 Databases

A look under the hood HBase, the “Hadoop database”

Even MongoDB has its limits. And where the favourite of the database world reaches its limits in ­scalability, that’s just where HBase enters. Tech entrepreneur Ghislain Mazars shows us the strong points of this community-driven open-source database.

by Ghislain Mazars It is worth noting however that in the process, Cassandra has lost a lot of its open-source nature. 80 percent of the HBase and the NoSQL Market: In the myriad of NoSQL committers on the Apache project are from Datastax and the databases today available on the market, HBase is far from management features beloved by enterprise customers are having a comparable share to market leader MongoDB. Easy proprietary and part of DSE (“DataStax Enterprise”). Going to learn, MongoDB is the NoSQL darling of most applica- one step further, the integration with Apache , the new tion developers. The document-oriented database interfaces whizz-kid of Big Data, is currently only available as part of well with lightweight data exchanges format, typically JSON, DSE … and has become the natural NoSQL database choice for many web and mobile apps (Figure 1). HBase, a community-driven open-source project Where MongoDB (and more generally JSON databases) Unlike Cassandra, HBase very much remains a community- reaches its limits is for highly scalable applications requiring driven open-source project. No less than 12 companies are complex data analysis (the oft denominated “data-intensive” represented in the Apache project committee and the three applications). That segment is the sweet spot of column-ori- Hadoop distributors, Cloudera, Hortonworks and MapR, ented databases such as HBase. But even in that particular category, HBase has lately oft been overlooked in favour of Cassandra. Quite a surprising turn of events actually as Face- book, the “creator” of Cassandra, ditched its own creation in 2011 and selected HBase as the database for its Messages ap- plication. We will come back to the technical differences be- tween the two databases, but the main reason for ­Cassandra’s remarkable comeback is to be found elsewhere.

Cassandra, the comeback kid of NoSQL databases With Cassandra, we find a pattern common to most major NoSQL databases, i.e. the presence of a dedicated corporate sponsor. Just as MongoDB (with MongoDB Inc, formerly called 10gen) and Couchbase (with Couchbase Inc.), the tech- nical and market development of Cassandra is spearheaded by Datastax Inc. From a continued effort on documentation (Planet Cassandra) to the stewardship of the user community with meetups and summits, Datastax has been doing a re- markable job in waving high the Cassandra flag. These efforts have paid off, and Cassandra now holds the pole position among wide-column databases. Figure 1: Relative adoption of NoSQL skills

www.JAXenter.com | April 2015 12 Databases

Figure 2: HBase schema design; source: Introduction to HBase Schema Design (Amandeep Khurana) share the responsibility of marketing the database and sup- for example that a Gmail user instantaneously retrieves all porting its corporate users. its latest emails. As a result, HBase sometimes lacks the marketing firepow- Just like BigTable, HBase is designed to handle massive er of one company betting its life on the product. If it had amounts of data and is optimized for read-intensive applica- been the case, no doubt that HBase would be in a 1.x release tions. The database is implemented on top of the Hadoop by now: while Hadoop made a big jump from 0.2x to 1.0 in Distributed File System (HDFS) and takes advantage of its 2011, HBase continued to move steadily in the 0.9x range! linear scalability and fault tolerance. But the integration with And the three companies including the database in their port- Hadoop does not stop at using HDFS as the storage layer: folio show a tendency to privilege other (more proprietary) of- HBase shares the same developer community as Hadoop and ferings of theirs and thus provide a restrictive image of HBase. offers native integration with Hadoop MapReduce. HBase In this context, it is quite an achievement that HBase oc- can serve as both the source or the destination of MapReduce cupies such an enviable place among NoSQL databases. It jobs. The benefit here is clear: there is no need for any data owes this position to its open-source community, strong in- movement between batch MapReduce ETL jobs and the host stalled base within web properties (Facebook, Yahoo, Grou- operational and analytics database. pon, eBay, Pinterest) and distinctive Hadoop connection. So in spite or maybe thanks to its unusual model, HBase could HBase schema design still very much win… As Cassandra has shown in the last 2/3 HBase offers advanced features to map business problems to years, things can move fast in this market. But we will come the data model, which makes it way more sophisticated than back to that later on, for now, let us take a more technical a plain key-value store such as Redis. Data in HBase is placed look at HBase. in tables, and the tables themselves are composed of rows, each of which has a rowkey. Under the Hood The rowkey is the main entry point to the data: it can be Hadoop implementation of Google’s BigTable: HBase is seen as the equivalent of the primary key for a traditional an open-source implementation of BigTable as described RDBMS database. An interesting capability of HBase is that in the 2005 paper from Google (http://research.google. its rowkeys are byte arrays, so pretty much anything can serve com/archive/bigtable.html). Initially developed to store as the rowkey. As an example, compound rowkeys can be cre- crawling data, BigTable remains the distributed database ated to mix different criteria into one single key, and optimize technology underpinning some of Google’s most famous data access speed. services, Google Docs & Gmail. Of course, as should be In pure key-value mode, a query on the rowkey will give expected from a creation of Google, the wide-column da- back all the content of the row (or to take a columnar view, tabase is super scalable and works on commodity servers. all of its columns). But the query can also be much more pre- It also features extremely high read performance, ensuring cise, and specifically address (Figure 2):

www.JAXenter.com | April 2015 13 Databases

• A family of columns • A specific column, and as a result a cell which is the inter- section of a row and a column “HBase tends to favour • Or even a specific version of a cell, based on a timestamp consistency over Combined, these different features greatly improve the base key-value model. With one constraint, the rowkey cannot be availability.” changed, and should thus be carefully selected at design stage to optimize row-key access or scan on a range of rowkeys. But beyond that, HBase offers a lot of flexibility: new columns can be added on the fly, all the rows do not need to contain Strong consistency the same columns (which makes it easy to add new attributes In its overall design, HBase tends to favour consistency over to an existing entity) and nested entities provide a way to availability. It even supports ACID-level semantics on a per- define relationships within what otherwise remains a very flat row basis. This of course has an impact on write performance, model. which will tend to be slower than comparable consistent data- bases. But again, typical use cases for HBase are focused on a Cool Features of HBase high read performance. Sorted rowkeys: Manipulation of HBase data is based on Overall, the trade-off plays in favour of the application three primary methods: Get, Put, and Scan. For all of them, developer, who will have the guarantee that the datastore access to data is done by row and more specifically accord- always (vs eventually...) delivers the right value of the data. ing to the rowkey. Hence the importance of selecting an ap- In effect, the choice of delivering strong consistency frees propriate rowkey to ensure efficient access to data. Usually, the application developer from having to implement cum- the focus will be on ensuring smooth retrieval of data: HBase bersome mechanics at the application level to mimic such a is designed for applications requiring fast read performance, guarantee. And it is always best when the application devel- and the rowkey typically closely aligns with the application’s oper can focus on the business logic and user experience vs access patterns. the plumbing ... As scans are done over a range of rows, HBase lexicographi- cally orders rows according to their rowkeys. Using these What’s next for HBase? “sorted rowkeys”, a scan can be defined simply from its start In the first section, we had a look at HBase's position in the and stop rowkeys. This is extremely powerful to get all relevant wider NoSQL ecosystem, and vis-à-vis its most direct com- data in one single database call: if we are only interested in the petitor, Cassandra. In our second and third sections, we most recent entries for an application, we can concatenate a reviewed the key technical characteristics of HBase, and high- timestamp with the main entity id to easily build an optimized lighted some key features of HBase that make it stand out request. Another classical example relates to the use of geo- from other NoSQL databases. In this final section, we will hashed compound rowkeys to immediately get a list of all the discuss recent initiatives building out on these capabilities and nearby places for a request on a geographic point of interest. the chances of HBase becoming a mainstream operational da- tabase in a Hadoop-dominated environment. Control on data sharding In selecting the rowkey, it is important to keep in mind that Support for SQL with Apache the rowkey strongly influences the data sharding. Unlike tra- Until recently, HBase did not offer any kind of SQL-like in- ditional RDBMS databases, HBase provides the application teraction language. That limitation is now over with Apache developer with control on the physical distribution of data Phoenix, an open-source initiative for ad hoc querying of across the cluster. Column families also have an influence (all HBase. column members for a family share the same prefix), but the Phoenix is an SQL skin for HBase, and provides a bridge primary criteria is the rowkey to ensure data is evenly distrib- between HBase and a relational model and approach to ma- uted across the Hadoop cluster (data is sorted in ascending nipulate data. In practice, Phoenix compiles SQL queries to order by rowkey, column families and finally column key). As native HBase calls using another recent novelty of HBase, co- rowkeys determine the sort order of a table’s row, each region processors. Unlike standard Hadoop SQL tools such as Hive, in the table ends up being responsible for the physical storage Phoenix can both read and write data, making it a more ge- of a part of the row key space. neric and complete HBase access tool. Such an ability to perform physical-level tuning is a bit unu- sual in the database world nowadays, but immensely power- Further integration with Hadoop and Spark ful if the application has a well-defined access pattern. In such Over time, Hadoop has evolved from being mainly a HDFS cases, the application developer will be able to guide how the + MapReduce batch environment to a complete data plat- data is spread across the cluster and avoid any hotspotting by form. An essential part of that transformation has been the skillfully selecting the rowkey. And, at the end of the day, disk advent of YARN, which provides a shared orchestration access speed matters from an application usability perspec- and resource management service for the different Hadoop tive, so it is really good to have some control on it! components. With the delivery of project Slider end of 2014,

www.JAXenter.com | April 2015 14 Databases

position as the “Hadoop database”. The imperative of limiting data move- ments will play strongly in its favour as enterprises start building complete data pipelines on top of their Hadoop “data lake”. In parallel, HBase is a strong contend- er for emerging use cases such as the management of IoT-related time series data. Incidentally, the recent launch by Microsoft of a HBase as a Service of- fering on Azure should be read in that context. For these reasons, there is no doubt that HBase will continue to grow stead- ily over the next few years. Still the opportunity is here for more, and for HBase to have a much bigger impact on Figure 3: Spark Hadoop integration the enterprise market. MapR has in this perspective recently made a promising move by incorporating HBase cluster resource utilisation can now be “controlled” its HBase-derived MapR-DB in its free community edition. from YARN, making it easier to run data processing jobs and For their part, Hortonworks and Cloudera have been active HBase on the same Hadoop cluster. on the essential integrations with Slider and Spark. Now is With a different spin, the ongoing integration work behind the time for the HBase community and vendors to move to HBase and Spark also contributes to the unification of da- the next stage, and drive a rich enterprise roadmap for the tabase operations and analytic jobs on Hadoop. Just as for “Hadoop database”, to make HBase sexy and attractive for MapReduce, Spark can now utilize HBase as both a data mainstream enterprise customers! source and a target. With nearly 2/3 of users loading data into Spark via HDFS, HBase is the natural database to host low-latency, interactive applications from within a Hadoop cluster. Advanced analytics provided by Spark can be fed back directly into HBase, delivering a closed-loop system, fully integrated with the Hadoop platform (Figure 3). Ghislain Mazars is a tech entrepreneur and founder of Ubeeko, the Final thoughts company behind HFactory, delivering the application stack for Hadoop and HBase. He is fascinated by the wave of disruption brought by data- With Hadoop moving from exploratory analytics to opera- driven businesses, and the underlying big data technologies underpin- tional intelligence, HBase is set to further benefit from its ning this shift.

Advert

www.JAXenter.com | April 2015 15 Databases

First steps An introduction to build- ing realtime apps with RethinkDB

Built for scalability across multiple machines, the JSON document store RethinkDB is a distributed da- tabase that uses an easy query language. Here’s how to get started.

by Ryan Paul By default, RethinkDB creates a database named test. Let’s start by adding a table to the testdatabase: RethinkDB is an open source database for building realtime web applications. Instead of polling for changes, the devel- r.db("test").tableCreate("fellowship") oper can turn a query into a live feed that continuously pushes updates to the application in realtime. RethinkDB’s streaming Now, let’s add a set of nine JSON documents to the table updates simplify realtime backend architecture, eliminating (Listing 1). superfluous plumbing by making change propagation a native When you run the command above, the database will out- part of your application’s persistence layer. put an array with the primary keys that it generated for all of In addition to offering unique features for realtime applica- the new documents. It will also tell you how many new re- tion development, RethinkDB also benefits from some useful cords it successfully inserted. Now that we have some records characteristics that contribute to a pleasant developer experi- in the database, let’s try using ReQL’s filter command to fetch ence. RethinkDB is a schemaless JSON document store that the fellowship’s hobbits: is designed for scalability and ease of use, with easy shard- ing, support for distributed joins, and an expressive query r.table("fellowship").filter({species:"hobbit"}) language. This tutorial will demonstrate how to build a realtime web The filter command retrieves the documents that match the application with RethinkDB and Node.js. It will use Socket. io provided boolean expression. In this case, we specifically to convey live updates to the frontend. If you would like to want documents in which the species property is equal to follow along, you can install RethinkDB or run it in the cloud.

First steps with ReQL Listing 1 The RethinkDB Query Language (ReQL) embeds itself in the r.table("fellowship").insert([ programming language that you use to build your applica- { name: "Frodo", species: "hobbit" }, tion. ReQL is designed as a fluent API, a set of functions that { name: "Sam", species: "hobbit" }, you can chain together to compose queries. { name: "Merry", species: "hobbit" }, Before we start building an application, let’s take a few { name: "Pippin", species: "hobbit" }, minutes to explore the query language. The easiest way to { name: "Gandalf", species: "istar" }, experiment with queries is to use RethinkDB’s administrative { name: "Legolas", species: "elf" }, console, which typically runs on port 8080. You can type Re- { name: "Gimili", species: "dwarf" }, thinkDB queries into the text field on the Data Explorer tab { name: "Aragorn", species: "human" }, and run them to see the output. The Data Explorer provides { name: "Boromir", species: "human" } auto-completion and syntax highlighting, which can be help- ]) ful while learning ReQL.

www.JAXenter.com | April 2015 16 Databases

hobbit. You can chain additional commands to the query if you want to perform more operations. For example, you can Listing 2 use the following query to change the value of the species var r = require("rethinkdb"); property for all hobbits: r.connect().then(function(conn) { r.table("fellowship").filter({species: "hobbit"}) return r.db("test").table("fellowship") .update({species: "halfling"}) .filter({species: "halfling"}).run(conn) .finally(function() { conn.close(); }); ReQL even has a built-in HTTP command that you can use }) to fetch data from public web APIs. In the following example, .then(function(cursor) { we use the HTTP command to fetch the current posts from a return cursor.toArray(); popular subreddit. The full query retrieves the posts, orders }) them by score, and then displays several properties from the .then(function(output) { top five entries: console.log("Query output:", output); }) .error(function(err) { r.http("http://www.reddit.com/r/aww.json")("data")("children")("data") console.log("Failed:", err); .orderBy(r.desc("score")).limit(5).pluck("score", "title", "url") }); As you can see, ReQL is very useful for many kinds of ad hoc data analysis. You can use it to slice and dice complex JSON data structures in a number of interesting ways. If you’d like Listing 3 to learn more about ReQL, you can refer to the API reference var app = require("express")(); documentation, the ReQL introduction on the RethinkDB var r = require("rethinkdb"); website, or the RethinkDB cookbook. app.listen(8090); Use RethinkDB in Node.js and Express console.log("App listening on port 8090"); Now that you’re armed with a basic working knowledge of ReQL, it’s time to start building an application. We’re going app.get("/fellowship/species/:species", function(req, res) { to start by looking at how you can use Node.js and Express to r.connect().then(function(conn) { return r.db("test").table("fellowship") make an API backend that serves the output of a ReQL query .filter({species: req.params.species}).run(conn) to your end user. .finally(function() { conn.close(); }); The rethinkdb module in npm provides RethinkDB’s offi- }) cial JavaScript client driver. You can use it in a Node.js appli- .then(function(cursor) { return cursor.toArray(); }) cation to compose and send queries. The following example .then(function(output) { res.(output); }) shows how to perform a simple query and display the output .error(function(err) { res.status(500).json({err: err}); }) (Listing 2). }); The connect method establishes a connection to Rethink- DB. It returns a connection handle, which you provide to the run command when you want to execute a query. The ex- Listing 4 ample above finds all of the halflings in the fellowship table and then displays their respective JSON documents in your r.connect().then(function(c) { console. It uses promises to handle the asynchronous flow of return r.db("test").table("fellowship").changes().run(c); }) execution and to ensure that the connection is properly closed .then(function(cursor) { when the operation completes. cursor.each(function(err, item) { Let’s expand on the example above, adding an Express console.log(item); server with an API endpoint that lets the user fetch all of the }); fellowship members of the desired species (Listing 3). }); If you have previously worked with Express, the code above should look fairly intuitive. The final path segment in the URL route represents a variable, which we pass to the filter com- pending the changes command to the end of a ReQL query. mand in the ReQL query in order to obtain just the desired doc- The changes command creates a changefeed, which will give uments. After the query completes, the application relays the you a cursor that receives new records when the results of the JSON output to the user. If the query fails to complete, then the query change. The following code demonstrates how to use a application will return status code 500 and provide the error. changefeed to display table updates (Listing 4). The cursor.each callback executes every time the data with- Realtime updates with changefeeds in the fellowship table changes. You can test it for yourself RethinkDB is designed for building realtime applications. by making an arbitrary change. For example, we can remove You can get a live stream of continuous query updates by ap- Boromir from the fellowship after he is slain by orcs:

www.JAXenter.com | April 2015 17 Databases

r.table("fellowship").filter({name:"Boromir"}).delete() A realtime scoreboard Let’s consider a more sophisticated example: a multiplayer When the query removes Boromir from the fellowship, the game with a leaderboard. You want to display the top five demo application will display the following JSON data in std- users with the highest scores and update the list in realtime as out (Listing 5). it changes. RethinkDB changefeeds make that easy. You can When changefeeds provide update notifications, they tell attach a changefeed to a query that includes theorderBy and you the previous value of the record and the new value of the limit commands. Whenever the scores or overall composition record. You can compare the two in order to see what has of the list of top five users changes, the changefeed will give changed. When existing records are deleted, the new value you an update. is null. Similarly, the old value is null when the table receives Before we get into how you set up the changefeed, let’s start new records. by using the Data Explorer to create a new table and populate The changes command currently works with the following it with some sample data (Listing 6). kinds of queries: get, between, filter, map, orderBy, min, and Creating an index helps the database sort more efficiently max. Support for additional kinds of queries, such as group- on the specified property – which is score in this case. At the operations, is planned for the future. present time, you can only use the orderBy command with changefeeds if you order on an index. To retrieve the current top five players and their scores, you Listing 5 can use the following ReQL expression: { r.db("test").table("scores").orderBy({index: r.desc("score")}).limit(3) new_val: null, old_val: { id: '362ae837-2e29-4695-adef-4fa415138f90', We can add the changes command to the end to get a stream name: 'Boromir', of updates. To get those updates to the frontend, we will use species: 'human' Socket.io, a framework for implementing realtime messaging } between server and client. It supports a number of transport } methods, including WebSockets. The specifics of Socket.io usage are beyond the scope of this article, but you can learn more about it by visiting the official Socket.io documentation. The code in Listing 7 uses sockets.emit to broadcast the Listing 6 updates from a changefeed to all connected Socket.io clients. r.db("test").tableCreate("players") On the frontend, you can use the Socket.io client library to r.table("players").indexCreate("score") set up a handler that receives the updateevent: r.table("players").insert([ {name: "Bill", score: 33}, var socket = io.connect(); {name: "Janet", score: 42}, {name: "Steve", score: 68} socket.on("update", function(data) { ... console.log("Update:", data); ]) });

That’s a good start, but we need a way to populate the initial Listing 7 list values when the user first loads the page. To that end, let’s extend the server so that it broadcasts the current leaderboard var sockio = require("socket.io"); over Socket.io when a user first connects (Listing 8). var app = require("express")(); The application uses the same underlying ReQL expression var r = require("rethinkdb"); in both cases, so we can store it in a variable for easy reuse. ReQL’s method chaining makes it highly conducive to that var io = sockio.listen(app.listen(8090), {log: false}); kind of composability. console.log("App listening on port 8090"); To wrap up the demo, let’s build a complete frontend. To keep things simple, I’m going to use Polymer’s data binding r.connect().then(function(conn) { system. Let’s start by defining the template: return r.table("scores").orderBy({index: r.desc("score")}) .limit(5).changes().run(conn); })

www.JAXenter.com | April 2015 18 Databases

It uses the repeat attribute to insert one li tag for each user. Now that the demo application is complete, you can test it The contents of the li tag display the user’s name and their by running queries that change the scores of your users. In the current score. Next, let’s write the JavaScript code (Listing 9). Data Explorer, you can try running something like: The handler for the leaders event simply takes the data from the server and assigns it to the template variable that r.table("scores").filter({name: "Bill"}) stores the users. The update handler is a bit more complex. .update({score: r.row("score").add(100)}) It finds the entry in the leaderboard that correlates with the old_val and then it replaces it with the new data. When you change the value of the user’s score, you will see When the score changes for a user that is already in the the leaderboard update to reflect the changes. leaderboard, it’s just going to replace the old record with a new one that has the updated number. In cases where a user Next steps in the leaderboard is displaced by one who wasn’t there previ- Conventional databases are largely designed around a query/ ously, it will replace one user’s record with that of another. response workflow that maps well to the web’s traditional The code in Listing 9 above will properly handle both cases. request/response model. But modern technologies like Web- Of course, the changefeed updates don’t help us maintain Sockets make it possible to build applications that stream up- the actual order of the users. To remedy that problem, we dates in realtime, without the latency or overhead of HTTP simply sort the user array after every update. Polymer’s data requests. binding system will ensure that the actual DOM representa- RethinkDB is the first open source database that is designed tion always reflects the desired order. specifically for the realtime web. Changefeeds offer a way to build queries that continuously push out live updates, obviat- ing the need for routine polling. Listing 8 To learn more about RethinkDB, check out the official var getLeaders = r.table("scores").orderBy({index: r.desc("score")}).limit(5); documentation. The introductory ten-minute guide is a good place to start. You can also check out some RethinkDB demo r.connect().then(function(conn) { applications, which are published with complete source code. return getLeaders.changes().run(conn); }) .then(function(cursor) { cursor.each(function(err, data) { io.sockets.emit("update", data); }); });

io.on("connection", function(socket) { r.connect().then(function(conn) { return getLeaders.run(conn) .finally(function() { conn.close(); }); }) .then(function(output) { socket.emit("leaders", output); }); });

Listing 9 var scores = document.querySelector("#scores"); var socket = io.connect();

socket.on("leaders", function(data) { scores.users = data; });

socket.on("update", function(data) { for (var i in scores.users) if (scores.users[i].id === data.old_val.id) { scores.users[i] = data.new_val; scores.users.sort(function(x,y) { return y.score - x.score }); break; Ryan Paul is a developer evangelist at RethinkDB. He is also a Linux en- } thusiast and open source software developer. He was previously a con- tributing editor at Ars Technica, where he wrote articles about software }); development.

www.JAXenter.com | April 2015 19 Web iStockphoto.com/enjoynz iStockphoto.com/enjoynz ©

The five biggest challenges in application performance management Performance fails and how to avoid them

What is good news for sales, can be bad news for IT. Sudden spikes in application usage need plenty of preparation, so before you unwittingly make any performance no-nos, here are the five areas where you might be slipping up.

by Klaus Enzenhofer With a view to the next stressful situations that will effect company applications, business- and IT-professionals are Whether it was on Black Friday, Cyber Monday or just dur- requested to evolve APM strategies to successfully navigate ing general Christmas shopping, this year’s holidays have multi-channels in a multi-connected world. The optimizing proven that too many online shops were far from well pre- of application performance to deliver high-quality, friction- pared for big traffic on their website or mobile offering. The less user experiences across all devices and all channels isn’t great impact of end-user experience is an underestimated easy, especially if you’re struggling with these heavy chal- aspect for the whole business. Application performance lenges: management (APM) has come a long way in a few short years, but despite the numerous solutions available in the 1. Sampling market, many businesses still struggle with fundamental Looking at an aggregate of what traffic analytics tell you problems. about daily, weekly and monthly visits isn’t enough. And

www.JAXenter.com | April 2015 20 Web

counting on a sampling of what users experience is also a 4. Third-parties – the unknown stars scary approach for sure. Having a partial view of what is hap- You are flying blind if you can’t cover the impact of inte- pening across your IT systems and applications is akin to try- grated third-party services and if you don’t have the control ing to drive a car when someone is blindfolding you. of their SLA compliance. Modern applications execute code Load Testing is essential and although it is an important on diverse edge devices, often calling elements from a vari- part of preparation for peak event times like Black Friday or ety of third-party services well beyond the view of traditional Christmas. Because it is no substitute for real user monitoring monitoring systems. Sure, third-party services can improve and to ensure a good customer journey for every visitor, a end-user experiences and deliver functionality faster than bundle of different methods is requested: Load testing, syn- standalone applications, but they have a dark side. They can thetic monitoring AND real user monitoring. Not only does increase complexity and page weights and decrease site per- this limit your understanding of what’s happening across the formance to actually compromise the end-user experience. app delivery chain, it leads to the next major scare that or- Not only that, when a third-party service goes down, ganizations face. whether it’s a Facebook “like” button, the “cart” in an online shop, ad or web analytics, IT is often faced with performance 2. Lessons learned about performance issues issues that’s not their fault, and not within their view. Trouble It’s Black Friday at 11 a. m., the phone rings and your boss is inevitable if they cannot explain the reason for a bad per- screams: “Is our site down? Why are transactions slowing to a formance or a crash on the website and frustration will effect crawl? The call center is getting overwhelmed with customer not only your end-users, but also your IT team. questions why they can’t check out online – Fix it!” This is the nightmare scenario that plays out too often, but it doesn’t 5. The Cloud – performance in the dark need to be that way. A global survey of 740 senior IT professionals found that For best results in performance and availability it’s a must nearly 80 percent of interviewed persons said that they fear have for continuously real user monitoring of all transac- cloud providers hide performance problems. Additionally, 63 tions, 24 hours, 7 days each week. Only this will ensure you percent of respondents indicated there was a need for more will see any and all issues as they come up, before customers meaningful and granular SLA metrics that are geared toward are involved. Only this gives to your the ability to respond ensuring the continuous delivery of a high quality end-user immediately and head off a heart-stopping call about issues experience. that should have been avoided. If your customers are your In preparing an upcoming major sales campaign you’ve “early warning system” they will be frustrated and likely done great work and you are confident, all your effort en- start venting on social media – which can be incredibly sures your websites resist the rush. But when the big day damaging your business' reputation. As a result frustrated comes, it turns out that the load testing you’ve done with customers will move to a competitor and revenue will be your CDN isn’t playing out the way it was predicted – be- lost. cause they are getting hit with peak demand that wasn’t reflected when they were in test mode. The inadequate track- 3. Problems identified, but no explanation ing and responding in real-time shows exactly how a lack of So you and your team can manage the first two challenges visibility effects and destroys any plan to make big money without having a lot of trouble. But now you have to face with the sales event. the next major hurdle. The Application Performance Moni- Whether you’ve launched a new app in a public cloud, toring shows you there’s a problem, but you can’t pinpoint or in your virtualized data center, full visibility across all the exact cause. Combing through waterfall charts and cloud and on premise tiers – in one pane of glass – is the logs – especially while racing against the clock to fix a prob- only way to maintain control. In this way, you’ll be able to lem – can feel like looking for needles in haystacks. You detect regressions automatically and identify root cause in can’t get any solution and the hurdle seems insurmountable. minutes. Reflect and consider these APM best practices in When every minute can mean tens of thousands of dollars your daily job. The next shopping season is coming sooner in lost revenue, the old adage “time is money” is likely to be than expected. ringing in your ears. But your IT doesn’t just need more data, it needs trans- parency from the end users into the data center, into the applications and deep to the level of the individual code line. It needs a look through a magnifying glass with a new generation APM solution. Today, synthetic monitoring em- powers businesses to detect, classify, identify and gather information on root causes of performance issues, instant triage, problem ranking and cause identification. “Smart analytics” reduces hours of manual troubleshooting to a Klaus Enzenhofer is a Technology Strategist and the Lead of the Dynatrace matter of seconds. Not all APM tools are covering a deep- Center of Excellence Team. dive analysis, so you need to test and check all your impor- tant needs.

www.JAXenter.com | April 2015 21 Web

Architectural differences in Vaadin web applications Separating UI structure and logic

In the first part of JAXenter’s series on Vaadin-based web apps, Matti Tahvonen shows us why every architec- tural decision has its pros and cons, and why the same goes for switching from Swing to Vaadin in your UI layer.

by Matti Tahvonen to just maintain one browser application. This is one of the fundamental reasons why pure web apps are now conquering As expressed by Pete Hunt (Facebook, React JS) at the even the most complex application domains. JavaOne 2014 Smackdown, if you were to create a UI toolkit from scratch, it would look nothing like Welcome to the wonderful world of web apps a DOM. Web technologies are not designed for application Even for experienced desktop developers it may be a huge development, but rich text presentation. Markup-based pres- jump from the desktop world to web development. Develop- entation has proven to be superior for more static content like ing web applications is much trickier than developing basic web sites, but applications are a different story. In the early desktop apps. There are lots of things that make things com- stages of graphical UIs on computers, UI frameworks didn’t plicated, such as client-server communication, the markup form “component based” libraries by accident. Those UI li- language and CSS used for display, new programming lan- braries have developed over decades, but the basic concept guages for the client side and client-server communication in of component based UI framework is still the most powerful many different forms (basic HTTP, Ajax style requests, long way to create applications. polling, WebSockets etc.). The fact is that, even with the most And yet Swing, SWT, Qt and similar desktop UI frame- modern web app frameworks, web development is not as easy works have one major problem compared to web apps: they as building desktop apps. require you to install special software on your client machine. Vaadin Framework is probably the closest thing to the As we have all learned during the internet era, this can be component based Swing UI development in the mainstream a big problem. Today’s users have lots of different kinds of web app world. Vaadin is a component based UI library that applications that they use and installing all of them (and es- tries to make web development as easy as traditional desk- pecially maintaining them) will become a burden for your IT top development, maximizing developers' productivity and department. the quality of the produced end user experience. In a Vaadin Browser plugins like Java’s Applet/Java WebStart support application the actual UI logic, written by you, lives in the (and Swing or JavaFX) and Flash are the traditional work­ server’s JVM. Instead of browser plugins, Vaadin has a built- arounds to avoid installing software locally for workstations. in “thin client” that renders the UI efficiently in browsers. The But famous security holes in these, especially with outdated highly optimized communication channel sends only the stuff software, may become a huge problem and your IT depart- that is really visible on the user’s screen to the client. Once the ment will nowadays most likely be against installing any kind initial rendering has been done, only deltas, in both ways, are of third party browser plugins. For them it is much easier transferred between the client and the server.

www.JAXenter.com | April 2015 22 Web

Architecture Memory and CPU usage is centralized to server The architecture of Vaadin Framework provides you with an On the negative side is the fact that some of the computing abstraction for the web development challenges, and most previously done by your user's workstation is now moved to of the time you can forget that you are building a web ap- the server. The CPU hit is typically negligible, but you might plication. Vaadin takes care of handling all the communica- face some memory constraints without taking this fact into tion, HTML markup, CSS and browser differences – you can account. On the other hand, the fact that the application concentrate all your energy on your domain problems with a memory and processing happens now mostly on the server, clean Java approach and take advantage of your experience might be a good thing. The server side approach makes it from the desktop applications. possible to handle really complex computing tasks, even with Vaadin uses GWT to implement its “thin client” running really modest handheld devices. This is naturally possible in the browser. GWT is another similar tool for web develop- with Swing and a central server as well, but with the Vaadin ment, and its heart is its Java to JavaScript “compiler”. GWT approach this comes as a free bonus feature. also has a Swing-like UI component library, but in GWT the A typical Vaadin business app consumes 50–500 kB of Java code is compiled into JavaScript and executed in the server memory per user, depending on your application char- browser. The compiler supports only a subset of Java and the acteristics. If you have a very small application you can do fact that it is not running in JVM causes some other limita- with a smaller number and if you reference a lot of data from tions, but the concepts are the same. Running your code in your UI, which usually makes things both faster and simpler, the browser as a white box also has some security implica- you might need even more memory per user. tions. The per user memory usage is in line with e. g. Java EE stand- ard JSF. If you do some basic math you can understand this One application instance, many users isn’t an issue for most typical applications and modern appli- The first thing you’ll notice is that you are now developing cation servers. But, in case you create an accidental memory your UI right next to your data. Pretty much all modern leak in application code or carelessly load the whole database business apps, both web and desktop apps, save their data table into memory, the memory consumption may become somehow to a central server. Often the data is “shielded” an issue earlier than with desktop applications. Accidentally a middleware layer (for example with EJBs). Now that you referencing a million basic database entities from a user ses- move to Vaadin UI, the EJB, or whatever the technology you sion will easily consume 100–200 MB of memory per session. use in your “backend”, is “closer”. It can often be run in the This might still be tolerable in desktop applications, but if you very same application server as your Vaadin UI, making some have several concurrent users, you’ll soon be in trouble. hard problems trivial. Using a local EJB is both efficient and The memory issues can usually be rather easily solved by secure. using paging or by lazy loading the data from the backend to Even if you’d still use a separate application server for your the UI. Server capacity is also really cheap nowadays, so buy- EJBs, they are most probably connected to UI servers using a ing a more efficient server or clustering your application to fast network that can handle chatty connection between UI multiple application servers is most likely much cheaper than and business layers more efficiently than typical client server making compromises in your architectural design. But in case communication – the network requirements by the Vaadin each of your application users need to do some heavy analysis thin client are in many cases less demanding, so your applica- with huge in-memory data sets, web applications are still not tion can be used over e. g. mobile networks. the way to go for your use case. Another thing developers arriving from desktop Java to If your application's memory usage is much more impor- Vaadin will soon notice is that fields with “static” keywords tant than its development cost (read: you are trying to write are quite different in the server world. Many desktop appli- the next GMail), Vaadin might not be the right tool for you. cations use static fields as “user global” variables. For Java If you still want to go to web applications, in this scenario apps running in server, they are “application global”, which you should strive for completely (server) stateless application is a big difference. Application servers generally use a class and keep your UI logic in browsers. GWT is a great library for loader per (.war file), not class loader per these kinds of applications. user session. For “user global” variables, use fields in your Additionally there are helpers to implement navigation be- UI class, Vaadin­Session, HttpSession or e. g. @SessionScoped tween views and the management of master-detail interfaces. CDI bean. One source of inspiration is Microsoft’s framework PRISM, Web applications in general will be much cheaper for IT de- an application framework that provides many needed tools partments to maintain. They have been traditionally run on a for the development of applications. company’s internal servers, but the trend of the era is hosting them in PaaS services, in the “cloud”. Instead of maintain- ing the application in each user’s workstation, updates and changes only need to be applied to the server. Also all data, not just the shared parts, is saved on the server whose back- Matti Tahvonen works at Vaadin in technical marketing, helping the com- ups are much easier to handle. When your user’s workstation munity be as productive as possible with Vaadin. breaks, you can just give him/her a replacement and the work can continue.

www.JAXenter.com | April 2015 23 Java

A look at mvvmFX Model View ViewModel with JavaFX

The mvvmFX framework provides tools to implement the Model View ViewModel design pattern with JavaFX. After one year of development a first stable 1.0.0 version has been released.

by Alexander Casall keep in mind that the JavaFX controller is part of the View and should not contain any logic. Its only purpose is to create The design pattern “Model View ViewModel” was first pub- the connection to the ViewModel (Listing 3). lished by Microsoft for .Net applications and is nowadays Please note that the View has a generic type that is the re- also used in other technologies like JavaScript frameworks. lated ViewModel type. This way mvvmFX can manage the As with other MV* approaches the goal is the separation be- lifecycle of the View and the ViewModel. tween the structure of the user interface and the (UI-) logic. To do this MVVM defines a ViewModel that represents the Additional Features state of the UI. The ViewModel doesn’t know the View and The shown example uses FXML to define the structure of the has no dependencies to specific UI components. user interface. This is the recommended way for development Instead the View contains the UI components but no UI but mvvmFX supports traditional Views written with pure logic and is connected with the ViewModel via Data Bind- Java code too. Another key aspect of the library is the sup- ing. Figure 1 shows a simple example of the preparation of a port of Dependency Injection frameworks. This is essential welcome message in the ViewModel. to be able to use the library in bigger projects. At the moment One of the benefits of this structure is that all UI state and there are additional modules provided for the integration UI logic is encapsulated in a ViewModel that is independent with Google Guice and JBoss Weld/CDI to allow for an easy from the UI. But what is UI logic? start with these frameworks. But other DI frameworks can be The UI logic defines how the user interface reacts to input easily embedded too. from the user or other events like changes in the domain mod- mvvmFX was recently released in a first stable version el. For example, the decision whether a button should be ac- 1.0.0. It is currently used for projects by worklplace Saxo- tive or inactive. Because of the independence from the UI, the nia Systems AG. The framework is developed as open source ViewModel can be tested with unit tests. In many cases there is no need for complicated integration tests anymore where the actual application is started and remotely controlled by Listing 1 the test tool. This simplifies test-driven development signifi- cantly. Due to the availability of Properties and Data Binding @Test JavaFX is eminently suitable for this design pattern. mvvmFX public void test(){ adds helpers and tools for the efficient and clean implementa- LoginViewModel viewModel = new LoginViewModel(); tion of the pattern. The following example will give an impression of the devel- assertThat(viewModel.isLoginButtonDisabled()).isFalse(); opment process with MVVM. In this example there is a login button that should only be clickable when the username and viewModel.setUsername("mustermann"); the password are entered. Following TDD, the first step is to assertThat(viewModel.isLoginButtonDisabled()).isFalse(); create a unit test for the ViewModel (Listing 1). After that the ViewModel can be implemented (Listing 2). viewModel.setPassword("geheim1234"); Now this ViewModel has to be connected with the View. assertThat(viewModel.isLoginPossible()).isTrue(); In the context of mvvmFX the “View” is the combination of } an fxml file and the related controller class. It is important to

www.JAXenter.com | April 2015 24 Java

Listing 2 public class LoginViewModel implements ViewModel {

private StringProperty username = new SimpleStringProperty(); private StringProperty password = new SimpleStringProperty(); private BooleanProperty loginPossible = new SimpleBooleanProperty();

public LoginViewModel() { loginButtonDisabled.bind(username.isEmpty().or(password. isEmpty()); }

// getter/setter }

Figure 1: Welcome message in ViewModel

Listing 3 public class LoginView implements FxmlView { @FXML // will be called by JavaFX as soon as the FXML bootstrapping is done public Button loginButton; public void initialize(){ username.textProperty() @FXML .bindBidirectional(viewModel.usernameProperty()); public TextField username; password.textProperty() .bindBidirectional(viewModel.passwordProperty()); @FXML public PasswordField password; loginButton.disableProperty() .bindBidirectional(viewModel.loginPossibleProperty()); @InjectViewModel // is provided by mvvmFX } private LoginViewModel viewModel; }

(Apache licence) and is hosted on GitHub. The authors wel- there are helpers to implement navigation between views and come feedback, suggestions and critical reviews. the management of master-detail interfaces. For the future development the focus lies on features that are needed for bigger projects with complex user interfaces. Alexander Casall is a developer at Saxonia Systems AG, with a focus on These include a mechanism with many ViewModels that can multi-touch applications using JavaFX. access common data without the introduction of a mutual visibility and dependency to each other (Scopes). Additionally

Imprint

Publisher Sales Clerk: Software & Support Media GmbH Anika Stock +49 (0) 69 630089-22 Editorial Offi ce Address [email protected] Software & Support Media Saarbrücker Straße 36 Entire contents copyright © 2015 Software & Support Media GmbH. All rights reserved. No 10405 Berlin, Germany part of this publication may be reproduced, redistributed, posted online, or reused by any www.jaxenter.com means in any form, including print, electronic, photocopy, internal network, Web or any other method, without prior written permission of Software & Support Media GmbH. Editor in Chief: Sebastian Meyen Editors: Coman Hamilton, Natali Vlatko The views expressed are solely those of the authors and do not refl ect the views or po- sition of their fi rm, any of their clients, or Publisher. Regarding the information, Publisher Authors: Chris Becker, Alexander Casall, Klaus Enzenhofer, Ghislain Mazars, disclaims all warranties as to the accuracy, completeness, or adequacy of any informa- Ryan Paul, Nathan Rijksen, Matti Tahvonen tion, and is not responsible for any errors, omissions, in adequacies, misuse, or the con- sequences of using any information provided by Pub lisher. Rights of disposal of rewarded Copy Editor: Jennifer Diener articles belong to Publisher. All mentioned trademarks and service marks are copyrighted Creative Director: Jens Mainz by their respective owners. Layout: Flora Feher, Christian Schirmer

www.JAXenter.com | April 2015 25