Bank of America

Quartz WorldMap Infrastructure Monitoring

A Major Qualifying Project

submitted to the Faculty

of the

WORCESTER POLYTECHNIC INSTITUTE

in partial fulfillment of the requirements for the

Degree of Bachelor of Science

by

Jun Liang

Zhaokun Xue

October 14, 2013

Approved

Professor Micha Hofri

Major Advisor


Abstract

Bank of America’s Quartz platform, on which the company develops all of its internal applications, is made up of over 600 application and database servers deployed in over 25 data centers globally. Many variables can contribute to issues in the environment, for example CPU/memory consumption, disk space, network latency, and replication latency. Our team was responsible for building a web-based interactive world map, named “Quartz WorldMap Infrastructure Monitoring”, which graphically shows the environment and alerts when key performance thresholds are breached.

In this project, our team focused on building a reliable back-end server and designing a user-friendly front-end client. We built two back-end services to poll data from databases in real time, and used modern open source jQuery libraries to construct the functionality and user interface of the front-end. We proposed and implemented the well-tested two-way data binding method to automatically synchronize data between the databases and the front-end display.


Acknowledgments

From Bank of America, we would like to acknowledge the guidance of Brian Petix, Marquis Rahman and Michael Covington. From Worcester Polytechnic Institute, we would like to thank our advisors Arthur Gerstenfeld and Micha Hofri for their help throughout the project.


Executive Summary

After the merger with Merrill Lynch, Bank of America has become the largest financial institution in the world. With a global workforce in more than 40 countries, Bank of America provides people, companies and institutional investors with the financial products and services they need to achieve their goals at every stage of their financial lives. Our team worked with the bank’s Global Markets Technology Team, which provides end-to-end technology solutions and operations support for the Global Markets businesses. The team is also responsible for establishing an Architecture and Strategy framework for consistency across the Global Markets platforms. After years of work by this team, Bank of America Merrill Lynch Global Markets built its next-generation, cross-asset technology platform, Quartz.

The Quartz platform is made up of over 600 application and database servers deployed in over 25 data centers globally. Many variables can contribute to issues in the environment, for example CPU/memory consumption, disk space, network latency, and replication latency. To keep track of these variables, we were responsible for building a web-based interactive world map that graphically shows the environment and alerts when key performance thresholds are breached. After interviewing members of the bank’s Quartz Develop Team, specifically Marquis Rahman in Chicago and Michael Covington in Houston, we settled on the objective of building a web-based monitoring application that lets users track, search, and comment on alerts reported by each data center. So far, web-based monitoring applications on the Quartz platform are at a very early stage, and this project could help the Quartz Develop Team fill this gap. To achieve this objective, we divided the application into two parts: back-end programming and front-end design.

For the application’s back-end server (more details in section 3.1.1 and section 4.1.1), we wrote all the code in Python on the company’s Quartz platform.


First, to get alerts reported by the 25 data centers, our team built a service called “Valkyrie Alert Service”, which retrieves alerts from the Valkyrie Server by calling the Valkyrie API. The Valkyrie Server is an alert message center that collects all the issues from Bank of America internal services. To get the latest alerts in real time, our team set the “Valkyrie Alert Service” to call the Valkyrie API every 5 seconds. After polling data from Valkyrie, the service publishes the data to the “alerts_in_valkyrie” topic on the AMPS server.

Second, we want not only the details of each alert but also additional information about the corresponding data center, such as line of business (LOB), region, and organization. Therefore, our team built the “Alert Engine and Decorator”. This service polls additional information about data centers from the Hardware Diagram Caches, combines it with new alerts from the “alerts_in_valkyrie” topic, and publishes the resulting alerts to the “reported_alerts” topic.

As for the front-end part (more details in section 3.1.2 and 4.1.2), we built it with JavaScript and HTML, following the Model-View-Controller (MVC) pattern.

First, for the major part, the map, after searching online and testing the source given by our managers, we recommended an external open source library, the jVectorMap jQuery plugin, which not only supports all the features required for our project but is also easy to implement. On the map, we use 25 markers to represent the locations of the 25 data centers globally. Red markers stand for data centers with reported alerts; green markers stand for data centers without alerts. When the mouse hovers over a marker, a label above the marker shows the data center’s name and the number of alerts reported in this data center so far. To synchronize the number of alerts in each data center in real time, we recommended using JSON and two-way data binding. With JSON and two-way data binding, we can update part of the page without refreshing the whole page.

Second, when a user clicks one of the red markers, a grid below the map shows the details of the alerts. We recommended building the grid on the pqGrid jQuery open source library. For each alert shown in the grid, users can disable it for a given period of time. Users can also choose to view all the alerts collected in the database or all the disabled alerts. On the disabled grid, users can reactivate disabled alerts. Disabled alerts are also automatically reactivated after the disabling period expires.

For the page layout (more details in section 4.2), we display a set of world clocks and user login information above the map. On the right side of the map, a scrolling panel shows the latest five PAPA tickets. PAPA is another error tracking tool used by the bank. Below the map, a scrolling bar shows all the news about alerts in real time. Below the news bar, a filter panel supports multi-selection search for alerts based on specific properties of data centers and alerts. In addition, users can run a historical search by specifying a period of time.

Our new alert monitoring tool was intended to give the Quartz Develop Team at Bank of America the ability to track and comment on alerts reported by each data center. In this way, the Quartz Develop Team gained a means of keeping records of the alerts reported by each data center.


Table of Contents

Abstract
Acknowledgments
Executive Summary
Chapter 1 Introduction
Chapter 2 Background
  2.1 Sponsor Description
    2.1.1 Global Research of Bank of America Merrill Lynch
    2.1.2 Global Markets Technology Team in Bank of America
  2.2 Quartz Development Platform
    2.2.1 Introduction to Quartz Platform
    2.2.2 Sandra and AMPS
    2.2.3 QZDevelop
    2.2.4 QSP – Web Apps on Quartz
  2.3 Programming Languages
    2.3.1 Python
    2.3.2 JavaScript
  2.4 Dependency Libraries
    2.4.1 JQuery
    2.4.2 jQuery UI
    2.4.3 Bootstrap
    2.4.4 Chosen Multiple Selection
    2.4.5 jVectorMap
    2.4.6 pqGrid
    2.4.7 jClock
  2.5 Data Interchange Format
    2.5.1 JSON
  2.6 Others
    2.6.1 Flask
    2.6.2 AJAX
Chapter 3 Methodology
  3.1 Application Structure
    3.1.1 Back-End Server
    3.1.2 Front-End Client
  3.2 Data-binding in Web Client
  3.3 Synchronized with Server
  3.4 CSS Style Design
Chapter 4 Design and Implementation
  4.1 Architecture Design
    4.1.1 Back-end Server Design
    4.1.2 Front-end Client Design
  4.2 User Interface Design
    4.2.1 Front Page Layout
    4.2.2 Grid
    4.2.3 Filter Panel
Chapter 5 Conclusions
Chapter 6 Future Recommendations
  6.1 Data Analysis in Data Centers
  6.2 Data Analysis on Servers
  6.3 Alert Correlation and Prediction
Reference
Appendix
  Schedule

Table of Figures

Figure 1: Application Construction Design at the Beginning
Figure 2: Design of View Modules
Figure 3: Two-way Data Binding
Figure 4: Values in View and Local Data are in Binding
Figure 5: Data in Client is Sync with Server
Figure 6: Architecture Design for the application
Figure 7: details for alert_in_valkyrie topic in AMPS
Figure 8: illustrating how to get reported_alerts topic
Figure 9: details of disabled_alerts topic in AMPS
Figure 10: details of news topic in AMPS
Figure 11: front page layout
Figure 12: show the result of mouse over markers
Figure 13: results of clicking a marker
Figure 14: results by clicking "All Messages" option
Figure 15: pop-up window for "Disable" action
Figure 16: results of clicking "All Disabled Messages" option
Figure 17: filtering panel layout
Figure 18: historical search panel layout
Figure 19: example output for filtering and historical search
Figure 20: example of 2013 year report of NJ2 Data Center
Figure 21: example of Feb. report for NYS Data Center Server 1


Chapter 1 Introduction

Bank of America Merrill Lynch is one of the world’s largest financial institutions, serving individual consumers, small- and middle-market businesses and large corporations with a full range of banking, investing, asset management and other financial and risk management products and services. [2]

Its Global Research department provides institutional clients worldwide with access to its leading sales and trading and research franchises, investment banking services, global client relationships and product innovation. It was ranked number one by “Institutional Investor” in 2011. [16]

Quartz, the infrastructure behind one of Bank of America's strategic initiatives, is made up of over 600 application and database servers deployed in over 25 data centers around the world. Many variables can contribute to issues in the environment, for example CPU/memory consumption, disk space, network latency, and replication latency.

We propose to use Python, JavaScript and HTML to build a web-based interactive world map that graphically shows the environment and alerts in real time when key performance indicators are breached. Users can see the real-time performance evaluation and detailed information by clicking items.

So far, web-based monitoring applications on Bank of America’s Quartz platform are at a very early stage. This project gives us a great opportunity to help the company fill this gap. It also gives us the chance to practice web application development and to learn more about real-time data processing, one of the most active areas of web technology in recent years.


Chapter 2 Background

2.1 Sponsor Description

Bank of America traces its origins to 1904, when Bank of Italy was founded in San Francisco. After years of development, it became the largest commercial bank in the United States in terms of deposits and market capitalization. The company’s purpose is to make the financial lives of those who do business with it better. [2]

After completing the acquisition of Merrill Lynch & Co., one of the world’s leading wealth management, capital markets and advisory companies, on 1 January 2009, Bank of America became the largest brokerage in the world, with more than 15,000 Financial Advisors and approximately $2.2 trillion in client assets. Bank of America began conducting all of its corporate and investment banking activities under the Bank of America Merrill Lynch name in September 2009. [3] [14] [15]

2.1.1 Global Research of Bank of America Merrill Lynch

Bank of America Merrill Lynch Global Research, as one of the most respected research organizations in the world, offers clients access to the highest quality investment ideas, with more than 740 analysts in more than 20 countries in 2012. Analysts offer timely trading strategies, market insight and execution across all asset classes, including equities, fixed income, rates, currencies and commodities.

They focus on six primary disciplines:

● Global Equities

● Global Equity Strategy

● Global Credit Research and Global Escalation Management (GEM) Fixed Income Strategy

● Global Economics

● Global Commodities and Asset Allocation

● Global Rates and Currencies


Bank of America Merrill Lynch is positioned to strengthen relationships with its clients across its Global Banking and Markets divisions. It is helping clients do business in emerging markets like China, Brazil and Russia, which are transforming into growth markets and represent some of the greatest opportunities going forward. [16]

2.1.2 Global Markets Technology Team in Bank of America

Global Markets Technology (GMT) provides end-to-end technology solutions and operations support for the Global Markets businesses including Equity, Electronic Trading, Rates & Currencies, Credit & Structured Products, Commodities, Research, Sales and Capital Markets. In addition, the group is responsible for establishing an Architecture and Strategy framework for consistency across the Global Markets platforms. [17]

Bank of America's technology couples technological expertise with the bank's strategic vision to keep Bank of America at the forefront of the world's financial markets. Delivering market-leading technology drives our business forward.

2.2 Quartz Development Platform

2.2.1 Introduction to Quartz Platform

Quartz is an internal cloud application environment for Bank of America Merrill Lynch, implemented in Python 2.6. Quartz provides employees with development tools for building Python applications on the platform. Applications can optionally share their data on the Quartz platform with other applications, and the source code of all applications is stored in Quartz.

Quartz is a reliable platform that increases collaboration within Bank of America through open source technologies, while adapting quickly to a changing environment in order to deliver robust applications that meet the needs of the company’s end-users.


Quartz offers many components to drive that vision, and the lines of business that build upon the common framework provide constant feedback to improve the company’s shared platform.

2.2.2 Sandra and AMPS

Sandra is a globally replicated object store, implemented in C++. Sandra objects are persistent; they are inserted into a Sandra “folder” for later retrieval. “Indexes” can be used to allow rapid access to objects.

The AMPS platform is a message server that uses a topic-based publish/subscribe pattern. All developers at Bank of America can use and create topics on AMPS.
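The topic-based publish/subscribe pattern can be illustrated with a small in-memory sketch. This is not the AMPS client API: `TopicBroker` and its methods are hypothetical names chosen for illustration, and the sample message is made up.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, List

Handler = Callable[[Any], None]


class TopicBroker:
    """Toy in-memory stand-in for a topic-based message server (not AMPS)."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        """Register a callback that receives every message published to topic."""
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: Any) -> int:
        """Deliver message to each subscriber of topic; return the delivery count."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(message)
        return len(handlers)


broker = TopicBroker()
received = []
broker.subscribe("alerts_in_valkyrie", received.append)
delivered = broker.publish("alerts_in_valkyrie", {"component_name": "server-01"})
ignored = broker.publish("news", {"title": "no subscribers yet"})
```

Publishers and subscribers only share a topic name, which is what lets independent services exchange data without knowing about each other.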

2.2.3 QZDevelop

QZDevelop is used to develop and configure all components of the Quartz platform. Its main component is a Python Integrated Development Environment built by the Quartz Dev Team. Our team wrote, shared, tested and published all of our code in it. Whenever developers want to publish their code, they must request a review of one or more of the committed files. An agile process ensures team coverage of code releases.

2.2.4 QSP – Web Apps on Quartz

QSP provides a web service that lets developers build web front-ends on the Quartz platform. QSP supports Python CGI and the Flask framework. Python CGI is a support module for CGI (Common Gateway Interface) scripts; Flask is a simple web application framework written in Python that uses the Jinja2 template engine. Jinja2 is a web page template engine that makes it easy for developers to manage web views.


2.3 Programming Languages

2.3.1 Python

Python is an interpreted, interactive, object-oriented programming language that is used in a wide variety of application domains. Python runs on Windows, Linux/Unix, and Mac OS X and has been ported to the Java and .NET virtual machines. Python also offers a large number of GUI frameworks, from Tkinter (traditionally bundled with Python, using Tk) to a number of other cross-platform solutions. The major cross-platform technologies on which Python frameworks are based include Gtk, Qt, Tk and wxWidgets. [1]

2.3.2 JavaScript

JavaScript (JS) is an interpreted computer programming language. Implementations of JavaScript in web browsers allow client-side scripts to interact with the user, control the browser, communicate asynchronously, and alter the document content that is displayed. It has also become common in server-side programming, game development and the creation of desktop applications. Its syntax was influenced by C. JavaScript copies many names and naming conventions from Java, but the two languages are otherwise unrelated and have very different semantics. The key design principles within JavaScript are taken from the Self and Scheme programming languages. It is a multi-paradigm language, supporting object-oriented, imperative, and functional programming styles. [5]

2.4 Dependency Libraries

2.4.1 JQuery

jQuery is a multi-browser (cross-browser) JavaScript library designed to simplify the client-side scripting of HTML. It was released in 2006 at BarCamp NYC by John Resig. It is currently developed by a team of developers led by Dave Methvin. Used by over 65% of the 10,000 most visited websites, jQuery is the most popular JavaScript library in use today.

jQuery also provides capabilities for developers to create plug-ins on top of the JavaScript library. This enables developers to create abstractions for low-level interaction and animation, advanced effects and high-level, theme-able widgets. The modular approach to the jQuery library allows the creation of powerful dynamic web pages and web applications. [7]

2.4.2 jQuery UI

jQuery UI is a JavaScript library that provides abstractions for low-level interaction and animation, advanced effects and high-level, themeable widgets, built on top of the jQuery JavaScript library. We use it to generate pop-up dialogs.

2.4.3 Bootstrap

Bootstrap is a free front-end template framework for creating websites and web applications. It contains HTML and CSS-based design templates for typography, forms, buttons, navigation and other interface components, as well as optional JavaScript extensions.

2.4.4 Chosen Multiple Selection

Chosen is a jQuery plugin that makes long, unwieldy select boxes much more user-friendly. We apply Chosen in the alert filter.

2.4.5 jVectorMap

jVectorMap is a jQuery plugin that generates interactive vector maps on web pages. It is a full-featured map plugin that supports custom markers, which are very important to our design.


2.4.6 pqGrid

pqGrid is an open source jQuery grid plugin for displaying and manipulating tabular data in rich Ajax applications. We use it to show alert details.

2.4.7 jClock

jClock is a free jQuery plugin that generates a clock on a web page. It came from a JavaScript tutorial site.

2.5 Data Interchange Format

2.5.1 JSON

JavaScript Object Notation (JSON) is a lightweight data-interchange format. It is easy for humans to read and write. It is easy for machines to parse and generate. JSON is a text format that is completely language independent but uses conventions that are familiar to programmers of the C-family of languages, including C/C++, Java, JavaScript, Python, and many others. These properties make JSON an ideal data-interchange language.

JSON is built on two structures:

1. A collection of name/value pairs. In various languages, this is realized as an object, record, struct, dictionary, hash table, keyed list, or associative array.

2. An ordered list of values. In most languages, this is realized as an array, vector, list, or sequence.

These are universal data structures. Virtually all modern programming languages support them in one form or another. It makes sense that a data format that is interchangeable with programming languages also be based on these structures. [6]
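As a small illustration of the two structures, the sketch below serializes a record that contains both a collection of name/value pairs and an ordered list, using Python's standard `json` module. The field names follow the 'Data center' model used later in this report; the values are made up for the example.

```python
import json

# Name/value pairs map to a Python dict; the ordered list maps to a Python list.
datacenter = {
    "name": "NJ2",
    "style": "red",
    "alerts": [0, 3],
}

# Serialize to JSON text (keys sorted for a stable result), then parse it back.
text = json.dumps(datacenter, sort_keys=True)
restored = json.loads(text)
```

Because the text form round-trips to an equal object, the same record can be produced by a Python back-end and consumed by a JavaScript front-end.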


2.6 Others

2.6.1 Flask

Flask is a lightweight web framework for Python based on two external libraries, the Jinja2 template engine and the Werkzeug WSGI toolkit. Flask protects users against one of the most common security problems of modern web applications: cross-site scripting (XSS).

The syntax is very clean and straightforward. It’s easy to learn and simple to use, very friendly to Python beginners, enabling you to build your web app in a short amount of time.

2.6.2 AJAX

AJAX is a technique for creating fast and dynamic web pages. AJAX allows web pages to be updated asynchronously by exchanging small amounts of data with the server behind the scenes. This means that it is possible to update parts of a web page without reloading the whole page. Classic web pages (which do not use AJAX) must reload the entire page if the content changes.


Chapter 3 Methodology

3.1 Application Structure

Our application is a Single Page Application (SPA). An SPA is a web application (or web site) that fits on a single web page with the goal of providing a more fluid user experience akin to a desktop application. Such applications move some functions, such as view generation, from the server to the web client.

The most prominent technique in use is AJAX, predominantly through the XMLHttpRequest object in JavaScript. Popular libraries such as jQuery, which normalize AJAX behavior across browsers from different manufacturers, have further popularized the AJAX technique. [19][23]

We need to build a web server (back-end) to process requests from the web client. The server connects the AMPS server and the web clients. All the features described above are implemented in the client; the back-end only supplies data.

Figure 1: Application Construction Design at the Beginning


3.1.1 Back-End Server

The QSP (internal web server) at Bank of America Merrill Lynch supports the Flask web framework (described in Section 2.6.1). Flask is a micro-framework for Python based on Werkzeug (a Web Server Gateway Interface utility library for Python), Jinja2 (a website template framework for Python) and good intentions. [18]

In the back-end, the main function of the web server is to transmit data between the web front-end and the AMPS server.

Bank of America has an internal service, called Valkyrie, for collecting alerts from all data centers. We built a bridge to connect to Valkyrie: it subscribes to Valkyrie and publishes to the “alerts_in_valkyrie” topic on our AMPS server. The “alerts_in_valkyrie” topic stores alerts from Valkyrie, but the information is not complete. An alert from Valkyrie does not give detailed information about the problematic server; it provides the server name only. We built the Decorator to take the server name and call other APIs to decorate the message with additional fields such as LOB, Region, and Organization. The Decorator stores the complete alerts in the “reported_alerts” topic. These alerts are then published to the web client.
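The decoration step can be sketched as a lookup keyed by server name. This is a minimal illustration, not the bank's implementation: the `HARDWARE_INFO` table stands in for the internal APIs the Decorator calls, and its contents, like the function name `decorate_alert`, are hypothetical.

```python
from typing import Dict, Optional

# Hypothetical stand-in for the hardware lookup; the values are invented.
HARDWARE_INFO: Dict[str, Dict[str, str]] = {
    "server-01": {
        "lob": "Rates",
        "region": "AMRS",
        "organization": "GMT",
        "datacenter": "NJ2",
    },
}

EXTRA_FIELDS = ("lob", "region", "organization", "datacenter")


def decorate_alert(
    alert: Dict[str, object],
    hardware_info: Dict[str, Dict[str, str]],
) -> Dict[str, Optional[object]]:
    """Return a copy of alert with the extra fields for its server attached.

    Fields that cannot be resolved are set to None, so every decorated
    alert has the same shape.
    """
    extra = hardware_info.get(str(alert["component_name"]), {})
    decorated: Dict[str, Optional[object]] = dict(alert)
    for name in EXTRA_FIELDS:
        decorated[name] = extra.get(name)
    return decorated


raw_alert = {"component_name": "server-01", "metric_name": "cpu", "metric_value": 97.5}
complete_alert = decorate_alert(raw_alert, HARDWARE_INFO)
unknown_alert = decorate_alert({"component_name": "server-99"}, HARDWARE_INFO)
```

Returning a new dictionary leaves the raw alert untouched, so the raw and decorated forms can be kept in separate topics.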

3.1.2 Front-End Client

The front-end client is an Ajax-based web application. Ajax (Asynchronous JavaScript and XML) provides the technology that allows a website or web-based application to communicate with the server without having to refresh the entire page. Technically, the asynchronous features provide the means for the client browser to send requests or call methods that are executed on the server side. The result from the server can then be processed on the client side using JavaScript code, and any output can be merged into the existing front-end HTML view without having to refresh the page. When you use Ajax, you're not really using a new programming language. In fact, all you are doing is taking advantage of existing technologies and putting them to better use.


When combined, Flask (Python) and Ajax provide a powerful platform for creating Web sites or Web-based applications with robust features.

The key to client/server communication in Ajax is the XMLHttpRequest object in JavaScript. It provides an easy way to retrieve data from a server call without a full-page refresh. Therefore, a web page can update just a part of the page without disrupting what the user is doing. Most browsers, such as Microsoft Internet Explorer, Firefox and Google Chrome, support the XMLHttpRequest object. To simplify the code, we decided to use the jQuery library.

Ajax-based client/server communication differs from traditional client/server communication. Traditionally, for the client browser to send content to the server for processing or storing in a database, you use a POST action to send content from input fields collected on the client side to the server. The server processes this content using Flask (Python) or any scripting language of your choice, reads or stores data using a database, and returns the results embedded within HTML code. The HTML is then processed by the browser and a new page is rendered for the end user to view. [17]

We use the Model-View-Controller (MVC) pattern as the main structure of the front-end.

3.1.2.1 Models (Data Structure Definitions in Web Client)

1. ‘Alert’ (exactly the same as the data properties from Valkyrie) has the following attributes:

a. Lob

b. Region

c. Datacenter

d. Last_update_time

e. Created_time

f. Organization

g. Threshold

h. Component_type


i. Component_name

j. Metric_value

k. Metric_name

l. Alert_message

m. Latest_status

n. Alert_status

2. ‘Data center’ has the following attributes:

a. Name: data center name

b. Longitude and Latitude: location of the data center

c. Style: data center marker’s color

d. Alerts: list of alert IDs.

3. ‘News’ has the following attributes:

a. Title: news title

b. Time: post time of the news

4. ‘PAPA Tickets’ has the following attributes:

a. Content: ticket content

b. Time: post time of the ticket
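The four models listed above can be written down as plain record types. The client itself is written in JavaScript, so the Python dataclasses below are only a language-neutral sketch of the same structures; the field types and the default marker style are assumptions, since the report lists attribute names only.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Alert:
    lob: str
    region: str
    datacenter: str
    last_update_time: str
    created_time: str
    organization: str
    threshold: float
    component_type: str
    component_name: str
    metric_value: float
    metric_name: str
    alert_message: str
    latest_status: str
    alert_status: str


@dataclass
class DataCenter:
    name: str
    longitude: float
    latitude: float
    style: str = "green"  # marker color; assumed default when no alerts exist
    alerts: List[int] = field(default_factory=list)  # list of alert IDs


@dataclass
class News:
    title: str
    time: str


@dataclass
class PapaTicket:
    content: str
    time: str


# Illustrative values only; the coordinates are not taken from the report.
nj2 = DataCenter(name="NJ2", longitude=-74.0, latitude=40.7)
nys = DataCenter(name="NYS", longitude=-74.0, latitude=40.7)
nj2.alerts.append(0)
```

Using `default_factory` gives each data center its own alert list, so adding an alert ID to one record does not affect another.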

3.1.2.2 Views (User Interface of All Features in Web Client)

1. World Clock

a. 9 clocks display different time zones

b. Cities/Countries: New York, New Jersey, Houston, Chicago, London, India, Hong Kong, Singapore and Tokyo.

2. World Map

a. Vector map generated by jVectorMap (open source)

b. Markers on the map represent data centers.

c. A red marker means there are one or more alerts in that data center.

d. A green marker means there are no alerts in that data center.

3. Toolbar

a. Alert Filter

i. Selections: Component type, Data Center, LOB and Region. Four drop-down lists in total.

ii. Multiple selections are available.

b. History

i. Selections: Start date and End date.

ii. Lets users search previous alerts in a given time interval.

c. Disabled Alerts

i. Display all the disabled alerts.

ii. User is able to re-activate any of them.

4. Data Grid

a. Uses pqGrid (open source) to render the grid.

b. Binding with ‘Alert’ model.

c. Show alert information in detail.

5. News

a. News scrolling at bottom of screen.

b. There is a button to add news.

Figure 2: Design of View Modules
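The Alert Filter and History items in the toolbar above combine multi-selection drop-down lists with a date range. The sketch below shows one plausible way to evaluate such a query. It assumes that values chosen within one drop-down are alternatives (OR), that different drop-downs must all match (AND), and that an empty selection means no restriction; the report does not state these rules, and `filter_alerts` and the sample data are hypothetical.

```python
from datetime import datetime
from typing import Dict, Iterable, List, Optional, Set


def filter_alerts(
    alerts: Iterable[dict],
    selections: Dict[str, Set[str]],
    start: Optional[datetime] = None,
    end: Optional[datetime] = None,
) -> List[dict]:
    """Keep alerts that match every non-empty selection and fall in [start, end]."""
    matched = []
    for alert in alerts:
        # An empty selection for a field places no restriction on that field.
        if any(allowed and alert.get(name) not in allowed
               for name, allowed in selections.items()):
            continue
        created = alert["created_time"]
        if start is not None and created < start:
            continue
        if end is not None and created > end:
            continue
        matched.append(alert)
    return matched


sample_alerts = [
    {"component_name": "server-01", "component_type": "db", "datacenter": "NJ2",
     "lob": "Rates", "region": "AMRS", "created_time": datetime(2013, 9, 1, 8, 0)},
    {"component_name": "server-02", "component_type": "app", "datacenter": "NYS",
     "lob": "Credit", "region": "AMRS", "created_time": datetime(2013, 9, 20, 8, 0)},
    {"component_name": "server-03", "component_type": "app", "datacenter": "NJ2",
     "lob": "Rates", "region": "AMRS", "created_time": datetime(2013, 10, 2, 8, 0)},
]

hits = filter_alerts(
    sample_alerts,
    {"datacenter": {"NJ2", "NYS"}, "component_type": {"app"}, "lob": set()},
    start=datetime(2013, 9, 15),
    end=datetime(2013, 9, 30),
)
```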


3.2 Data-binding in Web Client

A binding creates a link between two properties such that when one changes, the other is updated to the new value automatically. Bindings can connect properties on the same object, or across two different objects. A one-way data binding only propagates changes in one direction. Two-way data binding refers to the ability to bind changes to an object’s properties to changes in the user interface, and vice versa. [20]

Figure 3: Two-way Data Binding

We apply two-way data binding in our application. It is the automatic synchronization of data between the model and view components. The view is a projection of the model at all times. When the model changes, the view will reflect the change, and vice versa. [21]

Because the view is just a projection of the model, the controller is completely separated from the view and unaware of it. This makes testing much easier: the controller can be tested in isolation, without the view and the related DOM/browser dependency.
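The mechanism can be sketched with an observable model and a view object that stands in for a DOM element. The client in this project is JavaScript, so this Python version only illustrates the idea; `ObservableModel`, `TextView`, and the "ACTIVE" status value are hypothetical.

```python
from typing import Any, Callable, Dict, List


class ObservableModel:
    """Holds values and notifies listeners whenever one of them changes."""

    def __init__(self, **values: Any) -> None:
        self._values: Dict[str, Any] = dict(values)
        self._listeners: List[Callable[[str, Any], None]] = []

    def bind(self, listener: Callable[[str, Any], None]) -> None:
        self._listeners.append(listener)

    def get(self, key: str) -> Any:
        return self._values[key]

    def set(self, key: str, value: Any) -> None:
        if key in self._values and self._values[key] == value:
            return  # unchanged: skip notification to avoid update loops
        self._values[key] = value
        for listener in self._listeners:
            listener(key, value)


class TextView:
    """Stand-in for a UI element that displays one model property."""

    def __init__(self, model: ObservableModel, key: str) -> None:
        self.model = model
        self.key = key
        self.text = model.get(key)
        model.bind(self._on_model_change)  # direction 1: model -> view

    def _on_model_change(self, key: str, value: Any) -> None:
        if key == self.key:
            self.text = value

    def user_input(self, value: Any) -> None:
        """Direction 2: a user edit in the view is written back to the model."""
        self.text = value
        self.model.set(self.key, value)


model = ObservableModel(alert_status="ACTIVE")
view = TextView(model, "alert_status")
model.set("alert_status", "DISABLED")   # model change reaches the view
text_after_model_change = view.text
view.user_input("ACTIVE")               # view change reaches the model
```

The model never references the view class directly, which is why controller and model logic can be exercised without a browser.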


Figure 4: Values in View and Local Data are in Binding

In our application, the information in view comes from ‘News’, ‘Alert’ and ‘Data Center’ data collections (as we mentioned in Model Definition).

Each marker in the view is bound to the data center data collection. They are synchronized: markers in the view are automatically updated when the data changes. We apply one-way data binding to the map markers, so the user can only read the data. The user can send a request to the controller to filter the markers.

All the alert data is kept in one JavaScript array. Each alert has a unique index, and each data center has a list of alert indexes. The alert part uses two-way data binding, so the user can read and modify alert data in the view, and the user’s actions are logged. For now, we only allow users to disable alerts. Disabled alerts are not shown in the normal view; the user can read their comments and re-activate them.
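The index-based layout makes it cheap to derive a marker's color from its data center's alert list. The sketch below assumes that disabled alerts do not turn a marker red; the report only says that disabled alerts are hidden from the normal view, so that rule, the function name `marker_style`, the "ACTIVE" status value, and the sample data are assumptions.

```python
from typing import Dict, List, Sequence


def marker_style(alert_ids: Sequence[int], alerts: List[dict]) -> str:
    """Return "red" if any referenced alert is not disabled, otherwise "green"."""
    visible = [i for i in alert_ids if alerts[i]["alert_status"] != "DISABLED"]
    return "red" if visible else "green"


# One shared list of alerts; each data center refers to alerts by index.
alerts = [
    {"component_name": "server-01", "alert_status": "ACTIVE"},
    {"component_name": "server-02", "alert_status": "DISABLED"},
]
datacenter_alerts: Dict[str, List[int]] = {
    "NJ2": [0, 1],
    "NYS": [1],
    "DC-3": [],  # hypothetical data center with no alerts
}

styles = {name: marker_style(ids, alerts) for name, ids in datacenter_alerts.items()}
```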


3.3 Synchronized with Server

Figure 5: Data in Client is Sync with Server

On the server side, each kind of data takes the form of a JSON object representing the model data on the server. It is a straightforward serialization of a row from the AMPS server. [24]

On the client side, the application calls the server every 3 seconds, using Ajax, to poll the JSON data (such as alerts). If the data has changed in the server database, the local data in the client is updated as well. If the user makes any change in the client, each of these actions is sent to the server and updates the corresponding row in the AMPS server.
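One polling cycle amounts to reconciling the local copy with the rows returned by the server. The sketch below shows that reconciliation step only, under the assumption that each poll returns a full snapshot keyed by alert ID; the report does not specify the payload shape, and `apply_server_snapshot` is a hypothetical name.

```python
from typing import Dict, List


def apply_server_snapshot(
    local: Dict[int, dict],
    snapshot: Dict[int, dict],
) -> List[int]:
    """Update local in place to match snapshot; return the IDs that changed."""
    changed: List[int] = []
    # Add new rows and overwrite rows whose content differs.
    for alert_id, row in snapshot.items():
        if local.get(alert_id) != row:
            local[alert_id] = dict(row)
            changed.append(alert_id)
    # Drop rows that no longer exist on the server.
    for alert_id in list(local):
        if alert_id not in snapshot:
            del local[alert_id]
            changed.append(alert_id)
    return changed


local_alerts = {
    1: {"alert_status": "ACTIVE"},
    2: {"alert_status": "ACTIVE"},
    4: {"alert_status": "ACTIVE"},
}
server_rows = {
    1: {"alert_status": "ACTIVE"},
    2: {"alert_status": "DISABLED"},
    3: {"alert_status": "ACTIVE"},
}

first_pass = apply_server_snapshot(local_alerts, server_rows)
second_pass = apply_server_snapshot(local_alerts, server_rows)
```

Returning the changed IDs lets the caller refresh only the affected markers and grid rows, and a poll that finds no differences leaves the page untouched.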

3.4 CSS Style Design

We use the Bootstrap library as our CSS template. Bootstrap is a free website style template. Originally created by a designer and a developer at Twitter, Bootstrap has become one of the most popular front-end frameworks and open source projects in the world.


Chapter 4 Design and Implementation

4.1 Architecture Design

Figure 6: Architecture Design for the application (Back-End above, Front-End below)

Before we started coding, we made a detailed architecture design. We listed all the database topics and services we needed for our project, and we showed the relationships between them. The figure above shows the detailed architecture design for our web application. The part above the black line is the back-end server design and the part below it is the front-end client design.

4.1.1 Back-end Server Design

We wrote all of our back-end server code in Python on the Quartz platform. The back-end server is made up of two main services and several database topics in AMPS. The main job of the back-end server is to poll alert message data from the Valkyrie API and other QZ tables, store it in the corresponding topics in the AMPS database, and publish the required data fields to the front-end client. The most important part of the back-end server is the two services.

1. Valkyrie Alert Service

We built this service to call the Valkyrie API every 5 seconds to get the latest alert messages for the 25 data centers, and then publish the data to the “alerts_in_valkyrie” topic with these fields: component_type, component_name, latest_status, metric_value, metric_name, created_time and last_update_time.

Figure 7: details for alert_in_valkyrie topic in AMPS

2. Alert Engine and Decorator

Once we get data from Valkyrie, we want to decorate these alert data with the hardware summary information of their corresponding data centers. We use the “Alert Engine and Decorator” service to combine these two pieces of data and store them in the “reported_alerts” topic. In order to get the additional fields, this service needs to query the “Hardware Diagram Cache” QZ table, which stores hardware information for data centers. QZ tables are similar to the tables we use in SQL, which store data with specific fields. Finally, we pass the alert information stored in the “reported_alerts” topic to the corresponding data centers (markers) on the front-end world map.


Figure 8: illustrating how to get reported_alerts topic
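As a rough illustration of the data flow described above, the following Python sketch polls a source of alerts, keeps only the published fields, and attaches hardware information before publishing. It is a sketch under assumptions, not the project's actual code: `fetch_alerts`, `publish`, the `hardware` key, and the lookup of hardware information by `component_name` are hypothetical stand-ins for the Valkyrie API, the AMPS client, and the “Hardware Diagram Cache” QZ table, and the two services are collapsed into one function for brevity.

```python
import time

# Field names listed for the alerts topic in the text above.
ALERT_FIELDS = (
    "component_type", "component_name", "latest_status", "metric_value",
    "metric_name", "created_time", "last_update_time",
)


def select_fields(raw_alert):
    """Keep only the published fields; missing ones become None."""
    return {name: raw_alert.get(name) for name in ALERT_FIELDS}


def decorate(alert, hardware_cache):
    """Attach hardware info looked up by component name (assumed key)."""
    decorated = dict(alert)
    decorated["hardware"] = hardware_cache.get(alert["component_name"])
    return decorated


def poll_once(fetch_alerts, publish, hardware_cache):
    """Run one polling cycle and return the decorated alerts."""
    decorated_alerts = []
    for raw in fetch_alerts():
        alert = select_fields(raw)
        publish("alerts_in_valkyrie", alert)
        decorated = decorate(alert, hardware_cache)
        publish("reported_alerts", decorated)
        decorated_alerts.append(decorated)
    return decorated_alerts


def run(fetch_alerts, publish, hardware_cache, interval_seconds=5, cycles=None):
    """Poll repeatedly; `cycles=None` means poll until interrupted."""
    done = 0
    while cycles is None or done < cycles:
        poll_once(fetch_alerts, publish, hardware_cache)
        done += 1
        if cycles is None or done < cycles:
            time.sleep(interval_seconds)
```

Passing the fetch and publish functions as arguments keeps the sketch independent of any particular messaging client and makes one cycle easy to test with in-memory fakes.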

4.1.2 Front-end Client Design

We implemented the front-end client in HTML and JavaScript. The front-end client collects the latest alert data from the “reported_alerts” topic, and it interacts with two other back-end topics, “disabled_alerts” and “news”. Besides showing alert data on the map or in the grid, the front-end client provides two actions for users, “Disable Alerts” and “Add News”.

1. Disable Alerts

On the front-end client grid view, users can disable a specific alert by clicking the “Disable” button at the end of each alert. After a user performs the “Disable” action, the front-end client wraps the disabled messages into JSON objects, sends them back to the back-end server, and publishes them into the “disabled_alerts” topic. At the same time, we change the corresponding alerts’ statuses to “DISABLED” in the “reported_alerts” topic. For every disabled alert, users are asked to set a disabled period. The default start time is the current time and the default end time is 15 minutes from now. When this period expires, the alert is automatically reactivated.


Figure 9: details of disabled_alerts topic in AMPS

2. Add News

Similarly to the “Disable” action, users can add news to alerts, and the front-end client publishes those data into the “news” topic in the AMPS database; however, this is a two-way action. Once the front-end client writes news into the “news” topic, it also subscribes to data from the “news” topic at the same time and displays the news information on the news scrolling bar.

Figure 10: details of news topic in AMPS
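The disabled-period rule from the “Disable Alerts” action (default start at the current time, default end 15 minutes later, automatic reactivation on expiry) can be sketched in Python as follows. The record layout and the names `make_disable_record` and `effective_status` are hypothetical; only the 15-minute default and the “DISABLED” and “ENABLED” status values come from this report.

```python
from datetime import datetime, timedelta

DEFAULT_DISABLE_PERIOD = timedelta(minutes=15)


def make_disable_record(alert_id, now, start=None, end=None):
    """Build a disable record, filling in the default period."""
    start = now if start is None else start
    end = now + DEFAULT_DISABLE_PERIOD if end is None else end
    if end <= start:
        raise ValueError("end time must be after start time")
    return {"alert_id": alert_id, "start_time": start, "end_time": end}


def effective_status(record, now):
    """Return "DISABLED" inside the period, otherwise "ENABLED"."""
    if record["start_time"] <= now < record["end_time"]:
        return "DISABLED"
    return "ENABLED"


# A record created at noon expires at 12:15.
noon = datetime(2013, 11, 5, 12, 0)
record = make_disable_record("alert-1", noon)
status_during = effective_status(record, noon + timedelta(minutes=14))
status_after = effective_status(record, noon + timedelta(minutes=15))
```

Deriving the status from the stored period, rather than storing a flag that must be flipped later, is one way to get automatic reactivation without a separate timer; whether the project did it this way is not stated in the report.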

4.2 User Interface Design

4.2.1 Front Page Layout

For the application’s user interface design, we use the Bootstrap CSS template for the whole page layout. We have four main containers on the page. The top container displays the user login information, the title of our application, and a set of world clocks. The middle container is made up of the world map and the scrolling panel for PAPA tickets. Of the bottom two containers, one is for the news scrolling bar, and the other is for the filter panel and the grid. On the map, red markers stand for data centers that have alerts; green markers are data centers without alerts.

Figure 11: front page layout


When users mouse over a marker on the map, a label right above the marker shows the data center’s name and the number of alerts reported in that data center.

Figure 12: show the result of mouse over markers
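The marker rule (red for data centers with alerts, green otherwise, plus a hover label with the name and alert count) amounts to a simple aggregation. The Python sketch below illustrates it; the `data_center` key on each alert and the label format are assumptions made for the example, and the project's front-end implemented this in JavaScript.

```python
from collections import Counter


def marker_states(data_center_names, alerts):
    """Map each data center name to its marker color and hover label."""
    counts = Counter(alert["data_center"] for alert in alerts)
    states = {}
    for name in data_center_names:
        count = counts[name]
        states[name] = {
            "color": "red" if count > 0 else "green",
            "label": "{}: {} alert(s)".format(name, count),
        }
    return states


# Two alerts in NJ2, none in NYS.
states = marker_states(
    ["NJ2", "NYS"],
    [{"data_center": "NJ2"}, {"data_center": "NJ2"}],
)
```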

In order to keep the front page as clean as possible, we hide the filter and grid panels when users first open the page. Only when users click a red marker is the grid with detailed information about alerts in that data center shown under the map. When users click “Filtering”, the filter panel slides down.

4.2.2 Grid

We use the PQGrid JQuery open library to implement the grid view for alert details. By clicking one of the red markers on the map, users can view details of the alerts for that data center in a grid. For every activated alert in the grid, users can disable that alert by clicking the “Disable” action button. After clicking the “Disable” button, users need to set a disabled period for that alert in a pop-up window. During that period, the alert’s status is changed to “DISABLED” and stored in the “disabled_alerts” topic in AMPS. When the period expires, the alert is automatically reactivated.

Figure 13: results of clicking a marker

Above the grid, users have four choices: “Remove all filtering”, “All Messages”, “All Disabled Messages” and “Hide Messages”.

1. “Remove all filtering” removes all the filter selections and shows all messages. The “All Messages” option, on the other hand, displays all activated alerts stored in the “reported_alerts” topic.

Figure 14: results by clicking "All Messages" option


Figure 15: pop-up window for "Disable" action

2. “All Disabled Messages” shows all alerts in the “disabled_alerts” topic.

Figure 16: results of clicking "All Disabled Messages" option

In this grid, users can manually reactivate disabled alerts by clicking the “Reactivate” action button. After the “Reactivate” button is clicked, the status of that alert is changed to “ENABLED” in the “reported_alerts” topic and the alert is shown again in that data center. We do not remove reactivated alerts from the “disabled_alerts” topic, because we want to let users track the history of disabled alerts in a future feature.


3. “Hide Messages” would hide the whole message container.

4.2.3 Filter Panel

We put the filter button right below the news scrolling bar. When users click the “Filtering” button, the filter panel slides down. Users can filter on the alerts’ “Component Type” and “Data Center” properties and on the data centers’ “LOB” and “Region” properties. We use the “Chosen” JQuery plugin to implement the multi-selection feature for each option.

Figure 17: filtering panel layout

The filter panel also contains a “History” button which allows users to do historical searches. After a click on the “History” button, two fields, “Start Time” and “End Time”, show up. By picking a period of time, users can see all alerts reported during that period.

Figure 18: historical search panel layout

The results of filtered and historical searches are shown in the grid and are reflected on the map at the same time.
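The combination of multi-selection filters and a historical time range can be sketched as a single predicate applied to every alert. The Python example below is illustrative only: the dictionary keys `lob`, `region`, and `data_center`, and the choice of `created_time` as the timestamp compared against “Start Time” and “End Time”, are assumptions rather than details confirmed by this report.

```python
from datetime import datetime

FILTER_FIELDS = ("component_type", "data_center", "lob", "region")


def matches(alert, selections, start=None, end=None):
    """Check one alert against the selected values and the time range.

    An empty or missing selection for a field means "no restriction".
    """
    for field in FILTER_FIELDS:
        allowed = selections.get(field)
        if allowed and alert.get(field) not in allowed:
            return False
    created = alert["created_time"]
    if start is not None and created < start:
        return False
    if end is not None and created > end:
        return False
    return True


def filter_alerts(alerts, selections, start=None, end=None):
    """Return the alerts that pass every filter."""
    return [a for a in alerts if matches(a, selections, start, end)]


sample_alerts = [
    {"component_type": "server", "data_center": "NJ2", "lob": "GMT",
     "region": "AMRS", "created_time": datetime(2013, 11, 5, 9, 0)},
    {"component_type": "database", "data_center": "NYS", "lob": "GMT",
     "region": "AMRS", "created_time": datetime(2013, 11, 6, 9, 0)},
]
only_servers = filter_alerts(sample_alerts, {"component_type": {"server"}})
on_nov_6 = filter_alerts(sample_alerts, {}, start=datetime(2013, 11, 6))
```

Because the same filtered list can feed both the grid and the marker counts, the map and the grid stay consistent with each other.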


Figure 19: example output for filtering and historical search


Chapter 5 Conclusions

Our project offers a better way for users to track alerts reported in the 25 data centers globally. In addition, our project helps the QZ Dev team add to the few web based monitoring applications on the Quartz Platform. Since the major purpose of our project is to show the latest and correct data to users in real time, we spent the first two weeks building and testing the back-end server to make sure that it could continuously and reliably poll data from data centers and display them to users in real time. For the front-end client, the map view and the grid view of the data give users more flexibility to analyze details of the data. Since our project is a web based interactive application, we needed to emphasize user-friendly design principles such as well-formatted content, fast load times, browser consistency, and easy-to-use functions. It took us about three weeks to discuss solutions for Internet Explorer compatibility issues, redesign the page layout and demo the new achievements to other QZ Dev team members. We had to make sure they would feel comfortable with the format design and find it easy to use all the functionalities of our application.

This project also gave us the opportunity to learn a couple of major lessons about website design. The most important one is that not all browsers perform the same with the same third-party libraries. In particular, Internet Explorer 9 (IE 9) has some severe compatibility issues with some modern JQuery open source libraries. At the beginning of our work, we were not aware of this issue, and we tested the application only on the Firefox browser. However, since more than half of the employees in the bank have only the IE 9 browser on their computers, it was necessary for us to guarantee the application could also perform well on IE 9. That was the biggest challenge for our project in the last two weeks. When we moved to IE 9, we were shocked by how many problems arose. For example, no hands were on the clocks, the news scrolling bar did not move anymore, and, what’s worse, the map was not displayed either.

These issues required us to carefully read the documentation of all the open libraries we used in our project and to find a way to deal with the compatibility issues on IE 9. After researching JQuery compatibility issues for IE 9, we found the easiest common solution for all open libraries: introducing the “” tag for IE 9. This experience taught us that whenever we work on web applications we need to care about browser consistency from the start.

Overall, our project was successful in realizing its goals. Its back-end server robustly keeps polling data from the Valkyrie API and displaying them on the front-end client’s map and grid views. All functional features are easy to use. In addition, we kept all of our code well documented and compliant with coding standard principles and the Google JavaScript coding style. The bank’s quality assurance group should find our code easy to understand, and programmers can build more functionality upon our project for future use.


Chapter 6 Future Recommendations

Currently, compared with the development of other applications on the Quartz Platform, web based monitoring applications are at the very beginning. The QZ Dev team is trying to help build web based monitoring applications, and our team wants our project to contribute more to this process. So far, our project provides the basic features of a monitoring application, such as collecting, tracking and displaying data. Building on the functionalities of our application, we think our project could gain more data analysis features in the future.

6.1 Data Analysis in Data Centers

For now, when users mouse over a marker on the map, it shows a label with the data center’s name and the number of alerts reported in that data center. We could extend the labels with more detailed information about the data centers. Every data center has around 60 servers, and these servers are divided into different families based on their functionalities. When users mouse over a marker, the label could show the hardware summary for that data center. For example, when users mouse over the “One Bryant Park DC” data center, the label would show the following information:

One Bryant Park DC
30 Sandra Servers
40 Bob Servers
2000 Hugs Cores

Furthermore, we could give our application more data analysis features. In particular, we could generate different kinds of reports for data centers. For instance, we could produce annual reports for data centers, using line charts to show the total number of alerts reported every month in a given year. This feature gives managers a way to view the trend of reported alerts over given years. Here is an example of a 2013 report for the NJ2 data center.


[Line chart: “NJ2 Data Center, 2013 Report”; horizontal axis: months Jan. to Dec.; vertical axis: 0 to 30]

Figure 20: example of 2013 year report of NJ2 Data Center
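The data behind such an annual line chart is just a count of alerts per month. A minimal Python sketch is given below, assuming each stored alert carries a `data_center` name and a `created_time` timestamp (the first is a hypothetical key, the second is one of the alert fields listed earlier); the sample alerts are made up for the example and are not real data.

```python
from collections import Counter
from datetime import datetime


def monthly_alert_counts(alerts, data_center, year):
    """Return 12 counts (Jan. to Dec.) for one data center and year."""
    counts = Counter(
        alert["created_time"].month
        for alert in alerts
        if alert["data_center"] == data_center
        and alert["created_time"].year == year
    )
    return [counts.get(month, 0) for month in range(1, 13)]


# Made-up sample: two NJ2 alerts in January, one in March.
sample = [
    {"data_center": "NJ2", "created_time": datetime(2013, 1, 3)},
    {"data_center": "NJ2", "created_time": datetime(2013, 1, 20)},
    {"data_center": "NJ2", "created_time": datetime(2013, 3, 7)},
    {"data_center": "NYS", "created_time": datetime(2013, 1, 9)},
]
nj2_2013 = monthly_alert_counts(sample, "NJ2", 2013)
```

The resulting list can be passed directly to a charting library as the series for the line chart.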

6.2 Data Analysis on Servers

From the company’s internal databases, we can get a lot of detailed information about each server in the data centers; however, so far, we use these detailed properties only for filtering alerts. To make better use of this information, we could generate monthly reports for servers. We can use bar charts to show the different kinds of alerts reported by each server in any given data center. By doing this, engineers could track the frequencies of the different kinds of alerts reported on a given server and pay more attention to the metric behind the kind of alert reported most frequently. Here is an example of a February report for NYS data center Server 1.


[Bar chart: “NYS Server-1 Feb. Report”; categories: CPU, Memory, Diskspace, Networks; vertical axis: 0 to 8]

Figure 21: example of Feb. report for NYS Data Center Server 1
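For the per-server monthly report, the bars are counts of alerts grouped by kind. The short Python sketch below assumes that `component_name` identifies the server and `metric_name` identifies the kind of alert, which this report does not state explicitly; the sample alerts are invented for illustration.

```python
from collections import Counter
from datetime import datetime


def alert_kind_counts(alerts, server_name, year, month):
    """Count one server's alerts in a given month, grouped by metric."""
    return Counter(
        alert["metric_name"]
        for alert in alerts
        if alert["component_name"] == server_name
        and alert["created_time"].year == year
        and alert["created_time"].month == month
    )


# Invented sample alerts for a server called "server-1".
february = [
    {"component_name": "server-1", "metric_name": "CPU",
     "created_time": datetime(2013, 2, 4)},
    {"component_name": "server-1", "metric_name": "CPU",
     "created_time": datetime(2013, 2, 11)},
    {"component_name": "server-1", "metric_name": "Memory",
     "created_time": datetime(2013, 2, 12)},
]
kinds = alert_kind_counts(february, "server-1", 2013, 2)
most_frequent = kinds.most_common(1)[0]
```

`most_common(1)` gives the kind of alert an engineer would look at first.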

6.3 Alert Correlation and Prediction

Based on these alert data, we could try to build an algorithm for predicting risks, for example by applying supervised learning to past alerts to predict the likelihood of certain issues.

We observed that some hardware issues might cause software issues. For instance, suppose one server in the NJ2 data center goes down for some reason. After a couple of minutes, we receive an alert message about a server timeout call. We would guess that some service was disconnected from the server that just went down. We think that every alert might give us useful information that can help predict new issues. That was a simple case. Some cases of prediction may be more complicated, and data mining may be able to find them.
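A very simple starting point for this kind of correlation, well short of supervised learning, is to count how often one kind of alert is followed by a different kind in the same data center within a short time window. The Python sketch below illustrates that counting step only; the 5-minute window, the `data_center` key, and the metric names in the sample are assumptions chosen for the example.

```python
from collections import Counter
from datetime import datetime, timedelta


def follow_up_counts(alerts, window=timedelta(minutes=5)):
    """Count (earlier kind, later kind) pairs seen within `window`.

    Only pairs from the same data center with different metric names
    are counted.
    """
    ordered = sorted(alerts, key=lambda alert: alert["created_time"])
    pairs = Counter()
    for index, first in enumerate(ordered):
        for later in ordered[index + 1:]:
            if later["created_time"] - first["created_time"] > window:
                break  # alerts are sorted, so no later one can match
            if (later["data_center"] == first["data_center"]
                    and later["metric_name"] != first["metric_name"]):
                pairs[(first["metric_name"], later["metric_name"])] += 1
    return pairs


# Invented sample: a server goes down, a timeout follows 3 minutes later.
history = [
    {"data_center": "NJ2", "metric_name": "server_down",
     "created_time": datetime(2013, 11, 5, 12, 0)},
    {"data_center": "NJ2", "metric_name": "timeout",
     "created_time": datetime(2013, 11, 5, 12, 3)},
    {"data_center": "NJ2", "metric_name": "timeout",
     "created_time": datetime(2013, 11, 5, 13, 0)},
]
pairs = follow_up_counts(history)
```

Pairs with high counts over a long history would be candidates for the cause-and-effect relationships described above, and could serve as input features for a later prediction model.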


References:

1. "Educational Materials." Python Introduction. Google, 13 Dec. 2012. Web. 13 Jan. 2014.
2. "Business Information Materials." Bank of America Investor Relations Overview. Bank of America, Web. 13 Jan. 2014.
3. "Bank of America Completes Merrill Lynch Purchase." Bank of America Completes Merrill Lynch Purchase. Reuters, 1 Jan. 2009. Web. 13 Jan. 2014.
4. Recordon, David. "Tornado: Facebook's Real-Time Web Framework for Python - Facebook Developers." Tornado: Facebook's Real-Time Web Framework for Python - Facebook Developers. Facebook, 10 Sept. 2009. Web. 13 Jan. 2014.
5. Crockford, Douglas. "JavaScript: The World's Most Misunderstood Programming Language." JavaScript: The World's Most Misunderstood Programming Language. Crockford, 2001. Web. 13 Jan. 2014.
6. "JSON." Mozilla Developer Network. Mozilla Developer Network, 10 Sept. 2013. Web. 13 Jan. 2014.
7. "What Is JQuery?" JQuery. JQuery, n.d. Web. 13 Jan. 2014.
8. "Unittest - Unit Testing Framework." Unit Testing Framework. Python, n.d. Web. 13 Jan. 2014.
9. "APE Ajax Push Engine." APE (Ajax Push Engine). APE Project, n.d. Web. 13 Jan. 2014.
10. Davis, Thomas. "Why Do You Need Backbone.js?" Why Would You Use Backbone.js? Backbone Tutorials, n.d. Web. 13 Jan. 2014.
11. Roach, Christopher. "Node.js Step by Step: Introduction." Node.js Step by Step. NetTuts, 8 Apr. 2011. Web. 13 Jan. 2014.
12. Chandler, Jane. "Introduction to Object-Oriented Databases." Object-Oriented Databases. Object-Oriented Databases Management Systems, 13 Sept. 1998. Web. 13 Feb. 2013.
13. "Pair Programming." Agile Development Methods. VersionOne, n.d. Web. 13 Jan. 2014.
14. Shen, Linda. "Bank of America Rebrands Offices in New York, London Towers." Bloomberg News. Bloomberg, 3 Dec. 2009. Web. 12 Jan. 2014.
15. "About Merrill Lynch." About Merrill Lynch. Bank of America Merrill Lynch, n.d. Web. 13 Jan. 2014.
16. "Insights, Ideas and Extensive Services." About Bank of America, Institutional Investors. N.p., n.d. Web. 13 Jan. 2014.
17. "Global Markets Technology (GMT) – Bank of America – Americas." Bank of America Careers. N.p., n.d. Web. 15 Jan. 2014.
18. Ramirez, Ken. "Build Ajax-based Web Sites with PHP." Build Ajax-based Web Sites with PHP. IBM DeveloperWorks, 2 Sept. 2008. Web. 12 Jan. 2014.
19. Grinberg, Miguel. "The Flask Mega-Tutorial, Part XV: Ajax." Miguelgrinberg.com. Miguelgrinberg.com, 20 Feb. 2013. Web. 12 Jan. 2014.
20. Galli, Marcio, Roger Soares, and Ian Oeschger. "Inner-browsing Extending the Browser Navigation Paradigm." Mozilla Developer Network. Mozilla Developer Network, 16 May 2003. Web. 12 Jan. 2014.
21. Ongaro, Luca. "Easy Two-Way Data Binding in JavaScript." Easy Two-Way Data Binding in JavaScript. Lucaongaro.eu, 2 Dec. 2012. Web. 12 Jan. 2014.
22. Henderson, Andrew. "Two-Way Data Bindings in Backbone." Andrew Henderson RSS. Andrewhenderson.me, 8 Jan. 2013. Web. 12 Jan. 2014.
23. Millard, Peter, Peter Saint-Andre, and Ralph Meijer. "XEP-0060: Publish-Subscribe." XEP-0060: Publish-Subscribe. XMPP Organization, 12 July 2012. Web. 12 Jan. 2014.
24. Jína, Vojta. "Quick Introduction to AngularJS." Quick Introduction to AngularJS. Google, 23 Sept. 2011. Web. 12 Jan. 2014.
25. Hagemeister, Philipp. "Storing and Loading Data with JSON." Bite Sized Python Tips. Bite Sized Python Tips, 8 Aug. 2013. Web. 12 Jan. 2014.


Appendix

Schedule

Oct. 28: Computer set-up, orientation and project description
Oct. 29
Oct. 30 to Oct. 31: Setting up new topics in AMPS; creating the Valkyrie alert service and metric data service, then integrating them with the AMPS messaging system
Nov. 1 to Nov. 2
Nov. 3 to Nov. 4: Using SOW to subscribe data from Alert_In_Valkyrie to 'Alert Engine and Decorator' and combining data from Hardware Diagrams Cache, then writing data into the Reported_Alerts topic in AMPS
Nov. 5 to Nov. 7: Integration testing; creating a web server for subscribing data from the Reported_Alerts topic in AMPS (including features to edit alerts, such as active, disable); publishing disabled alerts to the Disabled_Alerts topic; publishing/subscribing news to/from the NEWS topic
Nov. 8 to Nov. 10
Nov. 11: Veterans Day
Nov. 12: Start building the web front-end: 1. World clocks
Nov. 13: 2. Creating the JS components such as error message, data edit panel
Nov. 14: 3. News section
Nov. 15: 4. Map section and filters
Nov. 16 to Nov. 17
Nov. 18: Continue the map section and filters
Nov. 19 to Nov. 20: Add viewing historical alerts ability for each data center. Implement viewing a historical point in time to see what alerting feature for each center.
Nov. 21
Nov. 22: Improve front-end UI design
Nov. 23 to Nov. 24
Nov. 25: Work on feature auto creating PAPA ticket when acknowledging alert
Nov. 26
Nov. 27: Thanksgiving
Nov. 28 to Dec. 1
Dec. 2: Add new features based on new requirements if needed
Dec. 3 to Dec. 8
Dec. 9: Complete all the remaining code. Ready to publish into production.
Dec. 10 to Dec. 15
Dec. 16: Optimize and document our code; keep testing it
Dec. 17 to Dec. 22
