white paper

1

ENEA POLYHEDRA®: Polyhedra IMDB and High-Availability

Nigel Day Polyhedra Product Manager

Integrated data management services are an integral part of any advanced network equipment software platform. Where large application-level data storage is required, say for an HLR (Home Location Register) or accounting/billing system, the data management solution may come in the form of a traditional disk-based relational database management system (RDBMS).

Many network infrastructure systems, It is the complete system. So these There might also be safety issues. however, require a separate, higher terms, when applied to software, refer The data handling needs of HA em­- performance, in-memory database to the functions and features that the bedded systems often have the following solution to manage the dynamic control, software must supply in order to facili- requirements and charact­eristics: status and configuration information tate the design of an HA system. n Data changes must be fast, but required for continuous “five nines” Embedded systems requiring high quickly accessible. service delivery. availability - say ‘five 9s’ (99.999%) or n Data must be preserved in the face Enea Polyhedra provides a full- better of continuous operation – are of a complete system failure, and featured, fault-tolerant SQL RDBMS that typically configured with redundant must continue to be available in is optimized for network control plane cards, power supplies, etc. The individual the face of a partial failure. applications with demanding availability, components are not designed to be n The data structures, queries and performance and memory requirements. fault free. Rather, the system as a whole types of changes are relatively static Equally important, Polyhedra works is designed in a redundant fashion in a deployed systems, but will hand-in-glove with the Enea Accelerator that allows it to survive the failure of change frequently during application platform, a carrier class software plat- individual components. Redundancy, in development, and will usually form for building highly differentiated addition to improving availability, also change when deploying new network equipment that delivers high- enhances flexibility, enabling components versions of the application. quality multimedia services over IP to be swapped out and upgraded with­ n Field upgrades should not require networks with 100% service availability. out upsetting the running system. scheduled downtime, and the Featuring best-in-class networking, While live upgradeability is not period of vulnerability to component supervision, fault management, and needed in some HA systems (for exam­ failure should be minimized. database management, this flexible, ple, systems on board aircraft, which are n The system should minimize the reusable platform enables equipment turned off when the flight is over or the amount of application coding makers to build scalable, upgradeable, aircraft is serviced), it is crucial for appli- needed to handle the HA configur­ “five nines” equipment that greatly reduces cations like telecom where the systems ation. CAPEX and OPEX for service providers. must operate continuously for years on n The application should be scalable, end. For example, in telecom infrastruc- avoiding performance killers such High-Availability? ture equipment, a field upgrade should as polling. High-availability and fault tolerance not disrupt calls in progress. Similarly, are phrases that are often used rather in an industrial control systems, loosely, especially when it comes to stopping a production line for an software. In practice, it is not software upgrade could cost tens or hundreds of that is fault tolerant or highly available. thousands of dollars of lost production.

Enea is a global software and services company focused on solutions for communication-driven products. With 40 years of experience Enea is a world leader in the development of software platforms with extreme demands on high-availability and performance. Enea’s expertise in real-time operating systems and high availability shortens development cycles, brings down product costs and increases system reliability. Enea’s vertical solutions cover telecom handsets and infrastructure, medtech, industrial automation, automotive and mil/aero. Enea has 750 employees and is listed on Nasdaq OMX Nordic Exchange Stockholm AB. For more information please visit enea.com or contact us at [email protected]. www.enea.com white paper

2

Atomicity, Consistency, ACID compliance. Most are also SQL When only a few systems are being Isolation and Durability based, and support on-the-fly schema installed, a software-only solution is For transactional data stores, reference changes. However, they may lack suit­ possible, with a third machine used for is often made to the term ‘ACID’, a able HA characteristics, or be too slow arbitration. More complex configura- mnemonic for Atomicity, Consistency, for use in embedded systems. tions, however, usually require a two- Isolation and Durability. Atomicity means board solution, with a hardware mech­ that each transaction either fails (leaving Polyhedra IMDB and HA anism to assist in determining which the data in the pre-transactional state) The Polyhedra in-memory database sys- board should be master. Techniques or fully commits - no halfway houses. tem is designed for use in embedded vary, but the aim is to provide a reliable Consistency says that each trans- systems, where state and configuration way for one board to know whether it action leaves the data store in a clean information has to be kept readily acces- should be master, and to avoid having state, with all integrity conditions pre- ­sible, rapidly alterable… and safe. To both boards operate in master mode, served. If not, the transaction is aborted, ensure fast access, Polyhedra keeps the even when the data connection between leaving the data in its pre-transactional data in RAM, but backs it up using a va- the two is not working. state. Isolation says that transactions are riety of configurable mechanisms, inclu- Standard Polyhedra release kits independent, each completing before ding snapshots, transaction journals, and provide sample applications for both the next one is started (though judicious even hot standby where appropriate. two-board and three-board solutions. use of locking by the system can avoid To support fault-tolerant configura- For a 2-board solution, however, the the need for the implementation to be tions, Polyhedra provides a hot standby code would need to be tailored to quite so restrictive). mechanism for fault-tolerant pairs. On interact with the customer-supplied Durability means that if the data start up, the master server gives the hot hardware-based arbitrator. store says the transaction is complete, standby server a complete copy of the The Polyhedra client-server protocol then the data is in a ‘safe’ state and will database (including the schema and features a client-controlled heartbeat survive a subsequent system failure any CL code attached to the database). mechanism, which allows communi- (within the capabilities of the under- Once it is fully running, the master cation failures to be detected quickly lying hardware and environment). In sends copies of each transaction record if the underlying transport is unable to practice, the Durability requirement as soon as it commits to the transaction. provide timely information about the can significantly slow down a system This keeps the standby up to date so failure of the server. For example, when (flushing files to disk can take some that it can take over at a moment’s notice. TCP/IP is used, a process failure would time, for example), so most data stores The master-standby pair runs under normally be detected by the operating allow the level of durability to be tuned, the control of an external arbitration system, which would then close any balancing it against overall system re- mechanism, which controls when fail­ open ports. In the case of a complete sponsiveness when operating normally. over occurs, and decides which is the board failure, however, it is impractical All true relational database systems master when both come up together to wait for the TCP/IP stack at the other are transactional, and offer a degree of after a cold boot. end to detect the problem, as this could take up to a half an hour. The Polyhedra client libraries allow users to set up fault-tolerant connect­ ions to a list of servers. If the connection to the master server of a fault-tolerant pair fails (whether reported by the transport or by the heartbeat mechan­ ism described above), the library will automatically and repeatedly try all the servers in the list in turn. The delay between attempts is configurable, as is the maximum retry count. Once the connection is re-established to the current master, the library will re-establish all open active queries, and work out the ‘delta’ between the previous state and the current one. Consequently,­

Enea Polyhedra IMDB and HA. white paper

3

only minimal code changes are needed needs to be done is to update the soft- Thus, to upgrade the control cards in a to convert a Polyhedra client applica- ware on the management computers. rack with a new version of the Polyhedra tion to one that works with an HA data- This allows the clients to set the new server code, the user simply: base configuration. The change could configuration columns and monitor the n Tells the database server to produce be as simple as changing the code that new status columns. a snapshot to a named file. This opens the initial connection. If needed, The heterogeneity built into will cause the current state of the clients can monitor the changing status Polyhedra, together with its high level database to be recorded on both of the database connection and fine- of inter-version interoperability, means systems (in case of problems with tune the configuration parameters. that new line cards do not have to use the upgrade). In practice, this is Polyhedra allows client applications to the same operating system or proces- unlikely, but the belt-and-braces control transaction ‘durability’. Normally, sor type as the cards in the database approach should be ingrained in the server acknowledges a successful server. It doesn’t even have to use the those developing and installing HA transaction as soon as it has been fully same release version of the Polyhedra systems! committed in-store on the master. But software. n Stop the standby card, install the clients can opt to have the response More complex changes come new version of the Polyhedra server delayed until the transaction has been in two categories: changes to the software, and restart the card. The fully logged to disk (where enabled) application software on the control database server will be told it was and applied to (and acknowledged by) cards or line cards; and changes to the standby and connect to the master the standby (if it is running). underlying software used by the appli­ to obtain a new copy of the data- cation software. Let us consider these base. If the new software used a Field Upgrades two cases separately, starting with the different format for the saved data- In continuously-running systems, simplest one. base, it would automatically create there is often the need to change the a local copy in the new format. The software or data structures on the fly, Upgrading the Polyhedra standby would then be able to with no downtime. A simple case in Database Software receive transaction logs from the point would be the addition of a new From time to time, it may be necessary master, which would be applied to type of line card to a telecom rack in a to upgrade the Polyhedra code on a the local database to keep it in step basestation.­ The new card may be very system to a later version. For example, with the master. similar to an existing card, but need a new version of the application soft- n The standby card would now additional configuration information, ware may want to take advantage of a be promoted to master, causing or want to report additional status Polyhedra feature that was not present the server on the old master to information. The simplest way of hand- in the currently used version. Or, the relinquish control. The old master ling this would be for the software on new version of the application may can then be upgraded in turn, and the new line card to check the central incorporate a bug fix that was adversely started up (as a standby, naturally). database on startup. If the columns it affecting the application. wants are not present in the tables it In both cases, the inter-version If it is necessary to upgrade the client uses, it can just create them using the interoperability principles enforced by software on the line cards, this can be ‘add’ form of the ‘alter table command’. Polyhedra greatly simplify the upgrades done either before or after upgrading For example: process. These principles are: the server code, depending on which n Old (already-built) clients can is more convenient. In many cases, alter table linecard add connect to the new server. though, the client software will only (config2 integer default 49, status2 integer) n New clients can connect to older need upgrading in the rare event that servers and use the features they the application is affected by a bug in Another approach would be to just provide. For example, when using the Polyhedra client libraries. add the columns without checking first ODBC, client applications can There is no need for all the system if they exist. No harm will be done if interrogate the database to find components to use the same version they already exist, and the type can be out the release information and of Polyhedra. In fact, the client-server checked when performing queries. to determine which features are protocol is common to all members Once these changes have been implemented. of the Polyhedra DBMS family, so they made, the new line card can start its n New servers can read saved data- share the library code. Thus, there application as normal. Provided other base files written by the previous would be no need to upgrade the code clients have not used the ‘select *’ form release of Polyhedra. on the line cards when upgrading the of query when inspecting the tables, n New servers can act as standby to a their active queries and prepared master running the previous release statements will not be invalidated by of Polyhedra. this change, and they will see no inter- ruption in the database service. All that white paper

4

database on the control cards – say, the ground up to address the data cularly true for distributed applications from from Polyhedra32 to Polyhedra64. management needs of HA network that run on heterogeneous hardware applications, combining the benefits of and operating system platforms. Upgrading the standard SQL database technology with All instances of Polyhedra databases Application Code a powerful set of features that enhance can communicate transparently with If the new application code needs a reliability, performance, memory effici­ each other, regardless of version. This new version of Polyhedra, it may be ency, and data quality. These features not only simplifies the design of distri- simpler to do this first, as a separate make Polyhedra ideal for a broad range buted applications that require multiple stage, using the procedure outlined of applications, from fault-tolerant RDBMSs, but also provides a seamless above. Once the system is up and run- distributed infrastructure systems, to upgrade path that enables equipment ning the correct version of Polyhedra mobile devices with tight memory and makers to take advantage of the latest on both the master and standby, the power constraints. database technology without having database schema can be updated. The Enea recognizes the need for inte- to make substantial changes to their changes will automatically be applied grated data management services as existing application code. on both the master and the standby. part of an overall solution for advanced Data consistency and integrity can Polyhedra allows schema changes software platform development. That’s easily create problems over time if not to be grouped together into a single why Polyhedra works hand-in-glove addressed properly with a suitable data transaction using the ‘alter schema’ with Enea’s operating system and management solution. The days when command. The changes are checked middleware products. And that’s why a database was regarded as overhead for correctness before execution, and Polyhedra is an integral part of Enea’s for embedded systems are long gone. if there are any problems (such as an system-level Accelerator platforms. With Polyhedra’s unique combination incompatibility of column names, or The Polyhedra family provides an of reliability, tunable high performance lack of room for temporary structures optimized high-reliability SQL database and other advanced features, data when transforming the database), the solution for a broad range of embedded management need no longer be a database will revert to its earlier state. applications, whether the priority is maintenance burden. high availability or low memory. Using a Data Management, Part commercial, standards-based database of Enea’s Total Solution like Polyhedra simplifies modeling and Data management is a key factor in design, and requires fewer specialists, the design of any embedded network all of which speed development and system. Polyhedra was developed from reduce development cost. This is parti-

Enea®, Enea OSE®, Netbricks®, Polyhedra® and Zealcore® are registered trademarks of Enea AB and its subsidiaries. Enea OSE®ck, Enea OSE® Epsilon, Enea® Element, Enea® Optima, Enea® Optima Log Analyzer, Enea® Black Box Recorder, Enea® LINX, Enea® Accelerator, Polyhedra® Flashlite, Enea“ dSPEED Platform, Enea® System Manager, Accelerating Network Convergence™, Device Software Optimized™ and Embedded for Leaders™ are unregistered trademarks of Enea AB or its subsidiaries. Any other company, product or service names mentioned above are the registered or unregistered trademarks of their respective owner. WP48 012009. © Enea AB 2009.