Carrier Grade Server Features in the Linux Kernel: Towards Linux-based Telecom Platforms

Ibrahim Haddad Ericsson Research [email protected]

Abstract

Traditionally, communications and data service networks were built on proprietary platforms that had to meet very specific availability, reliability, performance, and service response time requirements. Today, communication service providers are challenged to meet their needs cost-effectively for new architectures, new services, and increased bandwidth, with highly available, scalable, secure, and reliable systems that have predictable performance and that are easy to maintain and upgrade. This paper presents the technological trend of migrating from proprietary to open platforms based on software and hardware building blocks. It also focuses on the ongoing work by the Carrier Grade Linux working group at the Open Source Development Labs, examines the CGL architecture and the requirements from the latest specification release, and presents some of the needed kernel features that are not currently supported by Linux, such as a Linux cluster communication mechanism, a low-level kernel mechanism for improved reliability and soft-realtime performance, support for multi-FIB, and support for additional security mechanisms.

1 Open platforms

The demand for rich media and enhanced communication services is rapidly leading to significant changes in the communication industry, such as the convergence of data and voice technologies. The transition to packet-based, converged, multi-service IP networks requires a carrier grade infrastructure based on interoperable hardware and software building blocks, management middleware, and applications, implemented with standard interfaces. The communication industry is witnessing a technology trend moving away from proprietary systems toward open and standardized systems that are built using modular and flexible hardware and software (operating system and middleware) commercial off-the-shelf components. The trend is to proceed forward delivering next generation and multimedia communication services, using open standard carrier grade platforms. This trend is motivated by the expectation that open platforms are going to reduce the cost and risk of developing and delivering rich communications services. Also, they will enable faster time to market and ensure portability and interoperability between various components from different providers.

One frequently asked question is: "How can we meet tomorrow’s requirements using existing infrastructures and technologies?" Proprietary platforms are closed systems, expensive to develop, and often lack support for current and upcoming standards. Using such closed platforms to meet tomorrow’s requirements for new architectures and services is almost impossible. A uniform open software environment with the characteristics demanded by telecom applications, combined with commercial off-the-shelf software and hardware components, is a necessary part of these new architectures.

Figure 1: From Proprietary to Open Solutions

The following key industry consortia are defining hardware and software high availability specifications that are directly related to telecom platforms:

1. The PCI Industrial Computer Manufacturers Group [1] (PICMG) defines standards for high availability (HA) hardware.

2. The Open Source Development Labs [2] (OSDL) Carrier Grade Linux [3] (CGL) working group was established in January 2002 with the goal of enhancing the Linux operating system, to achieve an Open Source platform that is highly available, secure, scalable and easily maintained, suitable for carrier grade systems.

3. The Service Availability Forum [4] (SA Forum) defines the interfaces of HA middleware, focusing on APIs for hardware platform management and for application failover in the application API. SA compliant middleware will provide services to an application that needs to be HA in a portable way.

The operating system is a core component in such architectures. In the remainder of this paper, we focus on CGL, its architecture, and its specifications.

2 The term Carrier Grade

In this paper, we refer to the term Carrier Grade on many occasions. Carrier grade is a term for public network telecommunications products that require a reliability percentage up to 5 or 6 nines of uptime.

• 5 nines refers to 99.999% of uptime per year (i.e. roughly 5 minutes of downtime per year). This level of availability is usually associated with Carrier Grade servers.

• 6 nines refers to 99.9999% of uptime per year (i.e. roughly 30 seconds of downtime per year). This level of availability is usually associated with Carrier Grade switches.

3 Linux versus proprietary operating systems

This section briefly describes the motivating reasons in favor of using Linux on Carrier Grade systems, versus continuing with proprietary operating systems. These motivations include:

• Cost: Linux is available free of charge in the form of a downloadable package from the Internet.

• Source code availability: With Linux, you gain full access to the source code, allowing you to tailor the kernel to your needs.

• Open development process (Figure 2): The development process of the kernel is open to anyone to participate and contribute. The process is based on the concept of "release early, release often."

• Peer review and testing resources: With access to the source code, people using a wide variety of platform, operating system, and compiler combinations can compile, link, and run the code on their systems to test for portability, compatibility and bugs.

• Vendor independent: With Linux, you no longer have to be locked into a specific vendor. Linux is supported on multiple platforms.

• High innovation rate: New features are usually implemented on Linux before they are available on commercial or proprietary systems.

Figure 2: Open development process of the Linux kernel

Other contributing factors include Linux’s support for a broad range of processors and peripherals, commercial support availability, high performance networking, and the proven record of being a stable and reliable server platform.

4 Carrier Grade Linux

The Linux kernel is missing several features that are needed in a telecom environment. It is not adapted to meet telecom requirements in various areas such as reliability, security, and scalability. To help the advancement of Linux in the telecom space, OSDL established the CGL working group. The group specifies and helps implement an Open Source platform targeted for the communication industry that is highly available, secure, scalable and easily maintained. The CGL working group is composed of several members from network equipment providers, system integrators, platform providers, and Linux distributors. They all contribute to the requirement definition of Carrier Grade Linux, help Open Source projects to meet these requirements, and in some cases start new Open Source projects. Many of the CGL member companies have contributed pieces of technology to Open Source in order to make the Linux kernel a more viable option for telecom platforms. For instance, the Open Systems Lab [5] from Ericsson Research has contributed three key technologies: the Transparent IPC [6], the Asynchronous Event Mechanism [7], and the Distributed Security Infrastructure [8]. There are already Linux distributions, MontaVista [9] for instance, that provide a CGL distribution based on the CGL requirement definition. Many companies are also either deploying CGL, or at least evaluating and experimenting with it.

Consequently, CGL activities are giving much momentum to Linux in the telecom space, allowing it to be a viable alternative to proprietary operating systems. Member companies of CGL are releasing code to Open Source and are making some of their proprietary technologies open, which leads to going forward from closed platforms to open platforms that use CGL Linux.

5 Target CGL applications

The CGL Working Group has identified three main categories of application areas into which they expect the majority of applications implemented on CGL platforms to fall. These application areas are gateways, signaling, and management servers.

• Gateways are bridges between two different technologies or administration domains. For example, a media gateway performs the critical function of converting voice messages from a native telecommunications time-division-multiplexed network to an Internet protocol packet-switched network. A gateway processes a large number of small messages received and transmitted over a large number of physical interfaces. Gateways perform in a timely manner very close to hard real-time. They are implemented on dedicated platforms with replicated (rather than clustered) systems used for redundancy.

• Signaling servers handle call control, session control, and radio resource control. A signaling server handles the routing and maintains the status of calls over the network. It takes the request of user agents who want to connect to other user agents and routes it to the appropriate signaling. Signaling servers require soft real time response capabilities of less than 80 milliseconds, and may manage tens of thousands of simultaneous connections. A signaling server application is context switch and memory intensive due to requirements for quick switching and a capacity to manage large numbers of connections.

• Management servers handle traditional network management operations, as well as service and customer management. These servers provide services such as a Home Location Register and Visitor Location Register (for wireless networks) or customer information (such as personal preferences including features the customer is authorized to use). Typically, management applications are data and communication intensive. Their response time requirements are less stringent by several orders of magnitude, compared to those of signaling and gateway applications.

6 Overview of the CGL working group

The CGL working group has the vision that next-generation and multimedia communication services can be delivered using Linux based open standards platforms for carrier grade infrastructure equipment. To achieve this vision, the working group has set up a strategy to define the requirements and architecture for the Carrier Grade Linux platform, develop a roadmap for the platform, and promote the development of a stable platform upon which commercial components and services can be deployed.

In the course of achieving this strategy, the OSDL CGL working group is creating the requirement definitions and identifying existing Open Source projects that support the roadmap to implement the required components and interfaces of the platform. When an Open Source project does not exist to support a certain requirement, OSDL CGL launches (or supports the launch of) new Open Source projects to implement missing components and interfaces of the platform.

The CGL working group consists of three distinct sub-groups that work together. These sub-groups are: specification, proof-of-concept, and validation. Responsibilities of each sub-group are as follows:

1. Specifications: The specifications sub-group is responsible for defining a set of requirements that lead to enhancements in the Linux kernel that are useful for carrier grade implementations and applications. The group collects, categorizes, and prioritizes the requirements from participants to allow reasonable work to proceed on implementations. The group also interacts with other standard defining bodies, open source communities, developers and distributions to ensure that the requirements identify useful enhancements in such a way that they can be adopted into the base Linux kernel.

2. Proof-of-Concept: This sub-group generates documents covering the design, features, and technology relevant to CGL. It drives the implementation and integration of core Carrier Grade enhancements to Linux as identified and prioritized by the requirement document. The group is also responsible for ensuring the integrated enhancements pass the CGL validation test suite and for establishing and leading an open source umbrella project to coordinate implementation and integration activities for CGL enhancements.

3. Validation: This sub-group defines standard test environments for developing validation suites. It is responsible for coordinating the development of validation suites, to ensure that all of the CGL requirements are covered. This group is also responsible for the development of an Open Source project CGL validation suite.

7 CGL architecture

Figure 3 presents the scope of the CGL Working Group, which covers two areas:

Figure 3: CGL architecture and scope

• Carrier Grade Linux: Various requirements such as availability and scalability are related to the CGL enhancements to the operating system. Enhancements may also be made to hardware interfaces, interfaces to the user level or application code, and interfaces to development and debugging tools. In some cases, to access the kernel services, user level library changes will be needed.

• Software Development Tools: These tools will include debuggers and analyzers.

On October 9, 2003, OSDL announced the availability of the OSDL Carrier Grade Linux Requirements Definition, Version 2.0 (CGL 2.0). This latest requirement definition for next-generation carrier grade Linux offers major advances in security, high availability, and clustering.

8 CGL requirements

The requirement definition document of CGL version 2.0 introduced new and enhanced features to support Linux as a carrier grade platform. The CGL requirement definition divides the requirements into the main categories described briefly below:

8.1 Clustering

These requirements support the use of multiple carrier server systems to provide higher levels of service availability through redundant resources and recovery capabilities, and to provide a horizontally scaled environment supporting increased throughput.

8.2 Security

The security requirements are aimed at maintaining a certain level of security while not endangering the goals of high availability, performance, and scalability. The requirements support the use of additional security mechanisms to protect the systems against attacks from both the Internet and intranets, and provide special mechanisms at kernel level to be used by telecom applications.

8.3 Standards

CGL specifies standards that are required for compliance for carrier grade server systems. Examples of these standards include:

• POSIX Timer Interface

• POSIX Signal Interface

• POSIX Message Queue Interface

• POSIX Semaphore Interface

• IPv6 RFCs compliance

• IPsecv6 RFCs compliance

• MIPv6 RFCs compliance

• SNMP support

• POSIX threads

8.4 Platform

OSDL CGL specifies requirements that support interactions with the hardware platforms making up carrier server systems. Platform capabilities are not tied to a particular vendor’s implementation. Examples of the platform requirements include:

• Hot insert: supports hot-swap insertion of hardware components

• Hot remove: supports hot-swap removal of hardware components

• Remote boot support: supports remote booting functionality

• Boot cycle detection: supports detecting reboot cycles due to recurring failures. If the system experiences a problem that causes it to reboot repeatedly, the system will go offline. This is to prevent additional difficulties from occurring as a result of the repeated reboots

• Diskless systems: Provide support for diskless systems loading their kernel/application over the network

• Support remote booting across common LAN and WAN communication media

8.5 Availability

The availability requirements support heightened availability of carrier server systems, such as by improving the robustness of software components or by supporting recovery from failure of hardware or software. Examples of these requirements include:

• RAID 1: support for RAID 1 offers mirroring to provide duplicate sets of all data on separate hard disks

• Watchdog timer interface: support for watchdog timers to perform certain specified operations when timeouts occur

• Support for disk and volume management: to allow grouping of disks into volumes

• Ethernet link aggregation and link failover: support bonding of multiple NICs for bandwidth aggregation and provide automatic failover of IP addresses from one interface to another

• Support for application heartbeat monitor: monitor application availability and functionality.

8.6 Serviceability

The serviceability requirements support servicing and managing hardware and software on carrier server systems. These are a wide-ranging set of requirements that, put together, help support the availability of applications and the operating system. Examples of these requirements include:

• Support for producing and storing kernel dumps

• Support for dynamic debug to allow the dynamic insertion of software instrumentation into a running system in the kernel or applications

• Support for platform signal handler enabling infrastructures to allow interrupts generated by hardware errors to be logged using the event logging mechanism

• Support for remote access to event log information

8.7 Performance

OSDL CGL specifies the requirements that support performance levels necessary for the environments expected to be encountered by carrier server systems. Examples of these requirements include:

• Support for application (pre) loading.

• Support for soft real time performance through configuring the scheduler to provide soft real time support with latency of 10 ms.

• Support kernel preemption.

• RAID 0 support: RAID Level 0 provides "disk striping" support to enhance performance for request-rate-intensive or transfer-rate-intensive environments

8.8 Scalability

These requirements support vertical and horizontal scaling of carrier server systems such as the addition of hardware resources to result in acceptable increases in capacity.

8.9 Tools

The tools requirements provide capabilities to facilitate diagnosis. Examples of these requirements include:

• Support the usage of a kernel debugger.

• Support for kernel dump analysis.

• Support for debugging multi-threaded programs

9 CGL 3.0

The work on the next version of the OSDL CGL requirements, version 3.0, started in January 2004 with a focus on advanced requirement areas such as manageability, serviceability, tools, security, standards, performance, hardware, clustering and availability. With the success of CGL’s first two requirement documents, the OSDL CGL working group anticipates that the third version will be quite beneficial to the Carrier Grade ecosystem. Official release of the CGL requirement document Version 3.0 is expected in October 2004.

10 CGL implementations

There are several enhancements to the Linux kernel that are required by the communication industry to help adopt Linux on their carrier grade platforms and support telecom applications. These enhancements (Figure 4) fall into the following categories: availability, security, serviceability, performance, scalability, reliability, standards, and clustering.

Figure 4: CGL enhancements areas

The implementations providing these enhancements are Open Source projects and are planned for integration with the Linux kernel when the implementations are mature and ready for merging with the kernel code. In some cases, bringing a project to maturity takes a considerable amount of time before its integration into the Linux kernel can be requested. Nevertheless, some of the enhancements are targeted for inclusion in kernel version 2.7. Other enhancements will follow in later kernel releases. Meanwhile, all enhancements, in the form of packages, kernel modules and patches, are available from their respective project web sites. The CGL 2.0 requirements are in line with the Linux development community. The purpose of this project is to form a catalyst to capture common requirements from end-users for a CGL distribution. With a common set of requirements from the major Network Equipment Providers, developers can be much more productive and efficient within development projects. Many individuals within the CGL initiative are also active participants and contributors in the Open Source development community.

11 Examples of needed features in the Linux Kernel

In this section, we provide some examples of features and mechanisms that are missing from the Linux kernel and are necessary in a telecom environment.

11.1 Transparent Inter-Process and Inter-Processor Communication Protocol for Linux Clusters

Today’s telecommunication environments are increasingly adopting clustered servers to gain benefits in performance, availability, and scalability. The resulting benefits of a cluster are greater or more cost-efficient than what a single server can provide. Furthermore, the telecommunications industry’s interest in clustering originates from the fact that clusters address carrier grade characteristics such as guaranteed service availability, reliability and scaled performance, using cost-effective hardware and software. Without being absolute about these requirements, they can be divided into three categories: short failure detection and failure recovery, guaranteed availability of service, and short response times. The most widely adopted clustering technique is the use of multiple interconnected loosely coupled nodes to create a single highly available system.

One missing feature from the Linux kernel in this area is a reliable, efficient, and transparent inter-process and inter-processor communication protocol. Transparent Inter Process Communication (TIPC) [6] is a suitable Open Source implementation that fills this gap and provides an efficient cluster communication protocol. It leverages the particular conditions present within loosely coupled clusters. It runs on Linux and is provided as a portable source code package implementing a loadable kernel module.

TIPC is unique because there seems to be no other protocol providing a comparable combination of versatility and performance. It includes some original innovations such as functional addressing, topology subscription services, and the reactive connection concept. Other important TIPC features include full location transparency, support for lightweight connections, reliable multicast, a signaling link protocol, and more.

TIPC should be regarded as a useful toolbox for anyone wanting to develop or use Carrier Grade or Highly Available Linux clusters. It provides the necessary infrastructure for cluster, network and software management functionality, as well as good support for designing site-independent, scalable, distributed, high-availability and high-performance applications.

It is also worthwhile to mention that the ForCES (Forwarding and Control Element WG) [11] working group within IETF has agreed that it must be possible to carry their router internal protocol (the ForCES protocol) over different types of transport protocols. There is consensus that TCP is the protocol to be used when ForCES messages are transported over the Internet, while TIPC is the protocol to be used in closed environments (LANs), where special characteristics such as high performance and multicast support are desirable. Other protocols may also be added as options.

TIPC is a contribution from Ericsson [5] to the Open Source community. TIPC is licensed under a dual GPL and BSD license, and will be announced to LKML in May 2004.

11.2 IPv4, IPv6, MIPv6 forwarding tables fast access and compact memory with multiple FIB support

Routers are core elements of modern telecom networks. They propagate and direct billions of data packets from their source to their destination using air transport devices or through high-speed links. They must operate as fast as the medium in order to deliver the best quality of service and have a negligible effect on communications. To give some figures, it is common for routers to manage between 10,000 and 500,000 routes. In these situations, good performance is achievable by handling around 2000 routes/sec. The actual implementation of the IP stack in Linux works fine for home or small business routers. However, with the high expectations of telecom operators and the new capabilities of telecom hardware, it appears barely possible to use Linux as an efficient forwarding and routing element of a high-end router for a large network (core/border/access router) or a high-end server with routing capabilities.

One problem with the networking stack in Linux is the lack of support for multiple forwarding information bases (multi-FIB) with overlapping interface IP addresses, and the lack of appropriate interfaces for addressing a FIB. Another problem with the current implementation is the limited scalability of the routing table.

The solution to these problems is to provide support for multi-FIB with overlapping IP addresses. As such, we can have independent networks in the same Linux box, on different VLANs or different physical interfaces. For example, we can have two HTTP servers serving two different networks with potentially the same IP address. One HTTP server will serve the network/FIB 10, and the other HTTP server will serve the network/FIB 20. The advantage gained is to have one Linux box serving two different customers using the same IP address. ISPs adopt this approach by providing services for multiple customers sharing the same server (server partitioning), instead of using a server per customer.

The way to achieve this is to have an ID (an identifier that identifies the customer or user of the service) to completely separate the routing tables in memory. Two approaches exist. The first is to have separate routing tables: each routing table is looked up by its ID, and within that table the lookup is done on the prefix. The second approach is to have one table, and the lookup is done on the combined key = prefix + ID.

A different kind of problem arises when we are not able to predict access time, because of the chaining in the hash table of the routing cache (and FIB). This problem is of particular interest in an environment that requires predictable performance.

Another aspect of the problem is that the route cache and the routing table are not kept synchronized most of the time (path MTU, just to name one). The route cache flush is executed regularly; therefore, any updates on the cache are lost. For example, if you have a routing cache flush, you have to rebuild every route that you are currently talking to, by going through every route in the hash/trie table and rebuilding the information. First, you have to look up in the routing cache, and if you have a miss, then you need to go to the hash/trie table. This process is very slow and not predictable, since the hash/trie table is implemented with linked lists and there is high potential for collisions when a large number of routes are present. This design is suitable for a home PC with a few routes, but it is not scalable for a large server.

To support the various routing requirements of server nodes operating in high performance and mission critical environments, Linux should support the following:

• Implementation of multi-FIB using a tree (radix, patricia, etc.): It is very important to have predictable performance in insert/delete/lookup from 10,000 to 500,000 routes. In addition, it is favourable to have the same data structure for both IPv4 and IPv6.

• Socket and interfaces for addressing multi-FIB.

• Multi-FIB support for neighbors (arp).

Providing these implementations in Linux will affect a large part of net/core, net/ipv4 and net/ipv6; these subsystems (mostly network layer) will need to be re-written. Other areas will have minimal impact at the source code level, mostly at the transport layer (socket, TCP, UDP, RAW, NAT, IPIP, IGMP, etc).

As for the availability of an Open Source project that can provide these functionalities, there exists a project called "Linux Virtual Routing and Forwarding" [12]. This project aims to implement a flexible and scalable mechanism for providing multiple routing instances within the Linux kernel. The project has some potential in providing the needed functionalities; however, no progress has been made since 2002 and the project seems to be inactive.

11.3 Run-time Authenticity Verification for Binaries

Linux has generally been considered immune to the spread of viruses, backdoors and Trojan programs on the Internet. However, with the increasing popularity of Linux as a desktop platform, the risk of seeing viruses or Trojans developed for this platform is rapidly growing. To alleviate this problem, the system should prevent the execution of untrusted software at run time. One solution is to digitally sign the trusted binaries and have the system check the digital signature of binaries before running them. Untrusted (not signed) binaries are therefore denied execution. This can improve the security of the system by preventing a wide range of malicious binaries like viruses, worms, Trojan programs and backdoors from running on the system.

DigSig [13] is a Linux kernel module that checks the signature of a binary before running it. It inserts digital signatures inside the ELF binary and verifies this signature before loading the binary. It is based on the Linux Security Module hooks (LSM has been integrated with the Linux kernel since 2.5.X and higher).

Typically, in this approach, vendors do not sign binaries; the control of the system remains with the local administrator. The responsible administrator signs all binaries they trust with their private key. Therefore, DigSig guarantees two things: (1) if you signed a binary, nobody other than yourself can modify that binary without being detected; (2) nobody can run a binary which is not signed or badly signed.

There have already been several initiatives in this domain, such as Tripwire [14], BSign [15], and Cryptomark [16], but we believe the DigSig project is the first to be both easily accessible to all (available on SourceForge, under the GPL license) and to operate at kernel level at run time. Operating at run time is very important for Carrier Grade Linux, as this takes into account the high availability aspects of the system.

The DigSig approach has been to use existing solutions like GnuPG [17] and BSign (a Debian package) rather than reinventing the wheel. However, in order to reduce the overhead in the kernel, the DigSig project only took the minimum code necessary from GnuPG. This greatly reduced the amount of code imported into the kernel (only 1/10 of the original GnuPG 1.2.2 source code has been imported into the kernel module).

DigSig is a contribution from Ericsson [5] to the Open Source community. It was released under the GPL license and it is available from [8].

DigSig has been announced on LKML [18] but it is not yet integrated in the Linux kernel.

11.4 Efficient Low-Level Asynchronous Event Mechanism

Carrier grade systems must provide 5-nines availability, a maximum of five minutes per year of downtime, which includes hardware, operating system, software upgrade and maintenance. Operating systems for such systems must ensure that they can deliver a high response rate with minimum downtime. In addition, carrier-grade systems must take into account characteristics such as scalability, high availability and performance. In carrier grade systems, thousands of requests must be handled concurrently without affecting the overall system’s performance, even under extremely high loads. Subscribers can expect some latency time when issuing a request, but they are not willing to accept an unbounded response time. Such transactions are not handled instantaneously for many reasons, and it can take some milliseconds or seconds to reply. Waiting for an answer reduces an application’s ability to handle other transactions.

Many different solutions have been envisaged to improve Linux’s capabilities in this area using different types of software organization, such as multithreaded architectures, implementing efficient POSIX interfaces, or improving the scalability of existing kernel routines.

One possible solution that is adequate for carrier grade servers is the Asynchronous Event Mechanism (AEM), which provides asynchronous execution of processes in the Linux kernel. AEM implements native support for asynchronous events in the Linux kernel and aims to bring carrier-grade characteristics to Linux in the areas of scalability and soft real-time responsiveness. In addition, AEM offers an event-based development framework, scalability, flexibility, and extensibility.

Ericsson [5] released AEM to Open Source in February 2003 under the GPL license. AEM was announced on the Linux Kernel Mailing List (LKML) [20], and received feedback that resulted in some changes to the design and implementation. AEM is not yet integrated with the Linux kernel.

12 Conclusion

There are many challenges accompanying the migration from proprietary to open platforms. The main challenge remains the availability of the various kernel features and mechanisms needed for telecom platforms and the integration of these features in the Linux kernel.

References

[1] PCI Industrial Computer Manufacturers Group, http://www.picmg.org

[2] Open Source Development Labs, http://www.osdl.org

[3] Carrier Grade Linux, http://osdl.org/lab_activities

[4] Service Availability Forum, http://www.saforum.org

[5] Open System Lab, http://www.linux.ericsson.ca

[6] Transparent IPC, http://tipc.sf.net

[7] Asynchronous Event Mechanism, http://aem.sf.net

[8] Distributed Security Infrastructure, http://disec.sf.net

[9] MontaVista Carrier Grade Edition, http://www.mvista.com/cge

[10] Make Clustering Easy with TIPC, LinuxWorld Magazine, April 2004

[11] IETF ForCES working group, http://www.sstanamera.com/~forces

[12] Linux Virtual Routing and Forwarding project, http://linux-vrf.sf.net

[13] Stop Malicious Code Execution at Kernel Level, LinuxWorld Magazine, January 2004

[14] Tripwire, http://www.tripwire.com

[15] Bsign, http://packages.debian.org/bsign

[16] Cryptomark, http://immunix.org/cryptomark.html

[17] GnuPG, http://www.gnupg.org

[18] DigSig announcement on LKML, http://lwn.net/Articles/51007

[19] An Event Mechanism for Linux, Linux Journal, July 2003

[20] AEM announcement on LKML, http://lwn.net/Articles/45633

Acknowledgments

Thank you to Ludovic Beliveau, Mathieu Giguere, Magnus Karlson, Jon Maloy, Mats Naslund, Makan Pourzandi, and Frederic Rossi, for their valuable contributions and reviews.