white paper


ENEA Multicore: High performance packet processing enabled with a hybrid SMP/AMP OS technology

Patrik Strömblad, Chief System Architect OSE, Enea

Multicore technology introduces a great challenge to the entire software industry, and it causes a significant disruption in the embedded device market, since it forces much of the existing software to be redesigned according to new, not yet established, principles.

Abstract
Moore's law still holds, but processor vendors are rapidly turning to using the additional transistors to create more cores on the same die instead of increasing the frequency, since this not only gives more chip performance but also decreases the power consumption (watts/MIPS).

This paper starts by describing the widely accepted multiprocessing software design models and some of their benefits and drawbacks. After that, a few simple packet processing use cases are described, aiming to illuminate the pain points of a strict AMP multiprocessing approach. Finally, we introduce Enea's OSE® Multicore Edition (Enea OSE MCE) and show how this new hybrid SMP/AMP RTOS technology can provide a homogeneous, scalable and portable application framework for high-speed packet processing applications within the data and connectivity layers, while at the same time being a feature-rich SMP RTOS for networking control protocols. Enea OSE MCE defines a very low-level multicore processor abstraction model which gives high portability and at the same time provides a low-overhead device programming model. The migration of Enea OSE applications designed for distribution scalability to multicore devices has proven to be a fairly straightforward task, as it preserves the existing software architecture investments.

Introduction
The embedded software industry is facing the challenge to really start to think and design in parallel, in order to fully make use of the new multicore devices. Up until now, we have been able to gain higher performance solely by upgrading the processor, but there is no "free lunch" any more. This raises a great challenge also on the OS side, as the requirement to parallelize applies within the OS as well. The goal for a multicore RTOS must be to provide excellent support for the application to maximize its performance and scale with more cores, and at the same time maintain standard RTOS real-time characteristics such as determinism and interrupt latency. The RTOS must provide a simple, flexible and uniform programming environment that offers capabilities such as load balancing, boot loading, file systems and networking.

The ongoing convergence between the telecom and datacom domains is creating extreme requirements on the nodes in the networks, which must be able to handle very high bandwidths of primarily IP packets. In most cases, network traffic processing can be divided into two major categories, slow path processing and fast path processing:

- Slow path processing: This involves network protocol control signaling to configure and establish the data paths, and it also involves packets that terminate on this node in higher protocol layers that use the socket interface.
- Fast path processing: This involves IP forwarding and other kinds of intermediate processing of the actual data packets, such as NAT, security, etc. It is critical to minimize overhead in fast path processing, since the I/O bandwidth is dominant and the CPU cycle budget for software execution is limited.

IP forwarded packets may still be part of the slow path if they, for any reason, need a larger CPU processing budget than the fast path allows. The categorization into slow and fast path is thus a way to categorize the cost of processing rather than the functionality.
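As a concrete illustration, the split can be expressed as a simple ingress decision. The C sketch below is purely hypothetical: pkt_t, fast_path_lookup(), hw_tx() and punt_to_slow_path() are invented names, not part of any Enea or vendor API.

```c
/* Hypothetical sketch of the ingress decision; all names are invented. */
typedef struct { unsigned char *data; unsigned len; } pkt_t;

extern int  fast_path_lookup(const pkt_t *p); /* e.g. FIB/NAT table hit?  */
extern void hw_tx(pkt_t *p);                  /* forward on the fast path */
extern void punt_to_slow_path(pkt_t *p);      /* control, local delivery, */
                                              /* or too expensive         */
void on_ingress(pkt_t *p)
{
    /* Fast path: cheap, bounded per-packet work. Anything that misses
     * the fast path tables, terminates locally, or needs a larger cycle
     * budget is punted to the slow path. */
    if (fast_path_lookup(p))
        hw_tx(p);
    else
        punt_to_slow_path(p);
}
```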



The slow path is not subject for discussion in this paper, since TCP/IP stacks are generally not multicore-adapted. The fast path category is the area with the highest bandwidth increase, and therefore many seek ways to offload these features from the legacy stacks and create new design models for them on multicore devices. The next section describes the common multiprocessing variants that are subject for evaluation in discussions around multicore software solutions.

Various Multicore Processing Models in SW system design
There are fundamentally three multiprocessing models used to describe system designs on multicore devices: the SMP, the AMP and the bare metal model. These models have a number of benefits and drawbacks, which are briefly described below.

The SMP Model
The Symmetric Multi Processing model is the model used in the design of several enterprise OS:es, such as Linux, as well as in the design of their application domains. In such OS:es and their applications, data is to a large extent shared, and a number of different locking mechanisms and atomic operations are frequently used for synchronization. The SMP model is easy to manage from a SW management perspective, since it creates a good abstraction where the OS facilitates best-effort CPU load balancing, and it has been used in the server and desktop application domains for a very long time. Such enterprise OS:es provide a best-effort execution platform for these kinds of CPU-intensive applications.

The high degree of hardware resource abstraction is in many cases an advantage, but the abstraction layer introduces substantial overhead when the application becomes as I/O intensive as applications tend to be in embedded packet forwarding/routing. The shared memory programming model, on the application level as well as inside the Linux kernel, is based on mutable shared objects in memory, and this is an inherent bottleneck in multicore systems that will inevitably lead to poor scaling to many cores. This, together with the fact that the complex SMP implementation of kernels in many cases has the drawback of not being deterministic, makes the classic SMP OS:es less suitable as RTOS:es for high-speed packet processing in the long run.

The AMP Model
The Asymmetric Multi Processing model uses an approach where each core runs its own complete, isolated operating system or application framework (an alternative term for a more lightweight RTOS). This leaves the door open to choosing different RTOS:es on different cores. The advantage of an AMP system is that high performance is achieved locally and that it scales well to several cores. Using the AMP model together with virtualization techniques is also a way to reuse legacy single-core designs.

The drawback with the AMP model is that the OS provides no support to the distributed application for load balancing or OS resource management. The configuration, load and startup of such an application are also inherently complex to design.

"Bare Metal" Model
The "bare metal" model is a single-threaded execution model where the available APIs are processor-vendor specific. Since no regular RTOS exists for these threads, a common approach is to run a regular operating system, like Linux, on one or several cores, and let the rest of the cores execute a "bare-metal" thread and use an application framework that creates an abstraction of the hardware layer. The advantage here is of course that maximal performance and minimal overhead are achieved when running without an RTOS; the disadvantage is that the software becomes hardware-specific, which will force a redesign of the applications whenever the hardware is upgraded. Also, the parts of the system running without an RTOS or application framework take on the role of a black box, i.e. there is no observability except through the external interfaces. Support for tracing, debugging or post mortem dumps is not available, and the amount of code "out there" must therefore be kept to a minimum. Over time, though, the need for more functionality in these parts will most likely grow, which in turn increases the need for better device abstraction.
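A minimal sketch of such a bare-metal core loop is shown below; the vendor_* calls stand in for processor-vendor specific primitives (hardware packet queues and the like) and are hypothetical, not a real SDK API.

```c
/* Minimal sketch of a "bare metal" core loop, under the assumption of
 * a vendor library with poll/process/transmit primitives. */
extern void *vendor_rx_poll(void);       /* next packet or NULL          */
extern void  vendor_process(void *pkt);  /* HW-assisted packet work      */
extern void  vendor_tx(void *pkt);       /* enqueue for transmission     */

void bare_metal_entry(void)
{
    for (;;) {                           /* never returns; no OS beneath */
        void *pkt = vendor_rx_poll();
        if (pkt) {
            vendor_process(pkt);
            vendor_tx(pkt);
        }
        /* No RTOS here: no tracing, no debugger hooks, no post mortem
         * dumps - the black box problem described above. */
    }
}
```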
Case study: Packet processing use cases using an AMP system design
The AMP model is not new to the embedded world. It has been deployed in many systems where DSPs or network processors are put together as multicore clusters dedicated to performing a specific task.


Figure 1. AMP processing models: functional pipelining of packet processing functions through cores, and parallel, symmetric processing on cores.


Some examples of use cases applicable to the AMP model:

- A set of Digital Signal Processors (DSPs) that perform processing, like transcoding, on flows of packets.
- A set of network processors that perform IP fast path processing.
- A set of line cards (device processor boards) with general-purpose processors performing layer 2/3 packet processing like IP forwarding, tunneling, encryption/decryption, etc., but also slow path processing.
- A set of legacy line cards running in a virtualized AMP environment as guest environments.

A packet processing application may be designed as a flow or functional pipeline where each core has a dedicated task, or it may be designed to dispatch the processing of packets equally on all cores. To minimize overhead and maximize the CPU cycles spent on packet processing in such cases, an execution model called the "run-to-completion" model is often used. This is basically a never-ending loop that fully processes each packet and sends it out before taking on the next. The two dispatch styles are:

- Functional pipelining: Packet flows are allocated to cores according to some algorithm controlled by the slow path control layer. Each step in the pipeline performs a dedicated task and passes the packet on to the next step.
- Parallel symmetric processing, or "run-to-completion": All cores are symmetric and can handle all tasks within a flow entirely within one core's thread of execution.
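As an illustration, a pipeline stage could look like the sketch below, with hypothetical queue primitives (q_pop/q_push) rather than any real AMP framework API. In the run-to-completion style, the same loop would instead apply every processing step to the packet before emitting it.

```c
/* Sketch of a functional pipeline stage; queue_t, q_pop and q_push are
 * invented primitives. Each core runs one stage, so per-flow state stays
 * local and packets traverse the pipeline's cores in order. */
typedef struct queue queue_t;
extern void *q_pop(queue_t *q);             /* blocking or busy-poll pop */
extern void  q_push(queue_t *q, void *pkt); /* hand-off to the next core */

void pipeline_stage(queue_t *in, queue_t *out, void (*step)(void *pkt))
{
    for (;;) {
        void *pkt = q_pop(in);   /* packets of a flow arrive in order    */
        step(pkt);               /* this core's dedicated task, e.g.     */
                                 /* decryption or tunnel termination     */
        q_push(out, pkt);        /* pass on; state never leaves the core */
    }
}
```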
In many cases these flows have state (e.g. multimedia transcoding, IP tunneling), which means that all packets belonging to a flow must be "pinned" and passed in order through the same sequence of cores, due to the location of the state. In other cases, such as IP forwarding or NAT, the flow can be quite stateless, and thus most packets can be processed entirely and individually by any core. A high-performance packet processing system design may include both pipeline flows and "run-to-completion" processing.

In a multicore AMP system there is no single OS that manages the whole set of hardware and software resources on the device. Consequently, the task of creating and maintaining the complex resource management layer is left to the system designer. In many cases we then end up with a quite static system with little or no support for load balancing, either at the packet ingress or within the packet processing flow itself. Examples of such problem areas are:

- Memory management: AMP means that even if the cores can access global memory, it needs to be partitioned statically at boot time. If more flexible and dynamic memory management is desired, a distributed memory service must be developed by the system designer. This area becomes more and more significant as the functionality and code size grow; the transient memory needs become more and more indeterministic, and the system may not be fully utilized.
- Load balancing: Each core has its own OS instance, and the OS provides no support for state migration between cores. Adding support for state check-pointing, thread migration and balancing of packet flows can be quite complex. The application is also responsible for programming the device layer and administrating its state, and such a service is also quite complicated to design in a distributed fashion. This means that the task of quickly and atomically reprogramming the device layer in order to re-balance the incoming packet traffic becomes very complex.

There are a number of other areas that need attention when designing an AMP system, such as load and startup, Operation and Management (O&M), debugging, etc., but these areas are not further addressed in this paper.

In short, AMP system designers encounter challenges in the following areas:
- Load balancing
- Memory management
- OS resource management
- Load and start

Enea OSE Multicore Edition
The innovative asymmetric, hybrid kernel architecture in Enea OSE Multicore Edition is one answer on how to gain the advantages of the above mentioned multicore processing models. This chapter describes Enea OSE Multicore Edition, a new RTOS architecture specifically designed for high-performance multicore application use cases in the data plane, including both processing of user data and connectivity layer control signaling.

Architecture Overview
Enea OSE is a truly distributed operating system that uses a message-based programming model providing application location transparency. Enea OSE is a micro-kernel architecture, see figure 2.

[Figure: OSE applications on top of core extensions (file managers, IP stack, load balancer, distributed IPC (LINX), runtime loader, tools) and core basic services (file system services, IP network services, C/C++ program runtime, device driver management), layered above the kernel services (memory management, IPC, scheduler) and a hardware abstraction layer.]


Figure 2. The OSE micro-kernel architecture.


The Enea OSE architecture is very modular and scalable, consisting of a large set of run-time components that run on top of a micro-kernel. System builds can range from a small, simple kernel layer (for example a simple executive for "run-to-completion" applications) up to a full RTOS supporting POSIX file systems, SMP threading and IP networking.

The Enea OSE kernel has been developed based on the exchange of messages between processes (which in Enea OSE are the equivalent of POSIX threads). This mechanism for inter-process communication (IPC) is the foundation of the OSE programming model, and it is implemented as a simple API for the exchange of messages between processes/threads in a distributed system running on one or several processor nodes. Enea OSE also provides an addressing model that enables application scalability, making it possible to let a system run on a single processor node or on several nodes in a distributed cluster without changing the program code. When processors are physically divided, OSE kernels use the IPC protocol Enea® LINX for passing messages. Enea LINX is a kernel concept used for the implementation of a message passing backplane that is adaptable to different media.

Services in OSE are mainly implemented according to the client-server model, providing a distributed C/C++ run-time library where parts of the POSIX API are included. Examples of such services are the File System Managers and the IP stacks, which run on a single processor node, while the API is available to client applications on all processors via a C/C++ run-time function library that uses message passing to reach the operating system servers. An example of this is the call to fread(), which uses internal message passing toward the file system server, which in turn can be located anywhere in the system, even when the system is spread geographically.

The OSE programming model encourages an object-oriented, parallel design of applications where each process uses its own private memory, and where message passing is the main mechanism for exchanging data and for synchronization. Using the message passing programming model as the foundation for parallelization and synchronization has proven to ease the transition to multicore technology. When an application is already designed to be parallel for distribution scalability, the migration to multicore devices becomes a straightforward task, and investments in expensive architectural changes are not needed.
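As an illustration of this programming model, the sketch below uses the classic OSE signal interface. It is a sketch only: the header name and exact signatures may vary between OSE releases, and the signal number, struct and server process are invented for the example.

```c
#include "ose.h"   /* classic OSE kernel API; header name may vary */

#define PING_REQ 1000                /* application-defined signal number */

struct ping_req { SIGSELECT sig_no; int seq; };
union SIGNAL { SIGSELECT sig_no; struct ping_req ping; };

/* Send a request to a server process and wait for any reply. The server
 * may live on the same core, another core, or another node: the calling
 * code is identical in all cases (location transparency). */
void ping(PROCESS server_pid)
{
    static SIGSELECT any_sig[] = { 0 };  /* 0 = receive any signal      */
    union SIGNAL *sig = alloc(sizeof(struct ping_req), PING_REQ);

    sig->ping.seq = 1;
    send(&sig, server_pid);      /* ownership moves; sig is cleared     */

    sig = receive(any_sig);      /* block until a signal arrives        */
    free_buf(&sig);              /* return the buffer to the pool       */
}
```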
Enea OSE Multicore Edition: The innovative asymmetric, hybrid kernel technology
The new, asymmetric OSE Multicore kernel architecture combines the advantages of the previously mentioned models without having to deal with their disadvantages. The key benefit of this RTOS model is that it creates a new, multicore-aware level of CPU and device abstraction, which puts applications designed with this in mind "closer" to the actual hardware and minimizes overhead. This is the difference from the classic SMP model, which creates a higher level of abstraction that adds too much I/O overhead.

The Enea OSE Multicore Edition kernel looks similar to SMP when it comes to simplicity, flexibility, application transparency and error tracing. Enea OSE MCE looks similar to AMP when it comes to scalability, determinism and performance. Neither does Enea OSE MCE suffer from the extra load caused by several cores using shared memory; in other words, the performance level is the same as if each core were used on its own. Furthermore, since the OSE application framework allows a thread to access I/O devices and memory in supervisor mode, we can create a sandboxed design that basically achieves bare-metal performance. In such a design, packets can be processed without involving the operating system at all, except for the dispatching of user-installed interrupts and yield points (at certain points, when queues are empty or at long intervals, the context may be preempted). All together, this allows for maximizing the CPU resources spent on processing packets in the applications and minimizing unwanted overhead. More about this later.

The micro-kernel architecture and the message passing model allow operating system services such as Loaders, Memory Managers, IP stacks and File Systems to be located on different cores, while applications can access these services regardless of their location in the system (location transparency), achieving shared access to services on the application level like in the SMP model. The principle of the single-image RTOS micro-kernel design is shown in figure 3.

[Figure: one RTOSE image spanning three cores over a kernel event backplane: core 0 runs kernel threads 1-3 and an application thread, while cores 1 and 2 run interrupt service threads and packet application threads in core-local loops.]

Figure 3. The hybrid SMP/AMP kernel technology.


On each core, the OSE kernel instantiates a scheduler with associated data structures. The single RTOS micro-kernel image provides the full set of system calls to all threads running on all cores. On the "master" core 0, the RTOS runs a set of kernel threads (kernel threads 1-3 in figure 3) that have the responsibility to administrate common kernel resources such as PCBs, the entire system memory, etc. The control layer application thread may also run on core 0. On the other cores there may be interrupt service threads and various packet application threads. The circular flow in the figure aims to illustrate that threads and interrupt routines are synchronized internally on a core as "a core loop". The threads on a core do not, directly or indirectly, update data that belongs to any other core.

One important design goal with the separated kernel scheduler instances is exactly this: to avoid the need for synchronous cross-core modifications of the scheduler data structures on another core while executing local system calls. To achieve this, a concept called the kernel event has been invented, which is a very lightweight kernel-internal IPC. It is used to perform all kinds of asynchronous, cross-core transactions. A kernel event is basically a meta object that contains an action to perform on a core's scheduler data structures, such as its ready queues. The meta object potentially also contains a message pointer to a shared buffer if the transaction involves a message (system call send()).

The advantages of this hybrid kernel model are:

- Linear performance scalability of applications that are inherently parallel will be reached.
- The use of spinlocks for synchronization within system calls that need to modify the scheduler data structures or a process control block is entirely eliminated. Threads on a core only use the much cheaper core-local interrupt locking mechanism to synchronize the local scheduler with local core interrupt handling.
- The use of an optimized lock-free algorithm to allocate and free message buffers from the global, shared buffer pool is centralized.
- Cache usage is efficient and optimized, since the contention caused by kernel-internal false sharing and other "cache-line bouncing" effects is eliminated. System calls will not operate on other cores' data structures except in rare cases.
- Inter-core messaging is optimized for low overhead by using asynchronous inter-core kernel event queues, which ensures high application throughput performance. Where applicable, hardware support is used in the BSP to implement the inter-core queues in order to accelerate kernel events; some newer devices provide programmable hardware queues that can be used for this purpose.
- The full operating system API (all operating system calls and debug features) is provided to applications on all cores.

The procedure for sending a message contains the following steps (see figure 4):
1. A process/thread allocates a message buffer from a global, shared message buffer pool. The free-list of the pool is the only shared memory structure accessed in this sequence.
2. The process performs the system call send(), specifying a destination address (PID) and a message buffer.
3. The send() call reads the destination core from the destination address, and posts a kernel event containing the message buffer pointer and the destination address to the destination core. (The kernel event may use hardware accelerators to transport the event.)
4. The send() call returns and the thread continues with other tasks while the kernel event propagates in parallel to the destination core.
5. The destination core either takes an ISR to get the event or polls it in (this requires some level of periodic yields in the application). The ISR then performs the core-internal transaction towards its scheduler data structure, which involves queuing the message buffer and the proper state transition. This transaction only uses the much cheaper interrupt lock synchronization, for the reason described above.
6. The destination process/thread is eventually scheduled and reads in the message buffer using the receive() system call. It then either frees the message or resends it.
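The following conceptual sketch captures the idea behind steps 3 and 5. It is illustrative C, not Enea's actual kernel source: all names and the per-core queue primitives are assumptions.

```c
/* Conceptual sketch of the kernel event mechanism described above. */
enum ev_action { EV_ENQUEUE_MESSAGE /* , ... other scheduler actions */ };

struct kernel_event {
    enum ev_action action;  /* what to do to the target scheduler       */
    void *msg;              /* pointer into the shared buffer pool      */
    int   dest_pid;         /* destination process identity             */
};

/* Assumed primitives: a lock-free (or hardware-accelerated) per-core
 * event queue, plus core-local interrupt locking and scheduling. */
extern void event_queue_post(int core, struct kernel_event ev);
extern int  event_queue_poll(struct kernel_event *ev); /* owning core only */
extern void local_irq_lock(void);
extern void local_irq_unlock(void);
extern void scheduler_enqueue(int pid, void *msg);     /* core-local data  */

/* Sending core (step 3): describe the transaction and post it. */
void post_send_event(int dest_core, int dest_pid, void *msg)
{
    struct kernel_event ev = { EV_ENQUEUE_MESSAGE, msg, dest_pid };
    event_queue_post(dest_core, ev);   /* no lock on the remote scheduler */
}

/* Destination core (step 5), from its ISR or at a yield point: */
void drain_kernel_events(void)
{
    struct kernel_event ev;
    while (event_queue_poll(&ev)) {
        local_irq_lock();             /* cheap, core-local synchronization */
        scheduler_enqueue(ev.dest_pid, ev.msg);
        local_irq_unlock();
    }
}
```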
A kernel event is basically support to implement the inter-core the message buffer and the proper a meta object that contains an action queues in the BSP where possible state transition. This transaction only to perform to a core’s scheduler data in order to accelerate kernel events. uses the much cheaper interrupt structures, such as for example its ready Some newer devices provide a lock synchronization for the reason queues. The “meta” object potentially programmable hardware queues described above. also contains a message pointer to a that can be used for this purpose. 6. The destination process/thread is shared buffer if the transaction involves then eventually scheduled, and reads in the message buffer using the Packet Application Packet Application receive() system call. It then either Thread 1 Thread 2 frees the message or resends it. 2 post 1 alloc 6 receive Note that all processes/threads in an OSE Multicore Edition system can Shared P2 ptr still make full use of the operating 3 post 4 Buffer Pool RTOSE system services. Since each scheduler 5 Image is isolated, just like in an AMP system, most system calls are designed inside P2 ptr BSP the kernel without using spinlocks or 4 atomic transactions that uses global

[Figure: two packet application threads on different cores; the sender allocates a buffer from the shared pool (1), calls send() (2), a kernel event is posted (3) and transported via the BSP (4, 5), and the destination thread picks the message up with receive() (6).]

Figure 4. The sequence of the send system call.


In rare cases where system calls need to perform operations on global data structures, a global lock needs to be acquired. An example is when a group of threads (a program) is moved from one core to another.

Figure 5 shows an example of an eight-core system design that runs a homogeneous OSE Multicore Edition system. In the system in figure 5, the scheduler on core 0 runs the RTOS services and all other cores run the data plane application, which is designed as a flow, or a "functional pipeline", across cores. Potentially, cores 1-3 can terminate the gigabit Ethernet devices and pass the incoming packets on to a flow of layer 2/3 IP processing by zero-copy OSE messages between cores, either containing the payload or passing a pointer to packets that are then administrated in HW queues. Cores 4-7 run busy-looping processes/threads that consume data buffers, process deep packet inspection, encryption or decryption, and then pass the buffers on to the next stage in the pipeline. Finally, a buffer is sent to one of the outgoing threads, where it is forwarded out by ISR execution on the same core as the device, or by the looping process/thread.

A system design that uses functional pipelining normally utilizes the L1 and L2 caches better, since the instruction working set on each core becomes smaller when cores execute different code. Note that the effect of this differs depending on whether the L2 cache is per core or shared.

How to reach "bare metal" performance with Enea OSE
The packet processing bandwidth on advanced multicore devices like, for example, the Freescale® QorIQ P4080 or Cavium® OCTEON is extremely high and puts very high requirements on the packet processing software design. The network protocol processing software must be designed to eliminate unnecessary overhead spent in spinning, context switching or waiting for some OS resource. All CPU cycles must be spent on packet processing, since the cycle budget in these cases is very limited.

On core 0, the control layer and slow path services are located, and the O&M IP traffic is terminated. On each of the "fast path" cores there is, in this example, a set of processes/threads, one of which runs the packet processing main loop. In this example the application is of the "run-to-completion" type, to illustrate the most demanding case.

[Figure: an OSE Multicore Executive bare-metal environment: a slow path domain on core 0 with application and kernel threads, and a fast path domain on cores 1-2 with packet application threads and background threads (BG1, BG2), all within one RTOSE image.]
Figure 6. The "bare-metal" sandboxes and a slow-path control domain in a homogeneous environment.

There are several other examples of such "bare-metal" execution environments, named "SE, Simple Executive" or "LWE, Light-Weight Executive". These have in common that they are basically a set of libraries that abstract the advanced hardware features, such as packet queues, encryption/decryption engines, etc. These HW vendor-specific libraries are intended to be used in a polling loop.
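Conceptually, such a polling main loop, combined with the yield points that OSE allows in these loops (elaborated in the next section), might look like the sketch below. hw_rx_poll(), hw_process() and yield_point() are hypothetical stand-ins, not actual OSE or vendor APIs.

```c
/* Sketch of a "run-to-completion" main loop in a sandboxed supervisor
 * thread; all names are invented for the example. */
extern void *hw_rx_poll(void);         /* next packet, or NULL if empty  */
extern void  hw_process(void *pkt);    /* full fast path work + transmit */
extern void  yield_point(void);        /* allow interrupts/kernel events */

void fast_path_main_loop(void)
{
    unsigned long n = 0;

    for (;;) {
        void *pkt = hw_rx_poll();
        if (!pkt) {
            yield_point();        /* empty queue: natural scheduling point */
            continue;
        }
        hw_process(pkt);          /* run to completion, then transmit      */
        if ((++n & 0xFFFFu) == 0) /* under load: yield at low frequency    */
            yield_point();
    }
}
```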

[Figure: the layered design of the 8-core system: Ethernet and IP acceleration devices at the device layer, a packet reception and transmission layer, and a packet processing layer above the OS call API; RTOS services (FS, IP, etc.) run on core 0 and the application on cores 1-7, on top of a shared buffer pool and the OSE multicore kernel.]

Figure 5. Example of an 8-core system with a functional pipelining design.


In other system designs, there may be applications that are designed as flows, which potentially are more complex and indeterministic than plain IP forwarding. Most likely, these designs will benefit from better support from the RTOS, even SMP-like features if possible.

Enea OSE Multicore Edition provides, equal to an SMP enterprise OS, advanced MMU management as well as a feature-rich POSIX API. If the underlying CPU architecture supports variable TLB page sizes, the OSE MMU Manager will utilize this, which basically provides a situation where the entire working set of code and most data are mapped by the TLBs. Since TLB thrashing is avoided, the MMU management will not contribute any overhead.

On each core we can then create an OSE program, which basically corresponds to a UNIX process and consists of an MMU domain and a set of threads. A supervisor thread that runs a busy-loop locally on a core is basically equal to a "bare-metal" thread. The OSE thread then becomes a sandboxed execution environment that benefits from no OS overhead at all while busy-looping, provided that the thread is not preempted by any higher-prioritized thread or interrupt on that core. The main "run-to-completion" loop will then consume and process fast path packets via vendor-specific HW-support libraries (potentially with interrupts locked). The main loop decides when and how often it may yield, allow interrupts or poll for incoming control or data messages. Naturally, these potential scheduling points can be allowed when no packets are in the queue, but also under high traffic at a low frequency. The interrupt framework in OSE is very lightweight.

This design allows for very efficient fast path packet processing, but also for a mix of fast path and slow path processing. Enea OSE also provides a deterministic heap manager, where the heap is core-local but allocates memory from the global Memory Manager on core 0. The powerful Memory Manager in Enea OSE provides support to dynamically create shared virtual memory, as well as to map external memory into the virtual address space, for example memory shared with a Linux environment. While providing support for "bare-metal" performance on the traffic cores, one or a few cores can run higher layers of networking protocol control software as well as O&M software without interfering with the traffic cores.

Linux as control layer OS and Enea OSE as "fast path" application framework
The Enea OSE product family contains a large set of IP networking products, including an IPv4/v6 stack. Enea OSE may in many cases support enough features regarding control and O&M functionality, and in these cases the ecosystem on the multicore device becomes homogeneous, a fact that saves cost and complexity when maintaining the system. In general, the coexistence of Linux and OSE is based on Enea LINX message exchanges between the Linux "slow-path" networking layer and the "fast-path" packet processing application on the OSE side. Figure 7 illustrates how the networking stack and application on the Linux side control the packet processing application on the OSE side using protocols over LINX, and how "slow-path" packets are sent to the stack on the Linux side using LINX messages to carry the payload; a minimal sketch of the Linux-side code follows figure 7.

There are a number of solutions for having Linux as the control OS on one or a few cores of a multicore device and a single-instance OSE on the remaining cores:

- Linux running in a "true" virtual machine on OSE.
- Linux running "paravirtualized" using HW virtualization support. In this case OSE implements the hypervisor layer according to, for example, the PowerPC hypervisor call set.
- Linux running natively on one or a few cores.

[Figure: an application and the networking stack (NET) in a Linux domain on core 0, and IP forwarding in an OSE domain on cores 1 and 2, communicating over LINX.]

Figure 7. Linux and OSE.
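A rough sketch of the Linux-side exchange in figure 7 follows, based on the open-source LINX for Linux user-space API. Signatures may differ between LINX versions, and the signal number and endpoint names are invented for the example.

```c
#include <linx.h>      /* LINX for Linux user-space API (open source) */

#define SLOWPATH_PKT 2000           /* invented signal number */

/* Linux control side: locate the OSE fast path endpoint by name, then
 * send it a LINX signal. Error handling is omitted for brevity. */
void control_example(void)
{
    static LINX_SIGSELECT any[] = { 0 };          /* receive any signal  */
    LINX *linx = linx_open("ctrl", 0, NULL);
    union LINX_SIGNAL *sig;
    LINX_SPID fastpath;

    linx_hunt(linx, "ose_fastpath", NULL);        /* resolve remote name */
    linx_receive(linx, &sig, any);                /* wait for hunt reply */
    fastpath = linx_sender(linx, &sig);
    linx_free_buf(linx, &sig);

    sig = linx_alloc(linx, sizeof(LINX_SIGSELECT), SLOWPATH_PKT);
    linx_send(linx, &sig, fastpath);              /* e.g. config or punt */
    linx_close(linx);
}
```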


Conclusion
Various software models have been proposed by the industry in order to address the multicore challenge, all of them having different benefits and drawbacks. Enea OSE Multicore Edition, with a kernel design based on an innovative SMP/AMP hybrid technology, attempts to combine the best of the multiprocessing models' advantages, using the message passing programming paradigm on the application level as well as inside the kernel. Legacy applications that use this model of programming and parallelization have proven to be easy to migrate to new multicore devices.

Enea OSE Multicore Edition clearly challenges the industry-perceived difficulty of reaching linear scalability equal to AMP system performance while still having an SMP application model. When used on processors that have hardware inter-core messaging support, the OSE kernel can also utilize this to accelerate the inter-core kernel transactions, which in turn maximizes application performance. Enea OSE Multicore Edition primarily targets systems within the domain of I/O intensive data plane applications, including both user data processing and control signaling.

The Enea OSE Multicore Edition release, available in Q2 of 2009, will be designed completely according to the Multicore Edition architecture described above. This design is expected to offer linear performance scalability for future multicore devices with 8-16 cores and more.

In summary, Enea OSE Multicore Edition provides support to:
- Define a homogeneous system that can mix control functionality with packet processing applications.
- Define an application framework that creates a sandbox for a "bare-metal" packet loop.
- Use a low-level multicore CPU abstraction model without the overhead of classic SMP OS:es.
- Create a complete separation between threads on different cores regarding spinlocking and hidden memory sharing, in order to maintain deterministic behavior.
- Use dynamic, global memory management as well as MMU protection support in order to create fault isolation and fault recovery domains.
- Debug and observe the system via Enea® Optima Tools, including the "bare-metal" threads.

Enea®, Enea OSE®, Netbricks®, Polyhedra® and Zealcore® are registered trademarks of Enea AB and its subsidiaries. Enea OSE®ck, Enea OSE® Epsilon, Enea® Element, Enea® Optima, Enea® Optima Log Analyzer, Enea® Black Box Recorder, Enea® LINX, Enea® Accelerator, Polyhedra® Flashlite, Enea® dSPEED Platform, Enea® System Manager, Accelerating Network Convergence™, Device Software Optimized™ and Embedded for Leaders™ are unregistered trademarks of Enea AB or its subsidiaries. Any other company, product or service names mentioned above are the registered or unregistered trademarks of their respective owner. WP50 022010. © Enea AB 2010.