Engineers’ Guide to FPGA & PLD Solutions

FPGAFPGA HighsHighs andand LowsLows (Density,(Density, ThatThat Is)Is)

FPGAs Enable a New SoC FPGA Relies on Solving Signal Class of Always-on Many Cores and 14 nm Conditioning and Applications Tri-Gate Process Processing Challenges in Multi-SensorApplications with FPGAs

www.eecatalog.com/fpgawww.eecatalog.com/fpga

Market Sponsor: Gold Sponsors New—56 Million ASIC Gates

This new Virtex 7 based board is the biggest and fastest ASIC Prototyping platform— ever. All the features you need to run your big fast design are on this off-the-shelf logic prototyping system that captures all the capabilities of the new Xilinx 7V2000T. 14 million ASIC gates/chip, 1.4 Gb/sec LVDS, 1200 I/O per chip, and new software tools to get your RTL running faster, are all features of these 2.5D stacked silicon FPGAs from Xilinx. Four of these massive chips plus configuration FPGA and a Marvell Dual CPU for data handling, make it easier to get your design up to speed.

Is your design still too big? Seamlessly stack two boards together (a first for DINI) and 112 million gates are 100% user available. Need special functions, added memory, or signal processing? Eight daughter cards can support your unique requirements. There is an optional chassis available to ensure a robust and properly cooled system, because your design is going to run “hot”.

Call DINI today, this surprisingly low priced board is now shipping.

7469 Draper Ave. • La Jolla, CA 92037-5026 • Phone: (858) 454-3419 • Fax: (858) 454-1728 • www.dinigroup.com Engineers’ Guide to FPGA and PLD Solutions 2014 www.eecatalog.com/fpga Welcome to the 2014 Vice President & Publisher Clair Bright [email protected] Engineers’ Guide to FPGA (415) 255-0390 ext. 15 Editorial and PLD Solutions Vice President/Chief Content Officer John Blyler [email protected] (503) 614-1082 Editor-in-Chief According to a recent report by Transparency Market Research, the Chris A. Ciufo FPGA market, valued at USD 5.08 billion in 2012, is projected to reach [email protected] Managing Editor USD 8.95 billion by 2019. The top three drivers won’t surprise anyone: Cheryl Coupé demand for bandwidth in wireless networks as LTE implementation [email protected] expands, use of FPGAs in the flat panel displays common to every Creative/Production gadget we can’t keep our hands off of, and rising electronic content Production Manager in automobiles. But while there may not be many surprises in what’s Spryte Heithecker Graphic Designers driving market demand, the technology approaches of the industry’s top Nicky Jacobson players are definitely interesting. Jacob Ewing Media Coordinator Yishian Yao Editor-in-Chief Chris Ciufo takes you deep inside the exciting new Production Assistant announcements from both Xilinx and Altera. In “Xilinx UltraScale Jozee Adamson Moves FPGA from ‘Critical’ Accelerator to ‘Central’ SoC,” the company Senior Web Developer Slava Dotsenko claims that the new 20nm UltraScale family is so smart that it Mariam Moattari brings ASIC-like system functionality to ultra-dense and super high- Advertising/Reprint Sales performance designs. Chris asks all the tough questions, but ultimately Vice President & Publisher he admits that the Xilinx roadmap might keep them at the center of Embedded Electronics Media Group Clair Bright customers’ design for several more product nodes. What’s the flip [email protected] side? Altera’s recent Stratix 10 announcement. In “SoC FPGA Relies (415) 255-0390 ext. 15 on Many Cores and 14 nm Tri-Gate Process,” Chris explains Altera’s Sales Manager Michael Cloward “unimaginable” performance. Who’s the ultimate winner? Read on, [email protected] and make your own conclusions. And if you’re looking for a different (415) 255-0390 x17 approach, Lattice makes an appearance as well, explaining the role Marketing/Circulation Jenna Johnson of new ultra-low-power devices in “FPGAS Enable a New Class of To Subscribe Always-on Applications.” www.extensionmedia.com/free

There’s plenty more inside these pages, from software toolchains to application deep dives, as well as data sheets, product information and Extension Media, LLC Corporate Office more. And check back regularly at www.eecatalog.com/fpga for more President and Publisher up-to-date information. Vince Ridley [email protected] Vice President, Sales Cheryl Berglund Coupé Embedded Electronics Media Group Clair Bright Managing Editor [email protected] Vice President, Chief Content Officer John Blyler [email protected] Cheryl Berglund Coupé Vice President, Business Development Melissa Sterling [email protected] Special Thanks to Our Sponsors

P.S. To subscribe to our series of Engineers’ Guides for embedded developers and engineers, visit: www.eecatalog.com/subscribe The Engineers’ Guide to FPGA and PLD Solutions is published by Extension Media LLC. Extension Media makes no warranty for the use of its products and assumes no responsibility for any errors which may appear in this Catalog nor does it make a commitment to update the information contained herein. The Engineers’ Guide to FPGA and PLD Solutions is Copyright ®2013 Extension Media LLC. No information in this Catalog may be reproduced without expressed written permission from Extension Media @ 1786 18th Street, San Francisco, CA 94107-2343. All registered trademarks and trademarks included in this Catalog are held by their respective companies. Every attempt was made to include all trademarks and registered trademarks where indicated by their companies. www.eecatalog.com/fpga 1 Contents

Xilinx UltraScale Moves FPGA From ‘Critical’ Accelerator to ‘Central’ SoC By Chris A. Ciufo, Editor-in-Chief...... 3 FPGA Highs and Lows (Density, That Is) By Cheryl Coupé, Managing Editor...... 6 Case Study: Location-Aware Public Transit Drives Custom I/O Controller Card Development By Bill Brown, Avail and Patrick Dietrich, Connect Tech...... 11 FPGAs Enable a New Class of Always-on Applications By Subra Chandramouli, Lattice Semiconductor...... 15 Solving Signal Conditioning and Processing Challenges in Multi-Sensor Applications with FPGAs By Ian Shearer, VadaTech, Ltd...... 17 Build or Customize an Open Source Software Toolchain? By Anil Khanna, Mentor Graphics...... 19 SoC FPGA Relies on Many Cores and 14 nm Tri-Gate Process By Chris A. Ciufo, Editor-in-Chief...... 22

Product Services

FPGAs & PLDs Boards & Kits FPGA Chips Design Kits Microsemi Corporation Xilinx, Inc. IGLOO2 FPGA...... 26 Artix-7 FPGA AC701 Evaluation Kit ...... 33 SmartFusion2 System-on-Chip FPGAs...... 27 Kintex-7 FPGA KC705 Evaluation Kit...... 34 Xilinx, Inc. Virtex-7 FPGA VC707 Evaluation Kit...... 35 Artix-7 FPGAs...... 28 ZedBoard ...... 36 Kintex-7 FPGAs...... 29 Zynq-7000 All Programmable UltraScale Devices...... 30 SoC ZC702 & ZC706 Evaluation Kits ...... 37 Virtex-7 FPGAs...... 31 FPGA Boards SoC Elma Electronic Inc. Xilinx, Inc. 3U VPX Front End FPGA Processor Zynq-7000 All Programmable SoC...... 32 Single Board Computer with Enhanced Graphics ...... 38 Pentek, Inc. Model 52730: 1 GHz A/D, 1 GHz D/A with Virtex-7 FPGA ... 39

Development Tools Design Tools Xilinx, Inc. Vivado Design Suite...... 40

2 Engineers’ Guide to FPGA and PLD Solutions 2014 Special Feature Xilinx UltraScale Moves FPGA From ‘Critical’ Accelerator to ‘Central’ SoC New 20 nm UltraScale SoC FPGAs push Xilinx into ASIC territory, making them central to many new systems by mixing logic with ARM cores, “smart” IP, memory and analog mixed signal.

By Chris A. Ciufo, Editor-in-Chief

According to Xilinx, the 20nm UltraScale family is so smart that it brings ASIC-like system functionality to ultra-dense and super high-performance designs. Says Xilinx Senior VP of Corporate Strategy Steve Glaser: “FPGAs will not just be used for critical functions like accelerators and bridges, but now the way designers archi- tect a system affects how they will use Xilinx’s highly and software devices.” (Figure 1). In other words: UltraScale is a game-changer.

To get the 10,000-foot strategic view, we sat down with Steve Glaser, the architect of Xilinx’s corporate transformation. The company has “cheated Moore’s Law,” says Glaser “by notching Figure 1: The dramatic shift from FPGA (28 nm Virtex-7) to programmable ASIC-like SoC (20 nm Virtex Ultrascale). the roadmap several nodes ahead” (four nodes, according 1, Xilinx has raised the bar, made fundamental changes, or to Xilinx) using 20 nm TSMC CoWoS and second-genera- both. tion 3D stacked chip technology. FPGA Logic With a hint of smugness in his voice, Glaser proudly UltraScale’s poster boy “Big Daddy” Virtex UltraScale contrasts the 50 million equivalent ASIC gates and 4.4 mil- VU440 sports 50M equivalent ASIC gates, 1,456 user I/ lion logic cells of the class-leading, in-production Virtex Os, 48 x 16.3 Gbps transceivers among other eye-popping UltraScale VU440 to competitor Altera’s recent Stratix specs. The company claims the “…highest system perfor- 10 roadmap plans saying: “it [Virtex-7 2000T] was real at mance and bandwidth for single-chip implementations of 28nm and you’re seeing [UltraScale] is real right now at PCI Gen 3, 400G MuxSAR, 400G Transponder and 400G 20nm.” The first UltraScale device taped out in June 2013 MAC-to-Interlaken bridge”; it has upped the ante on myriad and first silicon was in November. Volume deliveries will in-logic functions including: DDR4 memory interfaces, 33 be sometime in 2014. Gbps capable transceivers, wider multipliers in DSP blocks, and a hardened 100 Gbps MAC (seven of them, in With 50% of revenues coming from high-bandwidth Big Daddy). communications infrastructure plus ultra-performance embedded computing (HPEC), the horsepower and integra- To create these functions and make real the claims, tion in UltraScale’s “ASIC class” SoC FPGAs are essential for fundamental changes were made in the silicon FPGA archi- Xilinx. In each of the areas shown on the right in Figure tecture. Interconnect bottlenecks used to be driven by

www.eecatalog.com/fpga 3 Special Feature

Figure 2: Using new UltraScale hardware resources, Vivado can maximize CLB utilization, freeing resources and improving power consumption and performance.

switching delays, but with 100 to 400 Gbps I/O, routing These are only a few of the fundamental architectural delay becomes dominant while clock skew across a larger changes with UltraScale. Others include a much larger die—which used to be a third-order effect—eats into block RAM, power gating, and faster memory I/O. timing margin. 3D Stacked-Chip IC Ultrascale addresses both of these by re-designing the Using second-generation stacked silicon interconnect (SSI), routing architecture and doubling the number of routing the company has achieved a 4x node jump at 20nm. In 2011 resources. In parallel, the routing congestion and wire- the company’s first SSI device achieved roughly twice the lengths are optimized further with the Vivado software. density of competing devices: the Virtex-7 2000T had 6.8 To combat and balance clock skew, there’s a new regional billion transistors, 2M logic cells and 20M equivalent ASIC ASIC-like segmented architecture that moves clock gates. Contrast this with the latest 50M equivalent gates. resources to 2,000 possible spots around the chip from the typical center location. Designers get finer-grained This is possible not only because of the process shrink from clock control with both timing and power benefits. Vivado 28nm to 20nm on TSMC CoWoS, but the multi-chip module can now even bias the skew in favor of certain designs to interposer technology now has 5x more inter-die bandwidth combat inherent delays. with a new unified clocking architecture across die slices. Xilinx claims this allows near-monolithic inter-die com- Lastly, Xilinx has targeted 90% utilization of UltraScale munication. For example, the VU440 described above uses at full performance. The company has rearchitected CLBs three dice whereas the previous largest Virtex-7 used four. by granting independent access between LUTs and flip- flops, and changed the slice boundaries for better access The SSI also facilitates analog mixed signal (AMS), sepa- to resources. As well, CLBs now have twice as many clock rate memory, and any other technology such as RF that can enables. Xilinx calls these changes “CLB packing” with the coexist on the interposer inside the package. Xilinx plans goal of max use of CLB resources (Figure 2). on using SSI for lower volume analog but a monolithic die for higher volume. The roadmap shows migration from cur- Vivado software has been tweaked to capitalize on these rent 1 Msps AMS “sensing IP” for temperature and security architecture changes and achieve that 90% target. An added anti-tamper, to a “programmable systems integration” benefit is 20% lower wire length, overall better FPGA logic mixed signal SoC. utilization, higher performance and lower power. It’s no wonder that Xilinx and TSMC boast how SSI cheats Moore’s Law at lower cost/package. Even better, SSI at 20nm will scale down to TSMC’s planned 16nm FinFET process

4 Engineers’ Guide to FPGA and PLD Solutions 2014 Special Feature for “future proof” design migration. Although behind the Xilinx’s Glaser revealed that the company invested heavily transistor feature size process leaders like Intel’s tri-gate in two areas five years ago: 3D ICs and in the Vivado tools. (finFET) at 14nm, SSI appears to even the game somewhat. Xilinx spent $900M in Vivado “ASIC-strength” Design Suite 2013.4—a full $400 million more than the normal $500 Smarter Cores and More Cores million it takes on a new product node. This was essential Besides the process shrink and stacked chip implementa- to evolve logic cores to “smart” cores that ease the design tion, UltraScale makes many improvements in core system integration task and allow faster customer differentiation blocks. Says Dave Myron, Xilinx Sr. Director of FPGA without getting lost in the weeds of RTL. Product Management and Marketing: “for improved DSP performance in both mid-range Kintex UltraScale and Xilinx’s FPGA director Myron says: “It’s all about squeezing high-end Virtex UltraScale, multipliers were widened. more out of every square millimeter of die with 90 per- Double-precision floating point, for example, can be accom- cent utilization. Our UltraFast design methodology is plished in two blocks versus three as in the 28nm devices.” what takes the design from months to weeks” in terms of turn-around time. It is based on a unified data model A complex multiply-accumulate used in filter applications that’s designed to bring “design closure to the front end has gone from six blocks/slices to three by adding a feed- of the design flow where the impact on quality of results back loop in the logic. New pre-adder squaring allows video is greater.” The UltraFast best practices are listed in new smoothing to be done in half the number of slices. Wider documentation, incorporated into Vivado 2013.4, and are XOR blocks were added in the DSP slices, alleviating the rolling out to company partners and IP providers. need to burn FPGA logic when computing forward error correction algorithms. Beyond 20 nm But Xilinx isn’t just betting on UltraScale at 20nm. In fact, As mentioned above, transceiver I/O now runs at Xilinx’s Glaser argues that the company’s current 28nm backplane-capable 16.3 Gbps, suitable for adaptive auto products (Virtex, Kintex, Zynq) will stick around awhile, equalization. Depending upon the device SKU, transceiver because they have lots of life and offer lower wafer costs. count doubles to 65 (Kintex) or moderately increases to 20nm UltraScale is the current leading edge and gives 104 (from 96) on Virtex. PCI Express is now Gen 3 with designers levels of options for integration, types of things transaction layer bypass for virtualization functions, and that can be integrated (in logic, in the SoC, or in smart there’s hardened support for 100ms PCI Express configura- IP), and in tradeoffs for overall BOM system integration. tion time to enable whole-device power-on configuration. “$/GMAC or $/packet processing throughput, or optical interfaces, DDR4 or even software defined wired networks Interestingly, Xilinx is pushing the former mid-grade (SDN) can all be realized and cost effectively traded off,” Kintex into the sweet spot of high-end DSP, HPC/HPEC, says Glaser. and wired datacoms applications. Virtex will win designs in the ultra-high performance 400G-type wired/optical Finally, the future 16nm TSMC finFET process isn’t far infrastructure applications. While one UltraScale device away. It will give UltraScale more performance per watt and can save overall system cost by replacing more board-level creates even more heterogeneous multiprocessing architec- components inside the FPGA, lower-end UltraScale devices tures with reduced memory bottlenecks plus programmable will offer performance features previously found only in analog for the “ultimate smart programmable SoC.” With the highest-end Virtex-7 devices. Our point: your mileage this kind of roadmap, Xilinx might well remain central to may vary so choose wisely. most customers’ designs for several more product nodes.

By the way, at press time Xilinx has revealed scant infor- mation about plans for future ARM IP in UltraScale Chris A. Ciufo is editor-in-chief for embedded devices—even though Figure 1 clearly shows ARM and content at Extension Media, which includes the AMBA. Since a programmable RISC processor is part of EECatalog print and digital publications and the “all programmable” and “smarter” mantra, it’s likely an website, Embedded Intel® Solutions, and other ARM strategy will be revealed in 2014. Perhaps we’ll see a related blogs and embedded channels. He has 29 Zynq UltraScale device. years of embedded technology experience, and has degrees in electrical engineering, and in materials science, “Smarts” and Vivado Tools emphasizing solid state physics. He can be reached at cciufo@ The UltraScale mantra is based on “all programmable” and extensionmedia.com. “smarter systems.” We’ve discussed some of the technology and architecture issues above. It’s the design software that brings it all together and capitalizes on UltraScale.

www.eecatalog.com/fpga 5 Special Feature

FPGA Highs and Lows (Density, That Is) FPGAs are spanning a much broader array of applications these days, from ultra-low-power mobile devices through demanding “heavy iron” designs.

By Cheryl Coupé, Managing Editor

A recent market report by Transparency Market Research acceleration, command and control and time-critical parallel forecasts healthy growth in the FPGA market, from USD processing are the bread and butter FPGA functions within 5.08 billion in 2012 to USD 8.95 billion by 2019—a CAGR these low-power and ultra-low power applications. At the end of 8.5% from 2013 to 2019. Drivers include LTE adoption of the day, these are the traditional functions that FPGAs and the need for high bandwidth in wireless networks, as have always supported. What is being augmented to the well as the use of FPGAs in mobile devices such as smart- traditional role of FPGAs, and what’s really exciting for us phones, tablets, and growing electronic content in vehicles. at Lattice, is the fact that mobility is becoming the hotbed Our roundtable panel addresses the full spectrum of FPGA of innovation, which means battery power is the key design applications, challenges and new technologies. Thanks to constraint for many of today’s new products and applications. Dave Myron, senior director of FPGA product management and marketing for Xilinx; Joy Wrigley, senior product line By the nature of the low- and ultra-low power applications manager of ultra-low density products at Lattice Semicon- our customers are building, Lattice is seeing our devices ductor; and Paul Karazuba, senior product marketing and used in completely new functions that few had ever envi- media manager at QuickLogic. sioned before. One of the most recent trends is in the area of context-aware, always-on applications using multiple sen- sors. For example, data from a gyroscope, accelerometer and magnetometer might be used together in a smartphone or EECatalog: How are today’s FPGAs being implemented in tablet to create an inertial navigation system. low- and ultra-low-power applications? Are I/Os the choke point for these designs? Also, by having incredibly small FPGAs we are finding our- selves in completely new applications that value space as Dave Myron, Xilinx: FPGAs are designed into almost well as power and cost. As an example, if you look at a gas every conceivable application and it is rare that or water meter, it’s an incredibly tiny, complete system that power alone is the primary design concern. Although needs to successfully complete multiple functions, including power considerations have indeed been moving to connectivity, and there’s no room for 10 or 15 chips that the forefront as one of our customers’ key design concerns, would typically support this. You need a product that can performance, footprint, BOM cost and system integration all do a little bit of everything. There are areas throughout the weigh in with their own level of importance. The weighting embedded space that are getting smaller and more thermally for all of these design characteristics varies by the overall constrained and Lattice is unique to solving these problems. objectives of the end application and the design team’s goals. It would be hard to say that any one attribute—such as I/ As far as the I/0 is concerned, it depends. Power, size and Os—serve as the choke point for designs where power con- functionality are really the choke points we see. We offer a sumption is the primary concern. For power-sensitive DSP wide range of I/O, so this isn’t always an issue for the sockets or compute-intensive designs, I/O counts aren’t really we’re filling. a factor. Conversely, I/O-bridging designs can indeed be pin-limited. Power alone usually plays less of a Paul Karazuba, QuickLogic: Today’s low and ultra-low- role with respect to I/O pin limitations. power FPGAs are being utilized greatly for bridging and I/O interfaces such as I2C, SPI, UART, SDIO and Joy Wrigley, Lattice Semiconductor: Across the local of various application processors. Low-power many markets, I’d have to say that Lattice FPGAs FPGAs are uniquely suited for the expansion of interfaces for are taking on new functions and applications that connectivity and in some cases, co-processing. I/Os remain are being defined by the innovation and new ideas a choke point for designs, specifically in mobile form factor generated by mobile. Bridging, I/O expansion, hardware designs, although a larger factor when using low-power FPGAs

6 Engineers’ Guide to FPGA and PLD Solutions 2014

ew14_212_7x276_2_USA_FPGA_Gipfel.indd 1 20.11.13 11:47 ew14_212_7x276_2_USA_FPGA_Gipfel.indd 1 20.11.13 11:47 Special Feature

is finding a balance between having enough logic for the EECatalog: How do MIPI and other standards fit into FPGA required application and fitting in the mechanical size desired. vendor roadmaps?

Myron, Xilinx: While MIPI was first introduced several years ago primarily as a connectivity interface for mobile devices, EECatalog: Can low-power and ultra-low-power FPGAs offer it is clear that this interface is now gaining broader adoption competition to MCUs for sensor fusion? across a much wider range of markets and applications. MIPI’s lower signal-swing technology—optimized for short-reach Myron, Xilinx: FPGAs and all-programmable SoCs (Zynq) connectivity—provides an attractive lower-power alterna- offer significant competition to MCUs today in the area of tive for very high-bandwidth interfaces in many applications sensor fusion. As an example: advanced driver assistance as long as the signals aren’t required to travel long distances. system (ADAS) platforms based on Xilinx Zynq All Program- For this reason, we are now seeing a range of devices and mable SoCs fuse multiple imaging and LIDAR sensors to sensors ranging from ASSPs targeting communications perform a wide range of functions that include lane depar- markets to 4K/2K camera sensors all adopting MIPI con- ture warnings, pedestrian detection and calculating vehicle nectivity. It’s not uncommon to see a connectivity standard distance. The performance require- that was originally developed for ments for these tasks far outstrips one market see broader adoption the computational capabilities of The trick is to pick the tasks across multiple markets. The R&D today’s MCUs and requires the sig- and functions that require the leverage that is derived from using nificantly higher pixel-processing an industry standard versus a at rates that can only be supported fastest response time, zero unique, proprietary standard can by the parallel processing engines be quite compelling. For example, made possible by the resources latency and really low power— low-voltage differential signaling found in programmable logic. (LVDS) was originally defined for stick those in the FPGAs. the notebook PC market to allow Wrigley, Lattice: Absolutely. If for efficient, low-EMI transport of you take a step back and look at video signals from PC motherboard what’s needed in those applications, both can exist very well to LCD screen in laptop computers. Today, the LVDS inter- as stand-alone options, or complementary options. Again face is employed in almost every conceivable market. Today’s it gets back to what you are trying to accomplish. When it FPGAs already support MIPI signaling and you can expect comes to acceleration/expansion/bridging, the FPGA always that they will continue to support MIPI and a whole host of lends itself exceptionally well. You may be able to squeeze emerging and evolving standards. I/O programmability and everything into a processor, however if you’re looking for the ability to support multiple signaling standards is a unique size, power and performance, an FPGA can complete a fixed and fundamental FPGA capability that will continue to be number of tasks superior to a processor. It’s like the processor critical to the success of FPGAs across broad markets. is a generalist, but an FPGA can do functions exceptionally well. No one size fits everything. Wrigley, Lattice: At Lattice, we understand bridging, expan- sion, acceleration and so forth are fundamental applications You can have the FPGA do everything, or the processor do that bring value to our customers. All standards including everything, or they can share the load. Each of these sce- MIPI are central to the capability to offer complete solutions narios has their benefits. The trick is to pick the tasks and with our support of our entire ecosystem that includes IP functions that require the fastest response time, zero latency vendors, board vendors, EDA vendors, distributors and so and really low power—stick those in the FPGAs. The trick is on. MIPI in particular is interesting because it’s an evolving to do a little as possible in the processors in order to avoid standard that’s coming out of the mobile space and being thermal, power and I/O limitations. applied across many markets. Given our exposure to the con- sumer mobile space, Lattice is uniquely positioned to provide Karazuba, QuickLogic: Absolutely. Ultra-low-power FPGAs, MIPI solutions involving DSI and CSI-2, for example, across like those from QuickLogic, can perform all the sensor a wide variety of non-consumer applications. At the end of fusion functions of MCUs at a fraction of the power con- the day, we excel at bridging new interface standards such sumption (1/30 in some cases). This can be done without any as MIPI to traditional standards with the lowest size, power, negative effects on contextual awareness, as well as (most cost and performance. importantly) user experience. Karazuba, QuickLogic: Interface standards will always maintain a high importance on vendor roadmaps, as I/O connectivity remains an important function for FPGAs. However, MIPI, due to its high speeds and requirements for

8 Engineers’ Guide to FPGA and PLD Solutions 2014 Special Feature

PHYs, may not make the most sense to be done in program- kets and applications in a relatively short period. Well before mable fabric versus hard logic. the introduction of Zynq, system architects grappled with the challenges associated with two-chip solutions that mar- ried an embedded processor to an FPGA. System architects prefer to solve their problems using only one device if at all EECatalog: The IP ecosystem is necessary in FPGAs used as possible for reasons including BOM cost, design simplifica- bridges between standards such as MIPI, PCIe Gen 1 and 2, tion and footprint. Additionally the system architect would etc. What’s happening in the ecosystem to support this? like to be able to seamlessly partition their design between hardware and software making performance/area trad- Myron, Xilinx: We continue to see an-ever expanding eco- eoffs throughout the design cycle. If at all possible, system system of IP suppliers committed to supporting current designers would also like to view hardware accelerators as and emerging connectivity standards. The market for IP memory-mapped functions without any consideration or need suppliers that support a wide range of functions including for understanding the underlying programmable logic used connectivity continues to grow as FPGAs continue to to implement these accelerators. Ideally, changing the design expand their reach into a growing boundaries between hardware and range of applications. software should not mandate PCB Ultra-low-power FPGAs… can re-design. Through a combination Wrigley, Lattice: An FPGA vendor perform all the sensor fusion of the Zynq All Programmable by definition needs to work with SoC—which combines a dual-core many technology providers. As I functions of MCUs at a fraction ARM Cortex-A9 MPCore processor mentioned earlier, IP is just one with high-performance Xilinx component of the ecosystem for us of the power consumption 7-series programmable logic— as we offer complete solutions that and the supporting Vivado Design include boards, reference designs (1/30 in some cases). Suite—which includes high level and of course development tools. synthesis, HLS, which complies C language to FPGA gates—system From the Lattice perspective, we see the appropriate eco- architects can now achieve the above design goals in the ideal system support as critical for enabling us to support the world. The Zynq SoC’s potent combination of processors, rapid development cycles customers are facing. We see three silicon and software opens up the world of high-performance different activities that are very important. processing using All Programmable SoCs to embedded programmers through a wide range of embedded operating 1. Providing IP, often free, that supports interfaces and systems including Linux. standards that have to be tightly coupled with our silicon, such as MIPI or PCIe. Karazuba, QuickLogic: With the addition of new resources and better development environments for Linux on ARM/ 2. Participation in key standards groups such as the MIPI Alliance. FPGA SoCs, it is expected that embedded designs will con- tinue to see faster system responsiveness and better use of 3. Choosing critical functions and finding partners for real-time applications. Time-critical applications such as implementing the IP. disaster monitoring, spatial awareness and precise gaming will be more readily available. Karazuba, QuickLogic: The rapid speed of development of mobile devices has allowed some IP for mobile-centric inter- faces such as MIPI to become available quicker than in the past. However, that rapid development can lead to situations EECatalog: Intel’s X1000 “Quark” SoC is an x86 low-power where IP, once openly commercially available, may already be core that can be instantiated in Intel’s (and theoretically, a generation or more behind the leading-edge developers. other’s) design rules. If available, would it be a probable core to include in an FPGA as an alternative to ARM such as in Zynq?

Myron, Xilinx: While processor size, performance, power EECatalog: Recent ARM/FPGA SoCs have opened up new efficiency are important factors when considering the poten- opportunities for Linux programmers. What changes do you tial for integration in an FPGA, equally as important is the expect to see in embedded designs to take advantage of this? supporting ecosystem, which not only includes the breadth of the ecosystem but also the alignment of the ecosystem Myron, Xilinx: The introduction of the Xilinx Zynq All Pro- to the embedded markets where all-programmable SoCs are grammable SoC has been a key industry innovation that has being employed. While Intel is clearly making strides with already seen very wide adoption across a wide range of mar- the introduction of Quark, ecosystem support and the align-

www.eecatalog.com/fpga 9 Special Feature

ment of the ecosystem will play a key role in consideration devices, which contain as many as two million logic cells, as a candidate for integration. While Xilinx cannot speculate require ASIC-strength design tools to place and route large on Intel’s future plans for the Quark synthesizable processor, designs and to achieve requisite quality of results (QoR). The others have: http://seekingalpha.com/article/1792672-intel- silicon, software, IP and technology investments required wont-build-arm-mobile-chips to deliver today’s high-density programmable-logic devices can only be realized through leading-edge (or bleeding-edge) Wrigley, Lattice: We expect FPGAs to continue to coexist with process technology including stacked silicon interconnect processors for some time to come. One of the most common technology (SSIT) and require financial commitments and applications that our FPGAs are used for is I/O expansion technical acumen that dissuades new players from entering and hardware acceleration for MCUs and microcontrollers. these markets. Our view is that the industry has yet to define a good inte- grated architecture that combines FPGAs and processors Wrigley, Lattice: In many case they won’t because markets on a single piece of silicon. As is often the case, these solu- are diverging. What we at Lattice have learned is that high- tions are miss-optimized or not end and low-end customer needs optimized for the vast majority of are completely divergent, and as applications. We continue to look, I/O programmability and the everyone knows, you can’t scale but we don’t see the answer. We do one architecture to serve both see continued interest in soft pro- ability to support multiple high-end, low-end and ultra-low- cessors and we are happy with the end. This is particularly the case for uptick of designers using either signaling standards is a consumer and similar applications the LatticeMico8 or LatticeMico32 unique and fundamental FPGA where the FPGA power consump- open source processor cores. tion has to be in the micro-watts, We understand that customers capability that will continue and have to use low-cost, small are going to invest a lot time in WLCS packaging. It’s also the resources building their own to be critical to the success of difference between traditional proprietary architecture based on ‘heavy-iron’ applications the high- our technology, which is why cus- FPGAs across broad markets. end serve and the newest gadgets tomers can utilize our soft cores that are selling to consumers in and take them to ASIC afterwards the millions of units every day. if they want. Increasingly at Lattice we’re not seeing the other FPGAs as Karazuba, QuickLogic: If available, it is definitely probable competition as we continuously drive to the lowest cost and that the Intel X1000 SoC could be integrated with the pro- size. For the implementation of functions that we commonly grammable fabric of an FPGA. The limiting factor is being tackle, we’re seeing our customers consider Lattice FPGAs as able to fit the components together, while keeping the power more flexible alternative to micros and ASSPs. consumption, and physical dimensions within the require- ments of designers. Karazuba, QuickLogic: High- and low- density FPGAs will continue to serve mostly separate markets. High-density Providing an ARM core in an FPGA or an FPGA-based pro- FPGAs will continue to well serve the industrial, scientific cessor is fine only if it can be used effectively in the system. and aerospace markets, whereas low-density, low-power System integration involves collaboration between hardware FPGAs will continue to expand I/O connectivity and sensor and software engineers. Success lies in the tools and support hub applications in mobile devices. (provided by the hardware vendors) that enables this col- laboration to take place, and often times that means making the programmable hardware appear more like a traditional Cheryl Berglund Coupé is managing editor of EE- processor development flow. Catalog.com. Her articles have appeared in EE Times, Electronic Business, Microsoft Embedded Review and Windows Developer’s Journal and she has developed presentations for the Embedded EECatalog: How will high- and low-density FPGA manufac- Systems Conference and ICSPAT. She has held a turers compete moving forward? variety of production, technical marketing and writing positions within technology companies and agencies in the Northwest. Myron, Xilinx: High-density and low-density FPGAs focus on delivering the right mix of features, performance and power, at the right price. Productivity both through software tools and IP solutions also plays a key role in affecting the customer’s decision. Today’s largest programmable-logic

10 Engineers’ Guide to FPGA and PLD Solutions 2014 Special Feature

Case Study: Location-Aware Public Transit Drives Custom I/O Controller Card Development Challenges included J1708 serial protocol, multi-vendor subsystems, balancing current and future requirements.

By Bill Brown, Avail and Patrick Dietrich, Connect Tech

The modern public transport bus is more than what you might the ITS subsystems shown here. From the upper left of the remember. There’s still a driver, money box, seats and a couple diagram in Figure 1 and moving counter-clockwise: of doors. But there are also digital signs on the outside that change depending upon the daily route, electronic audible Automated Next Stop Annunciation: GPS-based location location annunciators calling out the stops, digital passive input triggers audible announcements for next stop and other passenger counters, and security systems designed to protect announcements such as sporting events, festivals or public safety. passengers and the driver. The on-board internal signage is changed to reflect location.

Plus, in an ever-connected and cloud-based M2M world, the GPS-Triggered Headsign Change: Front-of-bus signage (and modern location-aware bus interacts with passengers using other exterior signs) change to reflect bus route. Detours and GPS input and reports telemetry and data stats back to a route changes are automatically reflected based upon GPS computer-aided bus dispatch or depot location. This electronic input but under computerized dispatch control. Special event wizardry requires multiple subsystems and vendors, several information can be remotely transmitted and updated. RF connection pipes (including analog voice), and myriad soft- ware and protocols—all interconnected on industrial vehicle Security Camera System: Standalone system locally records and LANs such as CAN 2.0b running J1939 and the mature serial may transmit audio/video, depending upon RF WAN such as cel- 2-wire SAE J1708 standard. lular or access to Wi-Fi hotspots at fuel depot or other location.

In this article we’ll describe the technical rationale for how ControlPoint Automatic Passenger Counter (APC): Collects intelligent transportation systems (ITS) integrator Avail and broadcasts real-time statistics about passengers boarding/ Technologies spec’d out two unique I/O cards from rugged leaving the bus in compliance with local transit authority and supplier Connect Tech. Federal Transit Administration regulations.

Sub-System Architecture Single Point of Login: This sub-system eases the driver’s login One card (PCI Card) uses a PCI bus to plug onto a rugged but workload by avoiding having to log into each sub-system indi- standard fanless PC located in the in-vehicle unit (IVU). The vidually. Upon start of day, driver enters employee and route I/O and protocols on the card are anything but standard. information into the onboard mobile data computer (MDC), The other card (RCU Card), located in the aptly named radio which authenticates all systems resident on the J1708 and/or controller unit box, has more specialized digital I/O but CAN network(s). Information in regards to the driver’s daily interfaces with analog radio and PA equipment. At their sim- work assignment, vehicle location and passenger load is also plest, both Connect Tech cards have to deal with inherently relayed to operations personnel. “dirty” vehicle power and industrial temperatures, plus the non-trivial nuances of already-deployed legacy hardware and Vector 9000: Installed near the driver to provide HMI GUI for software such as the J1708 vehicle network. control and monitoring of all onboard systems. Includes cel- lular communications for WAN dispatch and emergency data As shown in Figure 1, the transport bus consists of multiple transmission, this is also where the driver logs onto the sys- subsystems, all interconnected via backbone networks joined tems through interface to the single-point-of-login subsystem by the serial J1708 and/or CAN 2.0b with J1939 protocol. The using a single workload ID. All the subsystems configure and vehicle contains two separate J1708 networks: one for the sync up to the bus’s route remotely via the driver’s action. drive train functions (engine, transmission, etc.) and one for

www.eecatalog.com/fpga 11 Special Feature

Figure 1: The modern public transit bus is loaded with interconnected, intelligent subsystems. The black line represents the SAE J1708 2-wire electrical bus, or the updated CAN 2.0b network running J1939 protocols. (Courtesy: Avail Technologies.)

Inputs/Vehicle Health: Discrete inputs from various on- IVU Dissected: PCI Card board sensors or mechanical attributes, analog and digital The IVU is the primary controller responsible for overseeing inputs, and voltages mux together onto the J1708 and/or all of the ITS subsystems on the vehicle and also needs to J1939 CAN. These inputs directly feed the IVU. Drive train accept chassis I/O inputs from the drive train engine control information from engine controller, transmission, automatic unit (ECU), transmission, automatic brakes and other body brakes and so on is available via CAN/J1939 to the PCI Card control modules. on the IVU. Discrete inputs such as vehicle doors status, wheelchair lift/deploy, stop request, lift request are fed to The central hub to all the interfaced systems, the IVU houses the IVU via the integrated digital I/O (DIO) module. Vehicle an industrial, fanless Intel Atom-based PC motherboard with voltages, battery back-up and ignition status are monitored two PCI slots that is capable of operation from -30°C to +80°C. and reported from the RCU. One slot currently houses the vehicle’s bulk data wireless LAN; the other slot is for the PCI Card we’ll describe in a moment. In Vehicle Unit (IVU): This is the main controller for the The IVU runs a componentized Windows Embedded Standard Vector 9000 driver HMI. The Connect Tech PCI Card resides 7 with drivers to common UART interfaces, plus a storage here, installed into a rugged, fanless PC motherboard running function is handled by a flash-based solid-state drive (SSD) for Windows Embedded Standard 7. The IVU interfaces with all event recording. The IVU monitors chassis voltages such as the onboard systems at least via serial J1708 LAN for status and/ ignition position, battery, and brake lights. As well, the IVU or health monitoring. Wi-Fi bulk connectivity added to moth- is cognizant of the bus’s door position(s), wheel chair lift, and erboard for video, location and other large data transfers. other onboard electro-mechanical functions. The IVU tracks these inputs, plus J1708 LAN messages and wirelessly sends Communications and Radio Controller Unit: Two-way radios, an alert to the computer-aided dispatch (CAD) should levels audio files and bulk communications are transmitted wire- fall outside certain thresholds. Needless to say, the combina- lessly to the vehicle. RCU box is mounted near the driver and tion of native sensor voltages, custom I/O, serial J1708, plus contains Connect Tech RCU Card with interfaces to multiple CAN 2.0b J1939 protocols made designing the IVU hardware onboard digital and analog sensors, audio switching for radio and software challenging. channels and PA input/output, and other I/O such as ignition, odometer, and more. RCU also handles graceful bring-up/shut Connect Tech was chosen by Avail partly because no off-the- down of several real-time embedded onboard systems. shelf PCI card had the requisite I/O functions. In addition to the existing subsystem interfaces described above, Avail is planning for future ITS systems with interfaces such as wire-

12 Engineers’ Guide to FPGA and PLD Solutions 2014 Special Feature less LAN, Bluetooth or cellular. So the Additionally, Multi-Tech modules— PCI Card also needed to allow for fea- although not today installed—can draw tures expansion via three Multi-Tech up to 1.75V (peak) from a 5V supply. module sites (Figure 2). Accommodating nearly 30W in future expansion, plus existing component cur- Figure 3 shows the block diagram of rent draw, lead Connect Tech to design the PCI Card. The serial module I/O for a variety of card input voltages: connector blocks are the Multi-Tech 3.3/5/12VDC. For 12VDC input, an HDD- sites: one CAN controller node is for style power connector was considered. the current implementation during the market’s transition from serial J1708 to The IVU also contains an integrated design CAN J1939, while the other two CAN that delivers isolated digital I/O channels ports are for future system upgrades. Figure 2: The Connect Tech-designed “PCI Card” that assist with health monitoring. 16 provides J1708 and CAN J1939 interfaces, along channels of isolated input and 16 chan- with three (3) Multi-Tech module sites for future An Actel ProASIC3 FPGA acts as expansion I/O such as Wi-Fi, cellular, Bluetooth, nels of isolated output are included in this glue logic, computation and protocol and so on. DIO function whose interface is provided implementation. It was used pri- by a 68-pin connector. marily because there are no off-the-shelf J1708 controllers on the market (only PHY transceivers). More importantly, J1708 Radio Controller Unit; RCU Card bus timing is difficult to get right so Connect Tech had to create The Radio Control Unit houses the card of the same name and is FPGA IP to accurately deal with the protocol and bus timing. a separate box from the IVU to allow flexible vehicle mounting locations. In one of Avail’s customer installations, the RCU is The FPGA also implements PCI 2.2 and the J1939 protocol, located in the Radio Equipment Box behind the driver, but it realized via off-the-shelf CAN 2.0b controllers resident in a could just as well be mounted under the dashboard near the Microchip PIC32. The PCI Card mechanically is a standard 4.2 driver’s two-way radio. Since transit authorities often use dif- x 6.6 inch size, and uses industrial temperature components ferent model year busses or even different bus manufacturers, that exceed Avail’s operating range (refer back to Figure 2). The the standalone RCU can be installed wherever it fits best with J1708 and J1939 interfaces use a shared Micro-D connector, access to bus signals. while the other Micro-D houses the remaining two CAN inter- faces mounted on a custom front panel/bracket designed for Primarily the RCU card provides the interface to all of the IVU box mounting. bus’s radios: voice, low-rate data and bulk data. It provides channel steering which provides input to radio(s) to change the Because of these unique vehicle inter- faces, Connect Tech was required Connect Tech Inc. - Custom PCI Design for Avail Technologies to write all the firmware and create Windows drivers for the PCI Card. PCI Connector The board can be programmed, debugged and tested via JTAG run- ning across the edge connector. FPGA PCI Core We mentioned above how vehicle power is inherently dirty in a truly harsh multi-season environment. Control Logic & Register Set This is due to alternator voltage variations, reduced battery output External to FPGA at low temperatures, unpredictable Quad UART electrical loads, and the occasional UART UART UART UART CAN Controller CAN Controller CAN Controller abrupt ignition off/on cycle. Con- nect Tech designed all the I/O to be CAN CAN CAN Serial Module Serial Module Serial Module J1708 Transceiver Transceiver Transceiver galvanically isolated from DC and/or I/O Connector I/O Connector I/O Connector Transceiver CAN & J1939 CAN & J1939 CAN & J1939 (GPS, other) (GPS, other) (GPS, other) compatible ground loops, and floating grounds. compatible compatible The J1939/CAN and J1708 inter- faces are galvanically isolated on Board Bracket Hole for Hole for Hole for the PCI Card; low speed and higher SMA SMA SMA I/O Connector I/O Connector voltage I/O signals use phototrans- Connector Connector Connector istor optocouplers. Figure 3: PCI Card block diagram. Note the PCI interface implemented in an FPGA, where J1708 and J1939 protocols are handled as well. The CAN controllers are based upon a Microchip PIC32.

www.eecatalog.com/fpga 13 Special Feature

Connect Tech Inc. - Custom RCU Design for Avail Technologies The RCU uses three DPDT relays to handle the analog audio switching described above. There are also analog gain circuits for audio signal

POWER conditioning. To make the RCU box as easily 32-Bit Low Powered Switching Regulators Microcontroller Isolated DC/DC Convertors installed as possible, the RCU Card requires Smart Voltage Monitoring and Telemetry only a single 12VDC input (nominal), though it can function from a droopy 10.6V to a max of 16V. Connect Tech designed the RCU with

Isolated Isolated AUDIO DC/DC converters to provide all onboard lower RS-232 SB-9600 ADCs CAN/J1939 GPIO Conditioning, Switching Relays digital voltages. Programmable Microphone Amplification, Bias & Attenuator Intelligent Transportation, Realized What’s most striking about the two custom

RCU EXTERNAL cards designed for Avail Technologies’ ITS MULTI-IO CONNECTOR systems is not the sexiness of the technologies used; rather, the cards implement mature and Figure 4: Radio Controller Unit Card block diagram. unexciting I/O like RS-232, CAN, SPI and GPIO. But the design elegance is making these cards channels; automatic gain control for microphone audio input work with myriad real-world transit vehicle interfaces. Creating conditioning; digital potentiometers for additional audio FPGA IP that implements the difficult SAE J1708 protocol and conditioning, and switching relays to change audio source or bus timing was non-trivial, as was coding the PIC32 to talk destination (for example: the driver using the mic for onboard CAN’s J1939. Galvanically isolating many of the signals to PA to passengers versus talking to dispatch). protect the boards (and their higher order box-level systems) demonstrates practical knowledge of But for flexibility to install the functioning, legacy systems. RCU in as many future vehicles as possible, the RCU also functions Of course, as public transit buses as an I/O hub (Figure 4). Avail further evolve with future location- Technologies wanted Connect based services, onboard contextual Tech to maximize the RCU Card’s advertising, rider app integration I/O handling to standard and and whatever other technology proprietary transit industry inter- might become available next year, faces, so the RCU also accepts A/D Avail Technologies’ systems are inputs, CAN, I2C, SPI, RS-232, already capable of future upgrades. RS-485, GPIO and more. Not all This is partly due to the thoughtful of these are implemented in the expansion capabilities built onto the diagram shown in Figure 1 and go Figure 5: RCU Card. Note the PIC32 with myriad I/O, and the Connect Tech PCI and RCU Cards. unused for now. three DPDT relays for audio switch control.

Most of these signals are realized and controlled via a Microchip PIC32 to assist with smart Bill Bown is the Lead Project Engineer on the Avail control capabilities. Like the PCI Card, signals are optoisolated Next Generation In-Vehicle system. He received a to control against power anomalies. As well, the RCU partici- Bachelor of Science in Electrical Engineering from pates in the vehicle health monitoring shown in Figure 1 and the University of Akron. Mr. Bown has more than can either switch to a battery back-up, or alert the digital IVU 20 years of hardware engineering experience and and Vector 9000 Mobile Data Computer to perform graceful continues to focus his attentions on Avail’s next OS and processor shutdown if needed to avoid over/under- generation of in-vehicle solutions. voltage damage.

For its part, Connect Tech designed the card to meet all of these Patrick Dietrich is the embedded R&D lead at requirements (Figure 5). CAN and UART controllers are in the Connect Tech. He received a bachelor of science PIC32 and control a single CAN 2.0b with J1939 protocol that in Engineering from the University of Guelph in is also optoisolated. There are two RS-232 interfaces. Changes Canada, is an IEEE member, a licensed Profes- can easily be made to accommodate RS-422 or RS-485 for sional Engineer, and an active member in various alternate vehicle installations. There are 36 GPIO lines (some embedded consortia technical committees. optoisolated), and six 10-bit A/D channels.

14 Engineers’ Guide to FPGA and PLD Solutions 2014 Special Feature

FPGAs Enable a New Class of Always-on Applications New ultra-low-power FPGAs provide system designers the flexibility to create completely customizable low-power sensor-management solutions to bring new classes of applications to life.

By Subra Chandramouli, Lattice Semiconductor

There’s been a pronounced use of sensors in consumer electronic products over recent years as manufacturers compete with each other to offer exciting functions that make devices more user-friendly. In fact, it’s common to find about a dozen sensors in any given smartphone today, and with smartphone shipments growing rapidly, the sensor market is witnessing unprecedented growth. The emergence of the Internet of Things (IoT), wearable computing, next-generation mobile health equipment and wireless sensor networks—all made possible in part by the growing use of sensors—will no doubt exponentially increase sensor use even further. By some estimates, the sensors market Figure 1: Using an FPGA for gesture recognition alone is expected to be a 16B unit market by 2016 (Source Yole). The types of applications using sensors are Always-on Gesture Recognition varied and disparate, but the common theme across all mar- Although smartphones have incorporated sensors for several kets is the requirement to use sensors in an always-on mode, generations now, few use these sensors to their full potential thus necessitating the use of ultra-low-power components. because of power consumption. For example, if the accelerom- eter, magnetometer, barometer and gyroscope are used in an High-end smartphones typically include several sensors always-on fashion to augment the GPS sensor, one can create such as barometers, thermometers, gyroscopes, multiple a dead-reckoning application without manual activation by cameras, touch, accelerometers, magnetometers and GPS, the user. This would be useful to create an indoor navigation just to name a few. Smartphones of the future will likely have application. While all the components already exist in many even more sensors with more valued-added features, such as smartphones to perform this function, using sensors in an the monitoring of human vital signs as one example. With always-on mode and computing the data to create an inertial this increasing use of sensors in a single device, high-end navigation system is extremely power-hungry. Such a solution smartphones and other products are designed with dedicated would drain the smartphone battery within hours. hardware to manage sensors, thus creating an independent sensor subsystem. Independent sensors operating in isolation However, even with the power limitations of today’s smart- provide interesting data; however in some situations it is the phones, several always-on features could still be implemented fused data from multiple sensors that make the data useful using ultra-low-power FPGAs. to create compelling applications. For example, data from a gyroscope, accelerometer and magnetometer are together Figure 1 illustrates an implementation of a ultra-low-power used to create an inertial navigation system. In addition to FPGA used as a companion chip to the host, (application pro- managing all sensors, the dedicated sensor manager chips cessor or sensor manager MCU ) to enable always-on gesture (typically an MCU) also perform sensor-related tasks such recognition. Independent I2C masters may be used to manage as fusion. However, MCU-based sensor managers most often different sensors to ensure the right throughput is achieved consume too much power to be used in an always-on mode, from each sensor. The always-on sensor manager may be limiting the use of sensors. designed to continuously monitor the accelerometer to detect specific patterns. After a predefined gesture has been detected, the always-on sensor manager can interrupt the host to take

www.eecatalog.com/fpga 15 Special Feature

passed along to the application layer to create the use case.

Always-on Pedometer Pedometers are very popular among health enthusiasts. Since the smartphone is physically with the user through the entire day, it would be natural to embed a pedometer within a smartphone. Most Figure 2: Context awareness using a low-power FPGA pedometer applications available on smartphones today drain the battery too quickly to make it useful. The power issue can be overcome by using an ultra- low-power FPGA to implement the entire pedometer solution in a single chip.

Figure 3 shows how an ultra-low-power FPGA can be used to implement a pedometer solution in a single chip. In this illustration, the FPGA and the Figure 3: Pedometer using low-power FPGA accelerometer are used in an always-on mode and the entire pedometer logic is implemented within the FPGA. The acceler- action or enable other sensors, for example. This design approach ometer output that is noisy is first filtered and this output is would enable the host to be in standby until a valid gesture has passed along to a step counter block. Embedded RAM within been detected and thereby reduce total system power. the FPGA may be used to keep count of the steps. The host can poll the FPGA as required to read the number of steps taken. Always-on Context-Aware Smartphone: Parameters such as calories burned, distance covered, etc. may The accelerometer along with an ultra-low-power FPGA may be computed at the application level based on the step count be used to implement a new class of always-on, context-aware information. The advantage of this implementation is that the features. Context awareness is a class of applications that sense accelerometer and the FPGA can work independent of the rest the mobile device’s environment and adapt to surroundings. of the system to enable a low power pedometer solution. For example, if the mobile device can detect that it has been placed on a table, it can then decide to stop polling the GPS Conclusion sensor and thereby save total system power. While there are Ultra-low-power is a fundamental requirement for sensor many interesting use cases for context-aware devices, power management across multiple markets. The use of always-on consumption has kept this trend at bay. In order to implement sensors in smartphones is going to create innovative appli- power-efficient context-awareness in a mobile device, it is the cation and use cases over the next few years. In other areas context-change detection that needs to be performed in an such as the Internet of Things and wireless sensor networks, always-on fashion. Once the change is detected, it can then be sensors and actuators are being used extensively to create classified into a pre-defined state like in hand, on a table, in a compelling application. New ultra-low-power FPGAs provide car and so forth. system designers the flexibility to create completely customiz- able low-power sensor management solutions, bringing new Figure 2 illustrates the use of the ultra-low-power FPGA used in classes of applications to life. an always-on mode to implement context awareness in a mobile device. Here, the FPGA and the accelerometer are always-on— i.e., in active mode. On the host side, the FPGA is interfaced Subra Chandramouli is a senior marketing man- to either the application processor (AP) or the sensor manager ager at Lattice Semiconductor. He has been MCU chip using a standard SPI bus. The FPGA interfaces to working in the mobile handheld industry for the the accelerometer using an I2C bus and constantly reads its last 15 years. Prior to joining Lattice, Subra was output. This data is buffered within the FPGA and passed on a successful entrepreneur and has worked at to the context-change detector. The intelligence required to Cypress semiconductors and Intel Corporation. detect a context change is implemented within the FPGA along Subra has one patent and 4 patent application on file all in the with the interrupt logic to wake up the host when a change is field of flash memory design. He has an MBA from Duke Univer- detected. The host can then read the buffered accelerometer sity, NC and a M.S.EE from Wright State University, OH. data to classify the context change into relevant classes such as on table, on hand, etc. Finally, the device context may be

16 Engineers’ Guide to FPGA and PLD Solutions 2014 Special Feature Solving Signal Conditioning and Processing Challenges in Multi- Sensor Applications with FPGAs MicroTCA is a low-cost, open-standard architecture increasingly being adopted in mil-sensor processing, RADAR, network security and other applications utilizing FPGAs.

By Ian Shearer, VadaTech, Ltd

Many mil/aero and other applications require the data from multiple sensor inputs to feed into highly connected FPGA clusters. In many of these applications, there is a desire to minimize the size, weight and power (SWaP) utilized, and this is particularly the case if the system is in an air- borne platform. With budget cuts, even mil/aero projects are demanding more cost-effective solutions. So how does one find a lower-cost, compact solution that can handle the signal conditioning, pre-processing and post-processing/ formatting in the FPGA cluster?

MicroTCA-based FPGA Cluster MicroTCA is a low-cost, open-standard architecture increasingly being adopted in mil-sensor processing, RADAR, network security and other applications utilizing FPGAs. The advantages are high performance, a vast eco- system of off-the-shelf products from multiple vendors and low cost in a proven architecture. Since MicroTCA is truly COTS and utilized in many bleeding-edge telecom/ networking applications, designers benefit from the latest technology advantages and economies of scale from numerous markets.

For a mil-sensor application, let’s take a 6-slot cube (12.25” tall x 5” wide x 9” deep) as an example. See Figure 1 for an application-ready platform example. For high-throughput Figure 1: An application-ready platform for mil-sensor processing with signal conditioning and processing, up to five Virtex-7 5 FPGAs, 1 PrAMC, integrated AC power and shelf management. FPGAs can be utilized. It is important that each module is supported with significant local memory for intermediate networking tasks. See Figure 2 for a model showing the data storage with high-bandwidth access. Since there is interaction between the AMC modules in the system. a large amount of data sharing between the FPGAs, they need to be interconnected by high-bandwidth links for Solving Inherent Challenges low-overhead data transport. Two 10GbE ports can be used Packing this much processing power in a small size is with channel bonding to provide an aggregate output rate bound to create challenges, including thermal manage- of up to 20Gbps. The data input with specific sensor inter- ment. Dissipating heat from power-hungry FPGAs can be faces would come via the FPGA mezzanine cards (FMCs) a hurdle. Having the option to support full-height (6 HP) which plug into the advanced mezzanine cards (AMCs) of AMCs is an advantage, allowing larger heat-sinks than MicroTCA. In this example, the post-processing and data would be possible with mid-height (4 HP) modules. In this flow management to the network is best handled via a host chassis example, a design trick was to tilt the AMCs by a general-purpose processor (GPP). An octal-core QorIQ pro- few degrees. This small tilt assists airflow in the entry and cessor such as a P4080 can do the trick. It provides strong exit plenums (in a push-pull configuration with front-to- processing capability and includes hardware acceleration for rear airflow) and is vital in fitting the required thermal performance into a 7U chassis. Failure to incorporate this

www.eecatalog.com/fpga 17 Special Feature

to Virtex-7 for example). This allows users to balance processing power against cost and thermal constraints.

Versatility One of the nice things about MicroTCA is that the specification defines the pinout for the options of GbE, PCIe, SRIO or XAUI interfaces. These ensure interoper- ability across platforms and from multiple vendors. In addition to the base (GbE) and switched fabric (PCIe, SRIO or XAUI) inter- faces, the MicroTCA standard incorporates an ‘extended options’ region that supports high-speed, point-to-point connections. These pins can be routed to Aurora con- nections on the Virtex-7 FPGAs housed on the AMCs. The connectivity provided lends itself to a high-bandwidth processing pipe- line where a level of data reduction happens in the initial processing stage.

Another tool that makes life easier in development is the JTAG Switch Module. It provides a direct plugging option to ease Figure 2: This diagram shows the general processing chain using FPGA programming. Also, having integrated AC or DC VadaTech’s AMC515 Virtex-7 FPGAs and AMC713 Freescale P52020 power options makes development easier. processor module.

With a wealth of FPGAs in the AMC form factor for One of the nice things MicroTCA, there are a wide range of capabilities available at various price levels. The FPGAs accept FMCs per VITA 57, about MicroTCA is that the adding to the mix dozens of vendors with all types of solu- tions. Utilizing an application-ready platform for sensor specification defines the processing can be a time-saving step to help a designer pinout for the options of GbE, with their solution. PCIe, SRIO or XAUI interfaces. www.vadatech.co.uk

www.vadatech.com

solution would have required the chassis size to increase to 8U for equivalent cooling performance. Ian Shearer heads up VadaTech’s UK subsid- iary, managing sales and technical support Another concern in the FPGA cluster is stressing the for the EMEA region. He moved to VadaTech system host while performing the signal conditioning from Mercury Computer Systems, where roles and processing. To avoid this issue, a typical solution is included sales, product management and busi- to incorporate a QorIQ or other processor into the design ness development. Previous roles, including a of the FPGA-based AMCs. This on-board ‘host’ makes the long period with BAE Systems, were primarily in R&D with module very versatile, supporting intelligent FPGA load a focus on signal processing. and initialization functionality. This can significantly reduce the workload on the system host and improve system boot times.

It is also advantageous if the FPGA-carrier AMC has options available to incorporate various levels of FPGA (from Artix

18 Engineers’ Guide to FPGA and PLD Solutions 2014 Special Feature Build or Customize an Open Source Software Toolchain? While significant effort is put into designing custom hardware architecture, it is equally important that developers have a highly complementary toolchain that leverages the unique properties and capabilities of the end device.

By Anil Khanna, Mentor Graphics

From mobile devices and home appliances to inside the Figure 1: In existence for over 25 automobile, embedded technology has ushered in a new years, the GNU Project continues era of powerful and interactive electronics. This evolution to offer popular choices for is facilitated by innovation of the hardware architecture toolchain components. and supported by developments in open source software, such as GNU, LLVM and Linux. Most embedded hardware, nity. The GNU Project (Figure like its PC counterpart, supports an operating system run- 1) is perhaps the best example ning user applications. Increasingly, these user applications of this concept, and as a result, are defining the identity of the embedded device. Enabling more developers use the GNU tool- embedded developers to take full advantage of the target chain for their custom OSS development. hardware and develop cutting edge applications is key to help differentiate one’s offerings. To do this successfully requires What’s a Toolchain, Anyway? the right set of tools. To answer the question whether it’s best to build a toolchain in-house, let’s first look at what constitutes a toolchain. A Differentiator to Success: the SDK Afterwards, we’ll look at the inherent challenges that arise The comprehensive software development kit (SDK) offered during the validation process. The term toolchain literally by a semiconductor provider or systems company is a critical means a “chain-of-tools” wherein the output of one serves element to ensure that embedded developers can success- as the input to the next tool in the process. This is the same fully adopt a hardware platform. So what is a SDK exactly? with any embedded development toolchain. A SDK normally consists of an application development environment with optimized libraries, numerous reference The GNU toolchain is more than just a compiler, though. designs, a simulator and ample documentation. All of these Hence, before building your toolchain, it’s important to components together help developers easily build quality identify the various components. For embedded software applications for the target platform. The development envi- developers, a complete toolchain generally includes: ronment or the toolchain is at the heart of the SDK. • C/C++ compilers While significant effort is put into designing custom hardware architecture, it is equally important that developers have a • Assembler highly complementary toolchain that leverages the unique properties and capabilities of the end device. Developing • Linker a toolchain in-house is not a trivial task and the expertise required to first build it, and then maintain it over the years, is • Runtime libraries often out of the purview of most companies—and beyond the core competencies of even the most experienced developers. • Debugger

At some point a company must ask whether it’s better to build • Debug stub(s) in-house, or purchase a commercially available toolchain. Because of the popularity and availability of a host of open • Integrated development environment (IDE) source software components, this article focuses on open source tools and corresponding components. Multiple choices exist when selecting and configuring each of these components. The decision is typically dependent upon, The concept of open source is founded on the idea of creating and specific to, factors such as your target architecture, sup- and openly sharing open source projects with the commu- ported language, user preferences, etc.

www.eecatalog.com/fpga 19 Special Feature

Building an In-House Toolchain users to install the toolchain, place the tools in the PATH, Once the type and version of components have been selected, and to manage uninstallation. On GNU/Linux, you may the steps toward building an in-house toolchain can turn want support for Red Hat Package Manager (RPM) packages, into a rather complex process. Essentially, the steps include: which are used on Red Hat Enterprise Linux and SuSE Enter- prise Linux. • Create your build environment Challenges Validating your Toolchain • Obtain the source code After the toolchain is built, you should validate it before deploying into a production environment. In general, the fol- • Make decisions about configuration options lowing techniques are used to test a compiler:

• Understand and perform the build process • Compiling programs and running them in the target environment. This verifies that the generated code behaves as intended. • Package the toolchain You can also use this approach to test the performance of the code generated by the toolchain. Every step of the process requires careful consideration as each decision can influence the next step and the overall • Compiling programs and inspecting the generated object file quality of the resulting toolchain. To begin, creating a build or executable image, without running the generated code. environment requires the selection of the right tools, aka This technique can be used to check that the generated code “build” tools. It is beneficial to build the toolchain using contains expected machine instructions, correct debugging GCC, but be aware if the the version of GCC you are using to information, conforms to a specified Application Binary build your toolchain has defects, it may incorrectly compile Interface, etc. the toolchain. • Compiling fragments of a program with multiple compilers Before you can build the toolchain, you must obtain all the and linking the fragments together. relevant source code. You should determine which version of This checks that the compiler you wish to validate interoper- each component will suit your needs. Also, ensure that the ates with a known-good compiler for the target platform. version you select contains support for your target hardware and that it’s compatible (at the source and binary level) • Compiling invalid program fragments and checking for with any software you’ll be incorporating into your project. appropriate error messages. Remember that in the free and open source software com- Often called “negative testing,” this technique checks that munities, version numbers do not mean the same thing as the compiler is correctly enforcing constraints. they do in other environments. For a proprietary software package, a particular version number or build number indi- There are a number of additional techniques you should apply cates the binary in use. In contrast, a version of GCC (such as to other components of your toolchain. Interactive testing 4.1.1) only denotes a version of the source code. The binary of the components, including the debugger and the IDE, are code generated from the source code depends on the configu- important check points in the validation process. ration options selected and the build environment used. Finally, there are a variety of testsuites available to perform Typically, a “configure script” is used to select the target plat- the various validation steps described above. Three of the form and specify which of the available features to include in most useful testsuites are: your toolchain. Other factors to consider include what CPUs will be used on your target systems, and will both big-endian • GNU regression testsuite: Based on the DejaGNU frame- and little-endian code be required? work, provides test coverage for GNU extensions to the C and C++ programming languages and the wide variety of The build process is itself complex, especially for a GNU/ features available in other tools. Linux toolchain. While the GNU coding standards suggest that all packages should be built with a uniform “configure, • Conformance testsuite: Used to validate the behavior of make, make install” sequence, there are variations on that your toolchain relative to published specifications. For theme in the various toolchain components. In general, it is example, the Plum-Hall C and C++ Validation Suites are necessary to provide additional options to each of the three comprehensive tests for conformance to the C and C++ pro- phases in order to obtain optimal results. gramming language specifications.

Once you’ve built your toolchain, you will want to package • Performance testsuite: Designed to provide data about the it for easy installation throughout your organization. On speed at which the code generated by the toolchain executes. Microsoft Windows, a graphical installer makes it easier for These testsuites can help to identify mis-configurations of

20 Engineers’ Guide to FPGA and PLD Solutions 2014 Special Feature

the toolchain, such as situations in which the generated code is using software floating- point on a system that supports hardware floating-point.

Having gathered data about which tests pass and fail, you must now evaluate the results. The GNU toolchain (like all toolchains) has defects. Therefore, in evaluating the toolchain you just built, you should attempt to deter- mine whether or not the failures are unique to your toolchain. When you have fixed the problem, you should consider contributing the change to the public source repository for the affected component so that others can benefit from your improvement, just as you have benefited from theirs. Figure 2: A commercial solution such as Sourcery CodeBench offers an out-of-box and fully validated toolchain along with a graphical IDE. Mentor Embedded also works with Building a fully validated toolchain is a sig- silicon companies and their partners, developing custom open source toolchain projects nificant milestone but it also signals the onset such as GNU and LLVM. of the next stage—maintenance. Dedicated resources are typically required to closely monitor the devel- within the existing Microsoft Visual Studio environment. opments in the open source community and to integrate any NVIDIA wanted to augment the current Windows ecosystem new enhancements back into your in-house toolchain. Of with Android app enablement—without disturbing the course, this also means the re-testing and re-validation of work area Tegra developers are comfortable using. The last your new toolchain. thing the company wanted was to have its developers spend valuable time on back-end toolchains, build configurations Customizing your Toolchain or arcane debug environments. The goal was to keep Tegra Although many times a toolchain built from the latest OSS developers focused on creating compelling apps that tapped components is satisfactory, there are occasions where cus- into the Tegra platform. tomization may be necessary. Customization requirements could be driven by several reasons—accommodating a novel Keeping Up with the Times hardware architecture for which an OSS solution is not avail- In the embedded space today there are an array of tools and able, or for addressing a specific developer base. environments for developers to use. For developers, the key is to be able to tap into the complexities of the hardware, to One option to customizing a toolchain is to purchase a fully “exploit the processor” and take advantage of the latest archi- validated, open source-based commercial solution. In doing so, tectures. For device manufacturers and system integrators, you still get to leverage the benefits of open source without the it’s about being able to modify or customize an OSS toolchain overhead and risks associated with a “go it alone” mentality. with as little effort and capital as possible. The trends in the industry indicate that a high-quality SDK is the secret weapon When identifying a toolchain partner that might assist with when facing many of these challenges. Further, many of the a customized toolchain, it’s important to consider the fol- larger companies are relinquishing their in-house toolchain lowing criteria: in favor of a commercial toolchain that has the inherent ability to be modified or customized as hardware complexity • Current experience and expertise in open source software increases its upward trajectory.

• Industry adoption and validation of the solution Anil Khanna has over 15 years of technical and product marketing experience with a background • Available technical support and end-solution customization capabilities in both design automation tools (EDA) as well as programmable logic hardware design. Anil is • Breadth of embedded open source solution offerings currently senior product marketing manager of Mentor Embedded Sourcery Tools. Prior to mov- Just recently, Sourcery custom solutions worked with ing into Mentor Embedded, Anil was responsible for worldwide NVIDIA to deliver technology for the NVIDIA Nsight Tegra market development for Mentor’s ASIC/FPGA synthesis solutions Visual Studio Edition product, a unique development tool including Catapult and Precision Synthesis. Anil holds a Masters customized exclusively to the needs of NVIDIA Tegra devel- in electrical and computer engineering from Portland State Uni- opers. The solution enabled Android application development versity in Portland, Oregon.

www.eecatalog.com/fpga 21 Special Feature

SoC FPGA Relies on Many Cores and 14 nm Tri-Gate Process Altera’s multicore FPGA uses Intel’s tri-gate (FinFET) 14 nm 3D process technology to combine logic, four ARM A53 cores and OpenCL as a heterogeneous SoC replacement.

By Chris A. Ciufo, Editor-in-Chief

At the core of Altera’s recent Stratix 10 announcement is… multiple cores. Four ARM Cortex-A53s to be exact, plus a bunch of DSP blocks, more logic than ever before done by the company, double the on-chip clock rate and software designed to create a multicore system instead of just a dense FPGA. In effect, it’s a system-on-chip (SoC) that bests the company’s recent Arria 10 SoC.

But also at the core of the 2014 next-gen FPGA family is Intel’s tri-gate (FinFET) 14 nm 3D process technology, to which Altera has the exclusive FPGA rights for 12 years. It’s Intel’s tri-gate transistors that give Altera the edge—at least on paper—in the war of high density FPGAs. Details on Altera’s Stratix 10 (the follow-on FPGA to the Stratix V) are still scarce, though the company’s second pre-announcement added more details and we can guess at others. Figure 1: Intel’s tri-gate transistors propagate Moore’s Law, increase The new multicore Stratix 10 FPGA SoC will tape out in 2014 silicon density, and reduce power. and may possibly ship by late 2014. Let’s take a look at some of the elements that make up the Stratix 10’s performance goals.

Tri-Gate’s Secret Sauce Altera is relying heavily on four factors to achieve the aggres- sive specs for Stratix 10: a refined architecture, the ARM Cortex A53, software tools such as OpenCL and Intel’s tri- gate 14 nm process.

Intel has consistently battled Moore’s Law through new process innovations. Following two-dimensional high K (dielectric) metal process improvements at 45 nm in 2007, Intel unveiled 3D transistors in 2011 in 22 nm feature size. As oxide thickness and transistor dimensions approach mere Angstroms, the company essentially made transistors Figure 2: Altera’s Stratix 10 SoC FPGA relies on four elements to “smaller” by stacking them in layers and wrapping the silicon achieve what the company calls “unimaginable” performance. Intel’s 14 nm tri-gate process is the root of the performance discontinuity channel with three gates instead of two. This allows more from current 28 nm FPGAs. (Source: Altera.) transistors per square nanometer by using the Z axis, and they’re smaller and more power efficient. The concept of a Intel’s 14 nm tri-gate process. Altera’s CEO John Daane said non-planar double-gate transistor with a fin wrapped around at the time that he believed Intel was two to four years ahead the channel is called a “finFET,” although Intel calls theirs of the competition. (This is debatable as TSMC moves from “tri-gate” (Figure 1). 20 nm to 16 nm.) Intel is already producing 22 nm 4th gen- eration Haswell processors on tri-gate and 14 nm Broadwell In early 2013, Intel and Altera inked a deal that allowed (5th generation Core) production is scheduled for 2014. Altera to become the only FPGA company with access to

22 Engineers’ Guide to FPGA and PLD Solutions 2014 Special Feature

Despite a recent delay that necessitated design rule changes to Intel’s 14 nm process, Altera as an Intel foundry customer will certainly follow shortly after Intel’s own Broadwell pro- duction. According to Chris Balough, Altera’s senior director of SoC product marketing, design software for the Stratix V will be available Q1 2014 and first silicon will tape out some- time in 2014.

“Unimaginable” Performance In June and in November, Altera began predicting some pretty big numbers for Stratix 10…so much so that the nomenclature leaped from Stratix V (available now) to Stratix 10. The company is making what Balough called a “hyperbolic Figure 3: Stratix V (current generation) DSP block. claim, something we never guessed was possible,” he told me.

processor system improvement. For Stratix 10, the magic Industry’s First Gigahertz FPGAs and SoCs again is Intel’s 14 nm tri-gate process as described in the whitepaper “The Breakthrough Advantage for FPGAs with • New ultra-high performance FPGA architecture Tri-Gate Technology” (http://www.altera.com/literature/ • 2x the core performance of prior generation high-end wp/wp-01201-fpga-tri-gate-technology.pdf). FPGAs • >10 TFLOPs of single-precision floating-point DSP Altera has provided no details yet on architecture other than performance the notional marketing chart shown in Figure 2. Fundamen- tally, Stratix 10 processing will take place in FPGA gates, four • >4x processor data throughput of prior-generation SoCs ARM Cortex-A53 processors and a series of DSP blocks. Soft- ware tools meld it all together into a powerful one-chip system. Break the Bandwidth Barrier with Unimaginable High-Speed Interface Rates DSP Blocks and ARM Cores Current-generation Stratix V devices contain two columns of • 4x serial transceiver bandwidth from previous generation DSP blocks surrounded by constellations of logic array blocks. FPGAs for high port count designs • 28 Gbps backplane capability for versatile data switching applications • 56 Gbps chip-to-chip/module capability for leading edge interface standards • Over 2.5 Tbps bandwidth for serial memory with support for Hybrid Memory Cube • Over 1.3 Tbps bandwidth for parallel memory interfaces with support for DDR4 at 3200 Mbps

Table 1: Stratix 10 performance targets. (Source: Altera.)

There will be a 2x increase (from 28 nm) in logic fabric speed to over 1 GHz. Even though density increases, the SoC device will also achieve a 70 percent power savings with a target 4x. Other technology targets are shown in Table 1.

There is credibility in these claims, albeit based upon scaling factors. The new, in-production Arria 10 SoC is based upon TSMC’s 20 nm process and is 15 percent faster than Arria Figure 4: Stratix 10 contains four Cortex-A53 cores, ARM’s most V, consumes 40 percent less power and boasts a 50 percent impressive multi-threaded 64-bit core.

www.eecatalog.com/fpga 23 Special Feature

Each DSP block (Figure 3) can be configured for up to eight 9 x 9 bit multipliers, four 18 x 18 bit multipliers and one 36 x 36 bit multi- plier. These DSP blocks run at 333 MHz and provide data throughput performance of 2.67 giga-MACs per block. The largest Stratix V device (EP1S80) has 22 DSP blocks.

With the forecasted 1 GHz fabric on Stratix 10 over Stratix V’s 333 MHz, one might expect a 3x perfor- mance increase with no logic changes. However, Altera’s Balough told me they’re expecting a 6x throughput increase for Stratix 10 to “greater than 10 TFLOPs.” It’s Figure 5: Stratix 10’s quad Cortex-A53 cores can run in 32-bit mode entirely likely the number of DSP blocks will double, RAM for upgradeability from previous generation devices. block sizes will increase and there will be some fine tuning within the blocks and routing fabric. However, it’s also L2 caches; both features pinpoint data center and high-end possible the 6x claim is taking into account other FPGA heterogeneous computing applications. To fully capitalize on resources for data movement and manipulation such as the Stratix 10’s logic, DSP and A53 resource elements, the com- ARM processors. pany is planning a suite of heterogeneous-focused tools.

An FPGA’s DSP sub-systems are a major reason FPGAs are Sweet Tool Suites chosen for high-performance algorithms like video CODECs Altera’s Chris Balough asserts that “all of Stratix 10’s pro- or data-plane processing. On the Stratix 10, there are also cessing elements are compelling enough by themselves,” but the four ARM Cortex-A53 processors. Altera press materials the “unimaginable performance” comes into play when they cite an 8x throughput improvement over 28 nm FPGAs, and work together as a system. Altera is counting on OpenCL and Altera says that ARM claims it’s the highest power efficiency the SoC Embedded Design Suite (EDS) to capitalize on the of any 64-bit processor. That’s probably not hard to argue FPGA’s heterogeneous elements. considering ARM’s leadership in mobile and low-power devices…and Intel’s struggle to catch up. The company has endorsed C-based design entry using OpenCL for several years via the SDK for OpenCL product. The four Cortex-A53 cores shown in Figure 4 are members of First demoed by Altera at SuperComputing 2012, the OpenCL ARM’s Cortex-A50 series announced in 2012 and are based design flow allows designers to work in C and easily mix logic on the ARMv8 64-bit architecture. The A53 chosen for Stratix and on-chip resources into heterogeneous multicore archi- 10 is the “little brother” to the A57, ARM’s single-threaded, tectures. The high-level design flow alleviates the typical RTL deep pipeline monster targeting servers. Both devices are coding required for most FPGA tasks. It also mimics the way designed for gigahertz performance in heterogeneous SoCs. embedded developers mix and match processing resources at For example, Stratix 10’s A53s will target applications that the board or system level. offload x86 host processors, while AMD’s HieroFalcon server accelerator 64-bit SoC (2H 2014) uses the Cortex-A57 to OpenCL code is also portable and lets designers load-level or complement AMD’s on-chip x86. performance-tune applications to take advantage of the mas- sively parallel nature of FPGAs and the Cortex-A53 CPUs. In Stratix 10’s A53s are software compatible with previous October of this year, Altera announced the SDK for OpenCL generation 32-bit ARM Cortex-A9s in Altera Cyclone, Arria is conformant to the OpenCL 1.0 standard and is listed on the and Stratix devices. This allows design sockets to upgrade Khronos Group list of conformant products. This means that to the latest FPGAs while migrating forward operating sys- Altera can “provide a validated cross-platform programming tems, unmodified software and IP cores (Figure 5). The A53 environment” designed to accelerate algorithms at signifi- can access up to 256 TB of memory with ECC for on-core L1/ cantly lower power versus other cross compiler alternatives.

24 Engineers’ Guide to FPGA and PLD Solutions 2014 Special Feature

QuickAssist Technology Capitalizes on FPGA Resources Conclusions: Altera’s Design Targets We’ve written quite a bit about Altera’s burgeoning roadmap Altera was the first FPGA company on record to support over the past twelve months. During an interview in July Intel’s QuickAssist technology, a set of instructions and 2012 with Brad Howe, Altera’s senior VP of R&D, he talked APIs that seek to offload host CPU instructions to a co- about the need for FPGAs to evolve past their growing com- processor like an FPGA. XtremeData was the board vendor plexity and focus instead on “silicon convergence” (http:// who demoed it at the 2008 Intel Developers Forum with a eecatalog.com/fpga/2012/07/17/fpgas-the-best-of-both- Xeon and Mathworks’ Simulink graphical block diagram- worlds-until-theyre-not/ ). He was of course referring not ming software. only to heterogeneous FPGA SoCs, but also on reducing design complexity with OpenCL. Altera’s Stratix 10, with faster transceivers, DSP blocks and 1 GHz fabric, is likely to be a favorite co-processor for Stratix 10, with its “unimaginable performance” made pos- Intel host CPUs like the latest 4th Generation Core devices sible by Intel’s 14 nm tri-gate process, epitomizes the future (code named Haswell) or Intel’s next-generation 14 nm of big density FPGAs. “It was an explicit choice to make tri-gate Broadwell CPUs. According to Intel, QuickAssist [Stratix 10] a heterogeneous computing platform,” said (http://www.intel.com/content/www/us/en/io/quickas- Altera’s Chris Balough. sist-technology/quickassist-technology-developer.html) is ideal for computational workloads including cryp- We’re anxious to see the company release details on the tography, data compression and pattern matching—all architecture, cite power estimates based upon notional applications that are algorithmically “heavy” and can take reference designs and detail more of the specs. As we went advantage of an FPGA’s parallelism via mixed resources. to press, Altera revealed some data on Stratix 10’s trans- ceivers (4x bandwidth and 28 Gbps backplane switching), In the case of Altera’s Stratix 10 next-gen devices, CPU and indicated that the device is 3D-capable for packaging offload via QuickAssist could also utilize high-level lan- with ASSPs and memories. guage OpenCL to construct heterogeneous “hardware macros” which balance workloads between the FPGA Also, it will be interesting to see if Altera targets Big Data fabric, DSP blocks and the decision-making capability of applications with Stratix 10 as I had predicted in June the ARM Cortex-A53. All of these resources, of course, 2013 (“Does Altera have ‘Big Data’ Communications on the work as an accelerator sub-system to Intel’s Xeon, Core Brain?” http://eecatalog.com/caciufo/2013/06/05/a-slew- and now the newest Rangeley Atom CPUs—all of which of-recent-altera-high-performance-announcements-over- contain Intel QuickAssist Technology. the-last-three-months-can-only-mean-one-thing-the-com- pany-is-targeting-big-data-in-a-big-way/#comment-20464 ). The company’s string of acquisitions and posturing sure points that way. Practically speaking, OpenCL allows Stratix 10 designers to port or create hardware accelerators to take advantage We’ll have to wait until 2014 for concrete information on the of the FPGA’s parallelism. This has always been possible Stratix 10 heterogeneous SoC FPGA and its heavy lift capabili- when coding in RTL but not necessarily practical and usu- ties. Until then, all this “unimaginable” information remains a ally not easily portable. The OpenCL SDK is complemented technology announcement. But a compelling one, indeed. by EDS which provides FPGA-adaptive debug. Altera’s Chris Balough describes EDS as “a native multicore debug environ- ment with intrinsic FPGA debug capabilities built in.” EDS Chris A. Ciufo is editor-in-chief for embedded supports real-time, in-system, whole-chip debug and visual- content at Extension Media, which includes the EE- ization, including the ARM Cortex-A53 cores. That’s because Catalog print and digital publications and website, EDS includes Altera’s existing DS-5 Altera Edition software, Embedded Intel® Solutions, and other related blogs ARM’s own development tool that’s been tailored for Altera and embedded channels. He has 29 years of em- FPGA devices. bedded technology experience, and has degrees in electrical engineering, and in materials science, emphasizing solid Both the OpenCL SDK and EDS software suites exist today state physics. He can be reached at [email protected]. for production devices, and will soon be released supporting the Stratix 10 so lead customers and early adopters can start their designs.

www.eecatalog.com/fpga 25 Microsemi Corporation

IGLOO2 FPGA Technical Specs The IGLOO®2 family is the industry’s lowest power, most reliable and highest security programmable logic solu- ◆ High-Performance FPGA tion. This next generation IGLOO2 architecture offers • Efficient 4-Input LUTs with Carry Chains for High-

up to 3.6X gate count implemented with 4-input look-up Performance and Low Power Chips FPGA table (LUT) fabric with carry chains, giving 2X perfor- • Up to 236 Blocks of Dual-Port 18 Kbit SRAM mance, and includes multiple embedded memory options • Up to 240 Blocks of Three-Port 1 Kbit SRAM and mathblocks for digital signal processing (DSP). High • High-Performance DSP Signal Processing ◆ FPGA Chips speed serial interfaces include PCI EXPRESS® (PCIe®), 10 High Speed Serial Interfaces Gbps attachment unit interface (XAUI) / XGMII extended • Up to 16 SERDES Lanes, sublayer (XGXS) plus native serialization/deserializa- • Native SERDES Interface Facilitates Implemen- tion (SERDES) communication, while double data rate 2 tation of Serial RapidIO in Fabric or an SGMII (DDR2)/DDR3 memory controllers provide high speed Interface to a soft Ethernet MAC memory interfaces. • PCIe Endpoint Controller x1, x2, x4 Lane PCI Express Core Features & Benefits ◆ High Speed Memory Interfaces • Up to 2 High Speed DDRx Memory Controllers ◆ Design Security Features (available on all devices) • MSS DDR and Fabric DDR Controllers • Encrypted User Key and Bitstream Loading, • Supports LPDDR/DDR2/DDR3 Enabling Programming in Less-Trusted Locations • Maximum 333 MHz DDR Clock Rate • Supply-Chain Assurance Device Certificate ◆ High-Performance Memory Subsystem • Enhanced Anti-Tamper Features • 64 KB Embedded SRAM (eSRAM) • Zeroization • Up to 512 KB Embedded Nonvolatile Memory ◆ Data Security Features (available on premium devices) (eNVM) • Non-Deterministic Random Bit Generator • DDR Bridge with 64-Bit AXI Interface • User Cryptographic Services (AES-256, SHA-256, ◆ Clocking Resources & Operating Voltage and I/Os ECC Engine) • Clock Sources • User Physically Unclonable Function (PUF) Key • Up to 8 Clock Conditioning Circuits (CCCs) with Up Enrollment and Regeneration to 8 Integrated Analog PLLs • CRI Pass-Through DPA Patent Portfolio License • 1.2 V Core Voltage ◆ Reliability • Multi-Standard User I/Os (MSIO/MSIOD) • Single Event Upset (SEU) Immune • Market Leading Number of User I/Os with 5G • Zero FIT FPGA Configuration Cells SERDES • Single Error Correct Double Error Detect: eSRAMs • NVM Integrity Check at Power-Up and On-Demand Application areas • No External Configuration Memory Required • Instant- On, Retains Configuration When Powered Off Aerospace/Defense, Automotive, Data Processing and ◆ Low Power Storage, Industrial Automation, Medical Imaging, Wired • Low Static and Dynamic Power Communcations, Wireless Communications • Flash Freeze Mode for Fabric • Power as low as 13 mW/Gbps per lane for SERDES devices • Up to 25% lower total power than competing devices

Contact Information

Microsemi Corporation One Enterprise Aliso Viejo, CA 92656 USA (800) 713-4113 (800) 713-4113 [email protected] www.microsemi.com

26 • FPGAs & PLDs Engineers’ Guide to FPGA and PLD Solutions 2014 Microsemi Corporation

SmartFusion2 System-on-Chip FPGAs

Microsemi’s SmartFusion®2 SoC FPGAs integrate fourth • Up to 236 Blocks of Dual-Port 18 Kbit SRAM with generation flash-based FPGA fabric, an ARM® Cortex™-M3 400 MHz Synchronous Performance processor, and high-performance communications inter- • Up to 240 Blocks

faces on a single chip. The SmartFusion2 family is the • High-Performance DSP Signal Processing Chips FPGA industry’s lowest power, most reliable and highest security ◆ Microcontroller Subsystem (MSS) programmable logic solution. SmartFusion2 FPGAs offer up • Hard 166 MHz 32-Bit ARM Cortex-M3 Processor to 3.6X the gate density, up to 2X the performance of previous • 64 KB Embedded SRAM

FPGA Chips flash-based FPGA families, and includes multiple memory • Up to 512 KB eNVM blocks and multiply accumulate blocks for DSP processing. • Triple Speed Ethernet MAC The 166 MHz ARM Cortex-M3 processor is enhanced with an • USB 2.0 High Speed On-The-Go Controller embedded trace macrocell (ETM), memory protection unit • CAN Controller, 2.0B Compliant (MPU), 8 Kbyte instruction cache, and additional peripherals, • 2x: SPI, I2C, Multi-Mode UARTs Peripherals including controller area network (CAN), Gigabit Ethernet, ◆ High Speed Serial Interfaces and high speed universal serial bus (USB). High speed • Up to 16 SERDES Lanes serial interfaces include PCI EXPRESS® (PCIe®), 10 Gbps • Native SERDES Interface Facilitates Implemen- attachment unit interface (XAUI) / XGMII extended sublayer tation of Serial RapidIO in Fabric or an SGMII (XGXS) plus native serialization/deserialization (SERDES) Interface to the Ethernet MAC in MSS communication, while LPPDR, DDR2, & DDR3 memory con- • PCIe Gen2 Endpoint Controller x1, x2, x4 Lane trollers provide high speed memory interfaces. PCI Express Core ◆ High Speed Memory Interfaces Features & Benefits • Up to 2 High Speed DDRx Memory Controllers • MSS DDR and Fabric DDR Controllers ◆ Leadership in Reliable FPGAs • Supports LPDDR/DDR2/DDR3 • SEU immune zero FIT flash FPGA configuration cells • Maximum 333 MHz DDR Clock Rate • SEU Protected Memories: eSRAMs, DDR Bridges , ◆ Clocking Resources & Operating Voltage and I/Os Instruction Cache, Ethernet, CAN and USB Buf- • Clock Sources fers, PCIe, MMUART and SPI FIFOs • Up to 8 Clock Conditioning Circuits (CCCs) with • Hard LPDDR, DDR2 & DDR3 controllers with Up to 8 Integrated Analog PLLs SECDED protection • 1.2 V Core Voltage ◆ Leadership in Low Power FPGAs • Multi-Standard User I/Os (MSIO MSIOD, & DDRIO) • Low Static and Dynamic Power o Flash*Freeze • Market Leading Number of User I/Os with 5G Mode for Fabric SERDES • Power as low as 13 mW/Gbps per lane for SERDES devices • Up to 50% lower total power than competing SoC devices Availability ◆ Leadership in Secure FPGAs • State of the art security enables root-of-trust SmartFusion2 SoC FPGA are available now applications • Radically transforms the usefulness of FPGAs in Application areas security applications ◆ Leadership in Real-Time FPGAs Aerospace/Defense, Automotive, Data Processing and • ARM® Cortex™-M3 real-time microcontroller Storage, Industrial Automation, Medical Imaging, Wired • Flash*Freeze real-time power management Communcations, Wireless Communications • Instant on real-time availability ◆ FPGA Fabric Features • 16x 5Gbps SERDES, PCIe, XAUI / XGXS+ Native SERDES Contact Information • Integrated DSP processing blocks • 150K LUT, 5Mbit SRAM, 4Mbit eNVM Microsemi Corporation One Enterprise Aliso Viejo, CA Technical Specs 92656 USA (800) 713-4113 (800) 713-4113 ◆ High-Performance FPGA [email protected] • Efficient 4-Input LUTs with Carry Chains for High- www.microsemi.com Performance and Low Power

www.eecatalog.com/fpga FPGAs & PLDs • 27 Xilinx, Inc.

Artix-7 FPGAs

The Xilinx Artix®-7 FPGA family redefines cost-sensitive solutions by providing low power and cost while deliv-

ering best-in-class performance. Designed to address Chips FPGA high volume applications that need advanced function- ality, the Artix-7 FPGAs provide the highest transceiver bandwidth, highest digital signal processing bandwidth and highest fabric performance in a 28nm low-end device, FPGA Chips as well as analog-mixed signal integration. Technical Specs

One of three families fabricated on a 28nm process, ◆ Up to sixteen 6.6Gb/s transceivers Artix-7 devices share a common architecture with the rest of the 7 series families, which include the mid-range ◆ Up to 930 GMACs of DSP bandwidth Kintex-7 family for optimal price-performance and the Virtex-7 family for maximum capacity and performance. ◆ 1.25Gb/s LVDS I/O line rates and 1,066Mb/s DDR3 System manufacturers can easily scale successful designs to address adjacent markets requiring increased ◆ Integrated analog mixed signal and low-cost wire performance and capability. bond packaging

The Artix-7 FPGA AC701 Evaluation Kit gets you quickly ◆ 2D eye scan for signal integrity analysis and auto- prototyping for your cost sensitive applications. This adaptive equalization includes all the basic components of hardware, design tools, IP, and pre-verified reference designs. This also features a targeted reference design enabling high- Availability performance serial connectivity and advanced memory interfacing equipped with a full license for the Northwest Visit www.xilinx.com/artix7 Logic DMA engine. Application areas Features & Benefits Aerospace/Defense, Automotive, Consumer, Broadcast, ◆ 50% power reduction versus the previous generation Industrial Automation, Medical Imaging, Wired Com- for portable applications muncations, Wireless Communications

◆ Highest I/O, DSP, memory, and fabric performance in its class

◆ Small footprint to meet SWaP-C requirements

◆ Scalable optimized architecture for rapid design migration

Contact Information

Xilinx, Inc. 2100 Logic Drive San Jose, CA 95124 USA 408-559-7778 Telephone www.xilinx.com

28 • FPGAs & PLDs Engineers’ Guide to FPGA and PLD Solutions 2014 Xilinx, Inc.

Kintex-7 FPGAs

The Xilinx Kintex®-7 FPGA family delivers the most bal- anced price and performance in the industry by providing

the right mix of I/Os, fabric, and embedded resources Chips FPGA while providing cutting-edge transceivers, integrated IP, and the highest signal processing bandwidth in its class. The Kintex-7 FPGA family is the ideal mid-range solu- tion delivering high end functionality in a cost effective Technical Specs FPGA Chips package. ◆ Up to 2,845 GMACs of DSP bandwidth One of three families fabricated on a 28nm process, Kintex-7 devices share a common architecture with the ◆ 12.5Gb/s backplane capable transceivers rest of the 7 series families, which include the low-end Artix-7 family for low power and cost and the Virtex-7 ◆ 1,866Mb/s memory interface performance family for maximum capacity and performance. System manufacturers can easily scale successful designs to ◆ Integrated block for PCI Express Gen2x8 and soft IP address adjacent markets requiring reduced power and for PCI Express Gen 3x8 cost or increased performance and capability. ◆ Integrated analog mixed signal The Kintex-7 FPGA KC705 Evaluation Kit accelerates development and demonstration of radio/baseband, radar, EdgeQAM, triple-rate SDI and other applications Availability for a broad range of markets that demand power-efficient high-speed communications and processing. Visit www.xilinx.com/kintex7

The Kintex-7 FPGA DSP Kit lets developers rapidly Application areas migrate to the 7 series using a platform that fosters innovative and highly differentiated solutions. Designers Aerospace/Defense, Broadcast, Consumer, Data Processing can reduce schedule risk, shorten time to market, and and Storage, Industrial Automation, Medical Imaging, Wired more quickly focus on adding unique value to solutions Communications, Wireless Communcations targeted for wireless communications infrastructure, aerospace and defense, instrumentation, medical imaging, and general-purpose data acquisition.

Features & Benefits

◆ 2X price/ performance and 50% lower power than previous generation

◆ High DSP-to-logic ration for signal processing- intensive applications

◆ High speed, backplane capable transceivers (12.5 Gb/s) in a cost sensitive solution Contact Information ◆ Scalable optimized architecture for rapid design migration Xilinx, Inc. 2100 Logic Drive San Jose, CA 95124 USA 408-559-7778 Telephone www.xilinx.com

www.eecatalog.com/fpga FPGAs & PLDs • 29 Xilinx, Inc.

UltraScale Devices

The Xilinx UltraScale™ devices extend the company’s market leading Virtex® and Kintex® FPGA and 3D IC fami-

lies to enable next generation smarter systems with new Chips FPGA high-performance architectural requirements. Technical Specs

The portfolio is based on the Xilinx UltraScale architecture ◆ Up to 4.4M logic cells at 20nm using 2nd generation 3D IC which addresses the industry’s number one bottleneck FPGA Chips at advanced nodes—the interconnect—to improve per- ◆ Up to 8.2 TeraMACs of DSP compute performance formance and device utilization while accelerating design closure. Core technologies include next genera- ◆ 33G transceivers for chip-to-chip and -optics, and tion routing, ASIC-like clocking, and logic infrastructure 28G backplanes; 16G backplane capable transceivers enhancements to eliminate interconnect bottlenecks at ½ the power while supporting consistent device utilization of more than 90% without performance degradation. Based on ◆ Integrated 100G Ethernet MAC and 150G Interlaken cores the ASIC-class advantage of the UltraScale architecture, these families are co-optimized with the Vivado® Design ◆ DDR4 memory support for maximum system bandwidth Suite and leverage the UltraFAST™ design methodology to accelerate time to market. Availability

Kintex UltraScale FPGAs deliver ASIC-class system-level Visit www.xilinx.com/ultrascale performance, clock management and power manage- ment for next generation price-performance-per-watt. Application areas These second generation devices expand the mid-range by delivering the highest throughput with lowest latency Aerospace/Defense, Automotive, Broadcast, Consumer, for medium-to-high volume applications that include: Data Processing and Storage, Industrial Automation, 100G networking, wireless infrastructure, and other DSP- Medical Imaging, Wired Communcations, Wireless intensive applications. Communications

Virtex UltraScale is the industry’s most capable high-end FPGA family, ideal for a wide range of high-performance applications requiring massive bandwidth, including: 400+ Gb/s systems, large-scale emulation and high perfor- mance computing. These devices deliver a step-function in increased bandwidth and reduced latency for systems demanding massive data flow and packet processing.

Features & Benefits

◆ Next generation routing for next generation design complexities

◆ ASIC-like clocking for performance, scalability, and power reduction Contact Information ◆ Enhanced logic cell packing improves performance and utilization while reducing dynamic power Xilinx, Inc. 2100 Logic Drive ◆ Up to 50% lower power vs. previous generation San Jose, CA 95124 USA 408-559-7778 Telephone ◆ Seamless footprint migration between Kintex and www.xilinx.com Virtex UltraScale devices for scalability

30 • FPGAs & PLDs Engineers’ Guide to FPGA and PLD Solutions 2014 Xilinx, Inc.

Virtex-7 FPGAs

The Xilinx Virtex®-7 FPGA family delivers high per- formance and capacity to address complex system requirements for a diverse set of applications that include Chips FPGA Nx100G wired communications, ASIC prototyping, high performance computing, and portable radar. Comprised of 3 different platforms that include the Virtex-7 T, Technical Specs

FPGA Chips Virtex-7 XT and Virtex-7 HT devices, the family delivers unprecedented serial bandwidth and capacity to meet ◆ Up to 2M logic cells the most challenging system integration needs. ◆ Up to 5.3 TMACs of DSP bandwidth One of three families fabricated on a 28nm process, Virtex-7 devices share a common architecture with the ◆ Up to 96 backplane-capable 13.1Gb/s transceivers rest of the 7 series families, which include the low-end Artix-7 family for low power and cost and the mid-range ◆ Up to 16 x 28Gb/s transceivers Kintex-7 family for a balance of price and performance. System manufacturers can easily scale successful ◆ Integrated analog mixed signal designs to address adjacent markets requiring reduced power and cost. Availability

The Virtex-7 FPGA VC707 Evaluation Kit is a full-featured, Visit www.xilinx.com/virtex7 highly-flexible, high-speed serial base platform using the Virtex-7 XC7VX485T-2FFG1761C and includes basic com- Application areas ponents of hardware, design tools, IP, and pre-verified reference designs for system designs that demand high- Aerospace/Defense, Broadcast, Consumer, Data Pro- performance, serial connectivity and advanced memory cessing and Storage, Wired Communcations, Medical interfacing. Imaging

The Virtex-7 FPGA VC709 Connectivity Kit is a 40Gb/s platform for high-bandwidth and high-performance applications containing all the necessary hardware, tools and IP to power quickly through your evaluation and development of connectivity systems.

Features & Benefits

◆ Virtex-7 T features 3D IC to deliver high levels of capacity and performance enabling ASIC prototyp- ing, emulation and ASIC/ASSP replacement

◆ Virtex-7 XT devices offer high processing bandwidth with high performance transceivers, DSP and BRAM

◆ Virtex-7 HT devices with integrated 28Gb/s serial transceivers are optimized for Nx100G line cards with Contact Information CFP2 optical interfaces

Xilinx, Inc. ◆ Scalable optimized architecture for rapid design 2100 Logic Drive migration San Jose, CA 95124 USA 408-559-7778 Telephone www.xilinx.com

www.eecatalog.com/fpga FPGAs & PLDs • 31 Xilinx, Inc.

Zynq-7000 All Programmable SoC

The Zynq®-7000 family is the leading All Programmable SoC with hardware, software and I/O programmability. This innovative class of product combines an industry- standard ARM® dual-core Cortex™-A9 MPCore™ SoC processing system with Xilinx 28nm programmable logic

SoC architecture in a single device. This processor-centric architecture delivers a complete embedded processing platform that offers developers ASIC levels of perfor- mance and power consumption, the flexibility of an FPGA and the ease of programmability of a microprocessor.

The devices of the Zynq-7000 All Programmable SoC family allow designers to target cost sensitive as well as high-performance applications from a single platform ◆ Dual 12bit 1Msps Analog-to-Digital converter using industry-standard tools. The tight integration of • Up to 17 Differential Inputs the processing system with programmable logic allows designers to build accelerators and peripherals to speed ◆ Advanced Low Power 28nm Programmable Logic: key functions by up to 10x. ARM architecture and eco- • 28k to 444k Logic Cells (approximately 430k to system maximizes productivity and eases development 6.6M of equivalent ASIC Gates) for software and hardware developers. • 240Kb to 3020Kb of Extensible Block RAM • 80 to 2020 18x25 DSP Slices (58 to 1080 GMACS Unlike ASICs and ASSPs, Zynq-7000 devices allow peak DSP performance) designers to modify their design throughout the devel- opment phase and after the system is in production. In ◆ PCI Express® Gen2x8 (in some devices) addition, the Zynq-7000 All Programmable SoC family, with over 3000 interconnections between its processing system ◆ 154 to 454 User IOs (Multiplexed + SelectIO™) and the programmable logic, offers levels of performance that two-chip solutions (ASSP+FPGA) cannot match due to ◆ 4 to 16 Transceivers (up to 12.5Gb/s, depending on limited I/O bandwidth and limited power budgets. device)

Features & Benefits AVAILABILTY

◆ Dual ARM® Cortex™-A9 MPCore™ Visit www.xilinx.com/zynq • Up to 1GHz performance • Enhanced with NEON Extension and Single & Application areas Double Precision Floating point unit • 32Kb Instruction & 32Kb Data L1 Cache Automotive, Broadcast, Medical Imaging, Industrial, Aerospace & Defense, Wired and Wireless Communi- ◆ Unified 512kB L2 Cache 256Kb on-chip Memory cations, Consumer

◆ DDR3, DDR3L, DDR2 and LPDDR2 Dynamic Memory Controller

◆ 2x QSPI, NAND Flash and NOR Flash Memory Controller Contact Information

◆ 2x USB2.0 (OTG), 2x GbE, 2x CAN2,0B 2x SD/SDIO, Xilinx, Inc. 2x UART, 2x SPI, 2x I2C, 4x 32b GPIO 2100 Logic Drive San Jose, CA 95124 ◆ AES & SHA 256b encryption engine for secure boot USA 408-559-7778 Telephone and secure configuration, and RSA-2048 for asym- www.xilinx.com metric authentication of boot loaders, Programmable Logic bitstream, and runtime Software

32 • FPGAs & PLDs Engineers’ Guide to FPGA and PLD Solutions 2014 Xilinx, Inc.

Artix-7 FPGA AC701 Evaluation Kit

The Artix®-7 FPGA AC701 Evaluation Kit features the leading system performance per watt Artix-7 family to get you quickly prototyping for your cost sensitive applications. This includes all the basic components of hardware, design tools, IP, and pre-verified refer- ence designs. This also features a targeted reference design enabling high-performance serial connectivity and advanced memory interfacing equipped with a full license for the Northwest Logic DMA engine.

Features & Benefits Availability ◆ Increased system performance featuring the Artix-7 XC7A200T FPGA: up to sixteen 6.6G GTs, Visit www.xilinx.com/ac701 930 GMAC/s, 13Mb BRAM, 1.2Gb/s LVDS, and DDR3-1066 Application areas Design Kits

◆ Enabling programmable system integration: Up to Industrial Automation, Wireless Communications, Software 215K logic cells; AMS integration, and AXI IP Acceleration, Linux/Android/RTOS development, embedded

Design KitsDesign ARM processing ◆ Small packaging and lower BOM costs

◆ 65% lower static and 50% lower power than previ- ous generation devices

◆ Scalable, optimized architecture, comprehensive design tools, and IP for design productivity

Technical Specs

◆ AC701 evaluation board featuring the XC7A200T- 2FBG676C FPGA

◆ Targeted Reference Design featuring DDR3, PCIe® and DMA Including a full license for the Northwest Logic DMA

◆ AMS 101 evaluation card

◆ Full seat Vivado® Design Suite: Design Edition - Device-Locked, Node-Locked to the Artix-7 XC7A200T FPGA

◆ Documentation including a step-by-step Getting Contact Information Started Guide

Xilinx, Inc. 2100 Logic Drive San Jose, CA 95124 USA (408) 559-7778 Telephone www.xilinx.com

www.eecatalog.com/fpga Boards & Kits • 33 Xilinx, Inc.

Kintex-7 FPGA KC705 Evaluation Kit

The Xilinx Kintex®-7 FPGA KC705 Evaluation Kit acceler- ates development and demonstration of radio/baseband, radar, EdgeQAM, triple-rate SDI, medical imaging and other applications for a broad range of markets that demand power-efficient high-speed communications and processing. In order to help get you to market quicker Xilinx has designed in the ideal feature set including on-board PCI Express, Analog Mixed Signal (AMS), HDMI video output, and FPGA mezzanine card (FMC) connectors.

This kit helps designers simultaneously deal with decreasing design cycles, increasing design complexity, Availability and project scope. Designers can boost productivity and accelerate access to the advanced functionality of Xilinx Visit www.xilinx.com/kc705 7 series FPGAs with a full-featured Kintex-7 evaluation board, Xilinx Vivado® Design Suite, and pre-verified ref- Application areas Design Kits erence designs. Aerospace/Defense, Medical Imaging, Features & Benefits Wireless Communications Design KitsDesign ◆ Kintex-7 FPGA family and 28nm leadership drive up performance while slashing price and power

◆ Integrated evaluation kit boosts developer, productiv- ity with a combination of silicon, software, IP and reference designs

◆ Provides scalability and migration for maximum design reuse and immediate start on Kintex-7 designs

Technical Specs

◆ KC705 Base Board with a XC7K325T-FF900C FPGA

◆ Full seat of Vivado Design Suite: Design Edition- Device-locked to the Kintex-7 XC7K325T FPGA

◆ Reference designs and demonstrations

◆ Board Design Files and Documentation including a step-by-step Getting Started Guide

◆ USB Cables, Ethernet Cable, universal power supply, Contact Information and AMS 101 Evaluation Card

Xilinx, Inc. 2100 Logic Drive San Jose, CA 95124 USA 408-559-7778 Telephone www.xilinx.com

34 • Boards & Kits Engineers’ Guide to FPGA & PLD Solutions 2014 Xilinx, Inc.

Virtex-7 FPGA VC707 Evaluation Kit

The Xilinx Virtex®-7 VC707 FPGA Evaluation Kit pro- vides a highly flexible base platform with an out of the box launch pad to evaluate and leverage designs that deliver breakthrough performance, capacity and power efficiency.

This kit provides a flexible environment for designs that need to implement a DDR3 memory interface, 10Gigabit Ethernet, PCI Express®, Analog Mixed Signal (AMS) capabilities, and other high-speed serial connectivity. Based on the Virtex-7 VX485T-2 FPGA, the kit is the optimal choice for advanced systems Technical Specs that need the highest performance and highest band- width connectivity, including advanced systems for ◆ VC707 Base Board with Virtex-7 XC7VX485T- wired and wireless communications, Aerospace and 2FFG1761C FPGA Defense, medical, and broadcasting markets. Design Kits ◆ Full seat of Vivado® Design Suite: Design Edi- The highly flexible kit combines fully integrated hard- tion. Device-Locked, Node Locked to the Virtex-7 ware, software and IP with pre-verified reference designs XC7VX485T FPGA

Design KitsDesign that maximize productivity and let designers immedi- ately focus on their unique project requirements. ◆ Reference designs and demonstrations (please see Xilinx.com for latest designs) and board Features & Benefits design files

◆ Virtex-7 FPGA family and 28nm leadership offer- ◆ Documentation including a step-by-step Getting ing new benchmarks for performance and 50% Started Guide power savings ◆ USB Cables, Ethernet Cable, and universal power ◆ Faster start-up with integrated silicon, software, supply IP and complete documentation Availability ◆ Rapid evaluations with Base Reference Designs and other pre-verified examples that exercise Visit www.xilinx.com/vc707 device and board features Application areas ◆ Convenient, easy-to-use GUI displays combine results from different implementations Wired, Wireless, Aerospace and Defense, Medical, Broadcasting

Contact Information

Xilinx, Inc. 2100 Logic Drive San Jose, CA 95124 USA (408) 559-7778 Telephone www.xilinx.com

www.eecatalog.com/fpga Boards & Kits • 35 Xilinx, Inc.

ZedBoard

ZedBoard is a low-cost development board for the Xilinx Zynq-7000® All Programmable SoC. This board contains everything necessary to create a Linux, Android, Windows® or other OS/RTOS-based design. Additionally, several expansion connectors expose the processing system and programmable logic I/ Os for easy user access. Take advantage of the Zynq- 7000 AP SoC tightly coupled with ARM® processing system and 7 series programmable logic to create unique and powerful designs with ZedBoard.

Features & Benefits

◆ Low-cost development board for embedded processing applications Design Kits

◆ Academic and commercial versions available Availability

◆ Online support community Visit www.zedboard.com Design KitsDesign ◆ Best-in-class tools, operating system support, and Application areas ecosystem (including ARM community) Industrial Automation, Wireless Communications, Software Technical Specs Acceleration, Linux/Android/RTOS development, embedded ARM processing ◆ Avnet ZedBoard 7020 baseboard with XC7020- CLG484-1 device

◆ Memory: 512MB DDR3, 256MB Quad-SPI Flash and 4GB SD Card

◆ Onboard USB-JTAG programming, 10/100/1000 Ethernet, USB OTG 2.0 and USB-UART, PS & PL I/O expansion (FMC, Pmod™ Compatible, XADC)

◆ Multiple displays (1080p HDMI, 8-bit VGA, 128 x 32 OLED) and I 2 S Audio CODEC

◆ Vivado® Design Suite: Design Edition - Device Locked, Node Locked License

Contact Information

Xilinx, Inc. 2100 Logic Drive San Jose, CA 95124 USA (408) 559-7778 Telephone www.xilinx.com

36 • Boards & Kits Engineers’ Guide to FPGA & PLD Solutions 2014 Xilinx, Inc.

Zynq-7000 All Programmable SoC ZC702 & ZC706 Evaluation Kits

Xilinx and Xilinx Partners provide a comprehensive offering of embedded processing development kits. These kits feature the Xilinx Zynq®-7000 All Pro- grammable SoC with ARM® dual-core Cortex™-A9 + 28 nm programmable logic and provide out-of-the box design solutions that significantly cut embedded development time and enhance productivity.

The Xilinx ZC702 features a 1080p60 real-time pro- cessing video stream design example. The Xilinx ZC706 kit features include a reference design dem- onstrating high speed multichannel DMA interfacing ◆ Industry standard processor, peripherals and flexible to PCIe Endpoint (x4 GEN2),Video DMA (VDMA) and accelerators; mix of parallel and serial I/Os support- Sobel filtering with a HDMI based display controller ing wide range of interface standards; ecosystem of and GTX serial transceivers. design tools, operating systems and IP Design Kits

The Avnet Software Defined Radio Kits combine Technical Specs Xilinx Zynq AP SoC with the latest generation of

Design KitsDesign Analog Devices high-speed data converters and ◆ Xilinx ZC702 Evaluation kit for basic embedded frequency-agile RF components. These kits target processing development and the Xilinx ZC706 for wireless communications from baseband to RF. PCIe and Transceiver based embedded development

The Xilinx ZVIK kit and the Omnitek OZ745 kits are ◆ Avnet Zynq SDR kits for wireless communication video development platforms for rapid development systems development of video and image processing designs. ◆ Xilinx ZVIK kit or the Omnitek OZ745 kit for video, Zedboard.org offers a number of low cost Zynq AP imaging, broadcast or surveillance video development SoC solutions including ZedBoard and MicroZed. These are a community-based Zynq-7000 SoC devel- ◆ Zedboard, MicroZed or ZyBo for low cost evalua- opment kits. Digilent also offers ZyBo, a low cost tion, development and learning platforms Zynq kit targeted at students, academics and other who want a low cost entry point to learn about the Availability Xilinx Zynq AP SoC products and the Vivado develop- ment tools. Visit www.xilinx.com/kits

Features & Benefits

◆ Providing customizable, flexible SoC to meet exact project needs

◆ Tightly integrated processor and programmable logic and high throughput standard interconnects Contact Information ◆ Integration of multiple system components into a single device and fewer NRE costs through pro- Xilinx, Inc. grammability and standard interconnects support 2100 Logic Drive San Jose, CA 95124 ◆ Reduced components and interconnection counts, USA (408) 559-7778 Telephone partial reconfiguration ability and programmable www.xilinx.com processor speed

www.eecatalog.com/fpga Boards & Kits • 37 Elma Electronic Inc.

3U VPX Front End FPGA Processor Single Board Computer with Enhanced Graphics (6U board also available)

Designed for applications requiring a very high level of computing power in a compact 3U form factor, the TIC- FEP-VPX3c board offers the highest bandwidth with the lowest power consumption.

The TIC-FEP-VPX3c and the other building blocks of our 3U OpenVPX product ranges (SBCs, Ethernet switches & routers, I/O boards & FMCs) are ideal platforms for customers who want to streamline signal processing ref- erence development.

The Virtex-7 FPGA interfaces with flash for local storage of up to 3 bitstreams. It supports secured bitstreams and partial reconfiguration, further improving the versatility Boards FPGA of the FPGA. A Clock tree allows synchronization from a unique clock selected between backplane REF_CLK+/-, FMC interfaces from an FMC or from a local oscillator. The fabric links of ◆ 8 GTX/GTH (FPGA dependent) the VPX backplane are connected to the FPGA GTX/ GTH ◆ 80 differential pairs FPGA Boards transceivers, allowing data rate of up to 13.1 Gbps. ◆ 4 reference clocks

The FMC site of the TIC-FEP-VPX3c is compliant with the Miscellaneous FPGA Mezzanine Card standard VITA 57.1 , allowing the ◆ PIC microcontroller for System Management installation of FMC modules. (per VITA 46.11) ◆ 8 LEDs As on all of our FMC sites, an optional I/O connector can ◆ 8 configuration switches route sixteen additional differential pairs from the FMC module directly to the P2 VPX connector. Accessories ◆ Engineering kit: JTAG ports for direct FPGA configuration Due to the modularity of the FPGA, the TIC-FEP-VPX3c is com- ◆ Rear Transition Module pliant with several Module Profiles of the OpenVPX standard. The TIC-FEP-VPX3c is a 3U VPX board is available in Features air-cooled (1“)and conduction cooled (0.8”) versions com- pliant with VITA 47 classes. Processing Unit ◆ Xilinx Virtex-7 XC7VX690T (other versions on demand) Please visit our website for details on the 6U VPX FPGA Board. ◆ Two banks of DDR3: 64-bit wide, 2GB each ◆ Optional QDRII+ 450MHz, 36-bit wide I up to 36 Mbit ◆ 28 MBytes of BPI NOR flash (bitstream storage)

VPX Interfaces ◆ Four 4-lanes fabric ports on P1 ◆ 16 GTX/GTH Fat Pipes A, B, C & D (FPGA dependent) Contact Information ◆ General purpose 1/0 on P2 ◆ 16 differential pairs from FPGA ◆ 16 differential pairs from FMC 1/0 connector Elma Electronic Inc. 510.656.3400 Telephone [email protected] www.elma.com

38 • Boards & Kits Engineers’ Guide to FPGA & PLD Solutions 2014 Pentek, Inc.

Model 52730: 1 GHz A/D, 1 GHz D/A with Virtex-7 FPGA

Compatible Architectures: OPENVPX

The 52730 factory-installed functions include an A/D acquisition and a D/A waveform playback IP module for simplifying data capture and data transfer. IP modules for DDR3 SDRAM memories, a controller for all data clocking and synchronization functions, a test signal generator and a PCIe interface complete the factory-installed functions and enable the 52730 to operate as a turnkey solution, without the need to develop any FPGA IP.

The Virtex-7 FPGA site can be populated with one of two FPGAs to match the specific requirements of the pro-

cessing task. Supported FPGAs are VX330T or VX690T. Boards FPGA Available in COTS and rugged VXS, PCIe, XMC, cPCI and AMC The 52730 includes GateXpress, a sophisticated FPGA- PCIe configuration manager for loading and reloading the FPGA. At power up, GateXpress immediately pres- ◆ Custom I/O Option -104: Provides 20 pairs of LVDS

FPGA Boards ents a PCIe target for the host computer to discover, connections between the FPGA and the VPX P2 con- effectively giving the FPGA time to load from FLASH. nector for custom I/O Option -105: Provides one 8X or This is especially important for larger FPGAs where the two 4X gigabit links between the FPGA and the VPX loading times can exceed the PCIe discovery window, P1 connector to support serial protocols typically 100 msec on most PCs. ◆ Memory Type: DDR3 SDRAM Size: Four banks, 1 GB each Speed: 800 MHz (1600 MHz DDR) Features & Benefits Application areas ◆ Supports the latest family of Xilinx Virtex-7 VXT FPGAs Radar, Communications, Cellular, Direction-Finding, ◆ Pentek’s GateXpress supports dynamic FPGA Beamforming reconfiguration across PCIe ◆ One 1 GHz 12-bit A/D, One 1 GHz 16-bit D/A and 4 GB AVAILABILITY of DDR3 SDRAM ◆ PCI Express (Gen. 1 and 2) interface up to x4 Contact Pentek for price and availability at ◆ Compatible with several VITA standards including: http://www.pentek.com/go/em52730 or 201-818-5900. VITA-46, VITA-48 and VITA-65 (OpenVPX System Specification)

Technical specs

◆ A/D Converter Type: Texas Instruments ADS5400 Sampling Rate: 100 MHz to 1 GHz Resolution: 12 bits ◆ D/A Converter Type: Texas Instruments DAC5681Z Input Data Rate: 1 GHz max. Interpolation Filter: Contact Information bypass, 2x or 4x Output Sampling Rate: 1 GHz max. Resolution: 16 bits Pentek, Inc, One Park Way ◆ Field Programmable Gate Array Standard: Xilinx Upper Saddle River, New Jersey Virtex-7 XC7VX330T-2 Optional: Xilinx Virtex-7 07458, USA XC7VX690T-2 201-818-5900 201-818-5904 [email protected] www.pentek.com

www.eecatalog.com/fpga Boards & Kits • 39 Xilinx, Inc.

Vivado Design Suite

Programmable devices are at the heart of most sys- tems today, enabling not only programmable logic design, but programmable systems integration. Xilinx has transformed from an FPGA company to an ‘All Programmable’ company, offering technology from logic and IO to SW programmable ARM® processing systems and beyond. THEC NICAL SPECS With the next decade of programmable platforms, comes the next generation design environment that ◆ Vivado Design Suite: WebPACK Edition - The No meets the aggressive pace and the need for enhanced cost, device-limited version of the Vivado Design productivity. Xilinx introduces the Vivado® Design Suite: Design Edition Suite, an IP and system-centric design environment built from the ground up to accelerate productivity ◆ Vivado Design Suite: Design Edition - Simplified for the next generation of ‘All Programmable’ devices. design integration and implementation. The Design The new Vivado Design Suite is already proven to Edition enables new levels of flexibility for designing, accelerate integration and implementation by 4x over integrating, implementing and reusing modules. traditional design flows, reducing cost by simpli- fying design and automating, not dictating, a flexible ◆ Vivado Design Suite: System Edition – The same design environment. front-to-back support as Design Edition plus Vivado High-Level Synthesis and System Generator for DSP Vivado Design Suite provides a highly integrated design environment with a completely new generation of system-to-IC level tools, all built on the backbone AVAILABILITY of a shared scalable data model and a common debug environment. It is also an open environment based Download today www.xilinx.com/download on industry standards such as AMBA® AXI4 inter-

connect, IP-XACT IP packaging metadata, the Tool Design Tools Command Language (Tcl), Synopsys® Design Con- straints (SDC) and others that facilitates customized design flows.

Design Tools Design Vivado was architected to enable the combination of all types of programmable technologies and scale up to 100M ASIC equivalent gate designs.

Features & Benefits

◆ Next generation of system-to-IC level tools, built on the backbone of a shared scalable data model and a common debug environment

◆ 4x productivity advantage drives beyond program- mable logic to programmable systems integration Contact Information

◆ All Programmable’ device support including 3D Xilinx, Inc. stacked silicon interconnect technology, ARM pro- 2100 Logic Drive cessing systems and Analog Mixed Signal (AMS) San Jose, CA 95124 USA 408-559-7778 Telephone www.xilinx.com

40 • Development Tools Engineers’ Guide to FPGA & PLD Solutions 2014 NEW_2014_FPGA.pdf 1 28/11/2013 14:30

C

M

Y

CM

MY

CY

CMY

K Untitled-1 copy.pdf 1 12/6/13 3:57 PM

Delivering A Generation Ahead

Leading 28nm FPGA, SoC, and 3D ICs Now Complemented by 20nm UltraScale™

Architecture

C

M

Y

CM

MY

CY

CMY

K

www.xilinx.com