
Nexperia PNX1500 family of connected media processors

High-performance multimedia engines for next-generation streaming devices

Exceptional performance handling a broad range of popular digital media formats makes the highly integrated NXP Nexperia PNX1500 family of media processors an ideal choice for connected, multimedia consumer devices.

Real-time multimedia processing and extensive connectivity make every PNX1500 processor an ideal single-chip solution for an ever-increasing variety of standalone and networked multimedia products. The PNX1500 family builds on the feature set of previous Nexperia media processors, with faster clock speeds, a TFT LCD controller and an Ethernet 10/100 MAC to support new media formats and advanced product configurations with fewer external components.

PNX1500 media processors are supported by a comprehensive software development environment enabling application development entirely in C and C++. An extensive collection of application libraries, available from NXP and third parties, improves time-to-market, reduces design cycles, and lowers product development costs.

Key features
• Real-time, single-chip media processing
• Handles popular video, audio, graphics and communication standards such as
  - MPEG-2, MPEG-4, H.263 en/decode
  - H.264, DV, and DivX-5 decode
  - MP3 and AAC encode/decode
  - TCP/IP, Ethernet, and UPnP
• Innovative 32-bit TriMedia CPU with powerful multimedia and floating point instructions
• On-chip, independent, DMA-driven I/O and coprocessing units
• Video output up to W-XGA TFT LCD (1344 x 768 60p) and up to HD (1920 x 1080 60i)
• Image scaling, advanced de-interlacing
• 2D engine accelerates complex graphics for real-time overlay
• Full DVD playback
• Supports up to 256-MB DDR SDRAM memory system (16- or 32-bit wide data) at rates up to 400 MHz (1.6 GB/s)
• Comprehensive software development tools and application libraries enable development entirely in C/C++

Key applications
• Personal video recorders
• Connected DVD players
• Wireless LAN devices and advanced home gateways
• IP set-top boxes
• Smart display pads
• Personal media players
• Videoconferencing devices
Architectural overview
Every PNX1500 leverages a powerful, programmable CPU that runs a small real-time operating system for efficient and predictable response to real-time events. Independent, on-chip, bus-mastering DMA units capture and format datastream I/O and accelerate processing of multimedia algorithms. A sophisticated memory hierarchy manages internal I/O and streamlines access to external memory. The result is a low-cost, programmable media processor proven in standalone and hosted multimedia products.

[Figure: PNX1500 block diagram. A main memory interface (to DDR SDRAM) and the internal bus connect the following units: video in (CCIR656, HD, or data) and fast generic parallel in, sharing an input router; QVCP/LCD out (CCIR656, VGA, LCD, data) and fast generic parallel out, sharing an output router; audio in and audio out (I2S audio); SPDIF in and SPDIF out (SPDIF audio); 10/100 MAC (to Ethernet PHY); TriMedia SW debug (JTAG); video scaler and de-interlacer; boot, reset, clocks (27 MHz XTAL); misc I/O, timers, counters, and semaphores (I2C and GPIO); 2D drawing engine; VLD coprocessor; DVD descrambler; TM3260 CPU with instruction cache and data cache; PCI interface (to PCI/XIO bus).]
On a single chip, a PNX1500 accelerates processing of audio, video, graphics, control, and communications datastreams.

C/C++-programmable VLIW CPU
At the heart of all PNX1500 processors is a TriMedia TM3260 CPU core delivering top performance through an elegant implementation of a fine-grain parallel very-long instruction word (VLIW) architecture. Five issue slots enable up to five simultaneous RISC-like operations to be scheduled into only one VLIW instruction. These operations can simultaneously target any five of the CPU's 31 pipelined functional units within one clock cycle.

In addition to a full complement of traditional 32-bit integer and IEEE-754 compliant floating-point microprocessor operations, the TriMedia TM3260 instruction set includes an extensive set of multimedia operations and single instruction multiple data (SIMD)-style 'special' operations (ops) for dual 16-bit or quad 8-bit packed data. By combining multiple simple operations, a single special op can implement up to 12 traditional microprocessor operations. In this way, up to 40 traditional operations can be executed in a single VLIW instruction. When incorporated into application source code, special ops dramatically improve performance and increase the efficiency of a PNX1500 CPU's parallel architecture.

[Figure: UME8UU, sum of absolute values of unsigned 8-bit differences. Source register 1 (bits 31..0) holds bytes A B C D; source register 2 (bits 31..0) holds bytes E F G H. The DSPALU functional unit writes |A - E| + |B - F| + |C - G| + |D - H| to the destination result register (bits 31..0).]
A PNX1500 CPU's special 'ops' dramatically improve performance and increase the efficiency of its parallel architecture. The ume8uu operation, commonly used for motion estimation in video compression, implements 11 simple operations in one TriMedia special op.

On-chip I/O and coprocessing units

Video input processor (VIP)
The VIP unit captures and processes digital video for use by on-chip units. It accepts up to 10-bit parallel YUV 4:2:2 digital video from any device or component that outputs a CCIR656-compliant stream or a YUV stream with separate H and V syncs. During capture of a continuous video stream, the VIP unit can crop, horizontally down-scale, or convert the YUV video to one of many standard pixel formats as needed before writing data to memory. When streaming video from TV broadcasts, it can also capture raw VBI data into a separate window in memory. This unit shares its pin interface with a fast generic parallel input unit through an input router.

Fast generic parallel input (FGPI)
The FGPI unit captures unstructured, infinite parallel datastreams, messages, or control signals: any datastream with no YUV processing requirements. When raw mode is enabled, an 8-, 16-, or 32-bit parallel datastream is captured continuously and double buffered into memory. For example, the FGPI unit can receive an ATSC transport stream from an external channel decoder or a second 656 ITU input.

Video scaler and de-interlacer
A versatile, programmable memory-based scaler unit applies a wide variety of image size, color, and format manipulations to improve video quality and prepare it for display. It handles de-interlacing (with optional edge detection/correction), H and V scaling, linear and non-linear aspect ratio conversion, anti-flicker filtering, pixel format conversions, and more.

Quality video composition processor (QVCP)
The QVCP unit composites two planes of display data from different sources before output. It supports either two video planes or one video plane and one graphics plane, such as video from DVD playback and graphics from a web browser. Working with on-chip 2D graphics and memory-based scaler units, QVCP enables PNX1500 processors to support many high-speed multimedia applications with few external components.

In addition to two-layer video compositing, the QVCP unit integrates scaling, a TFT LCD controller and a long list of video quality enhancements including deindexing or gamma equalization, contrast and brightness control, luminance sharpening, horizontal dynamic peaking, skin tone correction, dithering, and generation of screen timing required by the target display.

The QVCP unit outputs the resulting video datastream to any of a wide variety of off-chip video subsystems supporting CCIR656, YUV, or RGB formats, progressive or interlaced scan modes, and resolutions up to W-XGA TFT LCD (1344 x 768 60p) or SD/HD video (up to 1920 x 1080 60i). The QVCP unit shares its pin interface with a fast generic parallel output unit through an output data router.
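To put the two maximum output resolutions side by side, the sketch below computes the active (visible) pixel rate for each. It is an illustration only: the function name is hypothetical, "60i" is assumed to mean 60 fields per second with half the lines in each field, and blanking intervals are ignored, so the real pixel clock required by a display will be higher than these figures.

```c
#include <assert.h>
#include <stdint.h>

/* Active pixels per second for a display mode (blanking excluded).
 * refresh_hz is frames/s for progressive modes and fields/s for
 * interlaced modes; an interlaced field is assumed to carry half
 * the lines of a full frame. Hypothetical helper, not an NXP API. */
uint64_t active_pixel_rate(uint32_t width, uint32_t height,
                           uint32_t refresh_hz, int interlaced)
{
    assert(width > 0 && height > 0 && refresh_hz > 0);
    uint64_t pixels_per_picture = (uint64_t)width * height;
    if (interlaced)
        pixels_per_picture /= 2;   /* one field = half the lines */
    return pixels_per_picture * refresh_hz;
}
```

Under these assumptions, `active_pixel_rate(1344, 768, 60, 0)` gives 61,931,520 pixels/s and `active_pixel_rate(1920, 1080, 60, 1)` gives 62,208,000 pixels/s, so the two modes place a similar load on the output path.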

Fast generic parallel output (FGPO)
The FGPO unit can output any raw datastream with no video post processing requirements, for example, an ATSC bitstream. It can also broadcast unidirectional messages to other NXP media processors.

Audio input (AI) and audio output (AO)
Highly programmable AI and AO units provide all signals needed to read and write digital audio datastreams from/to most high-quality, low-cost serial audio oversampling A/D and D/A converters and codecs. Both units connect to off-chip stereo converters through flexible bit-serial I2S interfaces. Their high level of programmability provides tremendous flexibility in handling custom datastreams, adapting to custom protocols, and upgrading to future audio standards.

The AI unit supports capture of up to eight channels of stereo audio. In raw mode, it captures any quantity of bits from the programmable frame. The AO unit outputs up to eight channels and directly drives up to four external, stereo I2S or similar D/A converters or highly integrated PC codecs. Software support for decode and output of DTS is provided through optional application library modules.

S/PDIF input and output
An S/PDIF input unit connects to external sources of digital audio, such as a DVD player, to receive audio datastreams in a variety of formats, including stereo PCM data, 5.1-channel Dolby AC-3® data (per IEC-1937), and more. An S/PDIF output unit outputs a high-speed serial datastream. Primarily used to transmit digital S/PDIF-formatted audio data to external audio equipment, it can also be used to output two-channel linear PCM audio from an internal audio mix or captured, compressed multi-channel audio streams such as Dolby AC-3, DTS, or AAC (per Project 1937). Software-decoded audio can be mixed with other audio before output.

Both units have independent, programmable sample rates ensuring perfect synchronization to any time reference in the system. Datastream content is software generated and software controlled.

2D drawing engine (2D DE)
An on-chip 2D rendering and DMA engine accelerates high-speed 2D graphics operations including solid fills, lines, three-operand bitblts, and color expansion of monochrome data to any supported pixel format. A full 256-level alpha bitblt blends source and destination images together.

Variable length decoder (VLD)
The VLD coprocessor offloads the CPU during decode or transcode of Huffman-encoded MPEG-2 and MPEG-1 datastreams. It outputs a decoded stream (to memory) optimized for MPEG decompression software.

DVD descrambler
An on-chip DVD descrambler unit handles DVD authentication and descrambling tasks, enabling PNX1500 to integrate complete DVD datastream playback. An IDE DVD drive can be attached directly to the PCI/XIO interface.

[Figure: two system configurations. Standalone: a Nexperia PNX1500 with DDR SDRAM drives a PCI/XIO bus that connects IDE, Flash, and PCI agents. Hosted: a host CPU with a PCI bridge shares a PCI/XIO bus with a Nexperia PNX1500 (with DDR SDRAM), Flash/IDE, and PCI agents.]
PNX1500 processors are designed for use as a main processor in standalone systems and as a coprocessor in a hosted or multiprocessor environment.

Memory system
Main memory is coupled to substantial on-chip caches through a glueless main memory interface and internal bus system.

Glueless main memory interface (MMI)
The MMI acts as the main memory controller and programmable central arbiter, allocating memory bandwidth for on-chip unit activities. MMI provides a 16- or 32-bit DDR SDRAM interface. The 32-bit interface is equivalent to a 64-bit SDR SDRAM interface running at 200 MHz, resulting in theoretical maximum bandwidth of up to 1.6 GB/s. Programmable memory timing parameters enable the MMI memory controller to support most DDR SDRAM devices. Memory clock speed is programmable and independent of the PNX1500 CPU clock, eliminating the top-speed limitations of fixed memory/CPU clock ratios. Flexible memory configurations support memory footprints from 8 to 256 MB, enabling a wide variety of systems to be built.

Dedicated instruction and data cache
All PNX1500 TriMedia CPUs are supported by separate, dedicated on-chip data and instruction caches employing a variety of techniques to improve cache hit ratios and CPU performance. The 16-KB, eight-way set-associative data cache supports dual accesses per cycle. It is non-blocking, so cache misses and CPU cache accesses can be handled simultaneously. Early restart techniques reduce read-miss latency. Background copyback reduces CPU stalls.

A 64-KB, 8-way set-associative instruction cache provides 224 bits of instructions every clock cycle. To reduce internal bus bandwidth requirements, instructions in main memory and cache are compressed.

High-speed internal bus
A PNX1500's CPU and processing units access external memory through an internal bus system comprising separate 64-bit data and 32-bit address buses. Arbitrated by the MMI unit, the internal buses maintain real-time responsiveness in a variety of applications.

Control and connectivity
The PNX1500's versatile interfaces and control options support many advanced product configurations.

I2C interface
An I2C master/slave external interface operates in both standard (100 kHz) and fast (400 kHz) modes. It can connect to an optional EEPROM for boot and can be used to control a variety of different I2C board-level devices.

10/100 Ethernet MAC
An Ethernet MAC (sublayer of the IEEE 802.3 standard) enables an external PHY chip to be attached through a standard media independent interface (MII) or reduced MII interface (RMII). It implements dual-transmit descriptor buffers, supporting both real-time and non-real-time traffic. Quality of service (QoS) is ensured through low- and high-priority transmit queues.

Timers
Four 32-bit general purpose timers support performance analysis, real-time interrupt generation and/or system event counting.

PCI/XIO bus interface
A PCI/XIO interface connects the CPU and on-chip units to a variety of board-level memory components and off-chip devices. It allows simultaneous connection of 32-bit PCI master/slave devices as well as separate address/data-style 8- and 16-bit microprocessor slave peripherals, standard (NOR) or disk-type (NAND) Flash memories, or an IDE disk interface.

TriMedia software debug (TMDBG) unit/JTAG port
Remote debugging of software running on the CPU core can be performed using the TriMedia interactive source debugger. A JTAG port connects a PC (running the debugger) to the PNX1500's TMDBG unit, enabling full support for interactive debugging features. The JTAG port is also used for boundary scan.

General purpose I/O (GPIO) and flexible serial interface
Sixteen dedicated GPIO pins support software I/O, external interrupt input, universal remote control (RC) blaster transmission, and signal sampling and pattern generation for emulating high-speed serial protocols.

IR remote control receive and transmit
PNX1500s use the GPIO pin event sequence timestamping mechanism and software event interpretation to execute RC commands. This approach supports a wide variety of RC protocols including RC-5, RC-6, and RC-MM.

Dynamic power management
PNX1500 processors enable devices to conserve power by tailoring frequency to application requirements. Their software-programmable clocks enable the CPU to run at lower speeds, reducing power consumption during less cycle-consuming tasks. For example, decoding an MP3 audio stream requires less than 30 MHz of CPU cycles. Power is conserved by adjusting the clock speed on the fly to service this lower cycle requirement. The CPU clock can be shut down any time the CPU is idle. In addition, each co-processing unit can be powered down by removing the clock, thus conserving additional power when the unit is not being used.

TriMedia application libraries
Many application libraries are available from NXP and third-party suppliers. These C-callable routines are optimized for top performance on the TriMedia CPU and include modules for functions such as:
• MPEG-1 encode/decode
• MPEG-2 encode/decode
• MPEG-4 encode/decode
• H.264 decode
• DV decode
• DivX decode
• H.32x encode/decode
• H.263 encode/decode
• Dolby AC-3 decode
• MP3 encode/decode
• AAC encode/decode
• TCP/IP, Ethernet, Universal PnP protocols
• and more.

Robust software development environment
PNX1500 processors are supported by a full suite of system software tools to compile and debug code, analyze and optimize performance, and simulate execution of the TriMedia CPU core. This comprehensive software development environment dramatically lowers application development costs and reduces time-to-market by enabling development of multimedia applications entirely in the C and C++ programming languages.

Nexperia PNX1500 processors preserve investments in software development through compatibility between PNX1500 family members at the source code level. Powerful, optimizing compilers ensure that programmers never need to resort to non-portable assembler programming. As evolutionary hardware and software enhancements are incorporated into newer PNX processors, increased performance is often achieved by recompiling application software.

[Figure: TriMedia compilation flow. ANSI C/C++ application and device libraries / special ops feed the TriMedia compilation system and its performance optimization tools, which produce an optimized executable application that runs on the cycle-accurate simulator, under the debugger, or on a Nexperia PNX1500.]
Unique to the TriMedia core's VLIW implementation, parallelism is optimized at compile time by an innovative compilation system.
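As an illustration of the kind of C kernel that a special op can stand in for, the sketch below models the ume8uu operation described in the CPU section in portable C and uses it to compute a sum of absolute differences over one pixel row, as a motion estimator might. This is a reference model only: the function names are hypothetical, and how the TriMedia compilation system actually exposes special ops to C code is not covered by this document.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Portable model of ume8uu as the brochure describes it: treat each
 * 32-bit operand as four unsigned bytes and sum the absolute
 * differences. The operation it models counts as 4 subtractions,
 * 4 absolute values, and 3 additions, i.e. 11 simple operations. */
uint32_t ume8uu_model(uint32_t src1, uint32_t src2)
{
    uint32_t sum = 0;
    for (int shift = 0; shift < 32; shift += 8) {
        int32_t x = (int32_t)((src1 >> shift) & 0xFFu);
        int32_t y = (int32_t)((src2 >> shift) & 0xFFu);
        sum += (uint32_t)(x > y ? x - y : y - x);
    }
    return sum;
}

/* Hypothetical helper: pack four consecutive pixels into one word. */
uint32_t pack4(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Sum of absolute differences over one pixel row, four pixels per
 * step. In this sketch len must be a multiple of 4. */
uint32_t sad_row(const uint8_t *cur, const uint8_t *ref, size_t len)
{
    assert(len % 4 == 0);
    uint32_t sad = 0;
    for (size_t i = 0; i < len; i += 4)
        sad += ume8uu_model(pack4(cur + i), pack4(ref + i));
    return sad;
}
```

Each call to `ume8uu_model` in the inner loop corresponds to work that the brochure says a single special op performs, which is where the claimed efficiency gain comes from.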

Specifications

Physical
  Process/Package: 0.13-μm CMOS, 456 BGA
  Power supply: core 1.3 V; DDR 2.5 V; I/O 3.3 V (5 V tolerant)
  Power consumption: 1.5 W typical at 266 MHz
  Case temperature: PNX150x 0 to 85 °C; PNX1520 -40 to 85 °C

Central processing unit
  Type: TriMedia TM3260
  Clock speed: 240, 266, and 300 MHz
  Issue slots: 5
  Address space: 32-bit, linear
  Instruction set: Arithmetic and logical, load/store, special multimedia and DSP, IEEE-754 compliant floating pt.
  Data types: Boolean, 8-, 16- and 32-bit signed and unsigned integer, 32-bit IEEE floats
  Functional units: 31 pipelined: integer and floating-point arithmetic units, data-parallel DSP-like units
  Registers: 128 fully general purpose, 32 bits wide, non-banked
  Interrupts: 64 auto-vectoring, with 8 programmable priority levels
  Byte order: Big or little endian

Caches
  Access: data 8-, 16-, or 32-bit words; instruction 64 bytes
  Associativity: 8-way set-associative with hierarchical LRU replacement
  Block size: 64 bytes
  Size: data 16 KB; instruction 64 KB

Video input processor unit (VIP)
  External interface: 38 pins: 32 data, 2 clock, and 2 validity signals
  Formats: CCIR 601/656: 10-bit video (up to 40.5 Mpix/sec); HD video (using 20-bit YUV input mode)
  Clock rate: Up to 81 MHz pixel clock
  Functions: Programmable on-the-fly horizontal scaling
  VBI formats: Closed Captioning, Teletext, NABST, CGMS, and WSS

FGPI and FGPO
  I/O data rate: Up to 100 MHz for 8-, 16- or 32-bit parallel data and messages
  Aggregate input bandwidth: up to 400 MB/s

Video scaler & de-interlacer unit (MBS)
  Scaling: Simultaneous V and H scaling with linear and non-linear aspect-ratio conversion
  De-interlacing: Simple median, majority-selection (i.e. best of three algorithms), simple field insertion and line doubling, or high-end, NXP edge-dependent de-interlacing (EDDI) algorithm
  Filtering: Programmable up to 6-tap polyphase filters
  Color/Formats: Variable color space conversion; conversions between 4:2:0, 4:2:2 and 4:4:4; color-key and alpha processing
  Performance: Up to 120 Mpix/s

Video output unit (QVCP)
  Data formats: 24- or 30-bit full parallel RGB or YUV; 16- or 20-bit Y and U/V multiplexed data; 8- or 10-bit 656 (full D1, 4:2:2 YUV); 8- or 10-bit 4:4:4 format in 656-style with RGB or YUV
  Video resolutions: TFT LCD W-XGA (1344 x 768 60p); SD/HD up to 1920 x 1080 60i
  Clock rates: Up to 81 MHz
  Functions: 2-layer compositing, picture quality enhancements, gamma correction, horizontal 10-tap scaling, genlock mode

Audio input & output units (AI & AO)
  Sample size: 8 channels, 16- or 32-bit samples per channel
  Sample rates: Programmable with 0.001-Hz resolution; maximum is application dependent
  Data formats: 16-bit (mono and stereo), 32-bit (mono and stereo), PC standard memory data format
  Clock source: Internal or external
  Native protocol: I2S over serial 6-wire protocols

SPDIF input & output units (SPDI & SPDO)
  Sample size: 6 channels, 16 or 24 bits per channel
  Bit rate: Up to 40 Mbits/s raw mode
  Native protocol: IEC-958, 1 wire

2D drawing engine
  Functions: Solid fills, 3-operand bitblt, lines, monochrome data expansion, 256-level alpha bitblt (to blend 2 images), anti-aliased lines and fonts
  Formats: 8-, 16-, and 32-bit/pixel

Variable length decoder unit (VLD)
  Functions: Parse MPEG-1 and -2 elementary bitstreams, generate run-level pairs, fill macroblock headers

DVD descrambler unit (DVDD)
  Functions: Authentication and descrambling

Memory system
  Speed: Up to 200 MHz (1.6 GB/s)
  Memory size: 8 to 256 MB
  Supported types: 64 to 512 Mbit DDR SDRAM devices
  Width: 16- or 32-bit bus
  Signal levels: 2.5 V SSTL-II

Ethernet MAC
  Interface(s): 10/100 IEEE 802.3, MII, RMII
  Functions: Real-time traffic, QoS

PCI/XIO bus interface
  Width: 32-bit data, 32-bit address space
  Speed: 33-MHz PCI 2.2 interface with integrated PCI bus arbiter up to 4 masters
  Voltage: 3.3 V (5 V tolerant)
  Functions: PCI master and slave; 8- and 16-bit NAND or NOR Flash memories; IDE controller

I2C interface
  Modes: Master and slave
  Addressing: Up to 10-bit
  Speed: standard 100 kHz; fast 400 kHz

Timers
  Number: 4
  Sources (prescaled): CPU clock, data or instruction breakpoints, cache events, video I/O clocks, audio I/O word strobe

GPIO
  Dedicated pins: 16
  Functions: Software I/O, external interrupt, universal RC blaster, clock source/gate for system event timers/counters, emulating high-speed serial protocols
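The memory-system figures can be cross-checked with a little arithmetic. The sketch below assumes the usual DDR behavior of two data transfers per clock, which is consistent with the brochure's statement that the 32-bit interface at 200 MHz equals a 64-bit SDR interface at the same clock and with the 400 MHz data rate quoted in the key features. The function name is hypothetical, and the result is a theoretical peak rather than a sustained rate.

```c
#include <assert.h>
#include <stdint.h>

/* Theoretical peak bandwidth of a DDR bus in bytes per second,
 * assuming two transfers per clock cycle. Hypothetical helper for
 * cross-checking the brochure's numbers, not an NXP API. */
uint64_t ddr_peak_bytes_per_sec(uint32_t bus_width_bits, uint64_t clock_hz)
{
    assert(bus_width_bits % 8 == 0);
    uint64_t bytes_per_transfer = bus_width_bits / 8;
    return bytes_per_transfer * 2u * clock_hz;
}
```

With these assumptions, `ddr_peak_bytes_per_sec(32, 200000000)` gives 1,600,000,000 bytes/s, matching the quoted 1.6 GB/s. The same calculation for the 16-bit bus option gives 800,000,000 bytes/s, a derived figure that the brochure does not state.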

Use of this product in any manner that complies with the MPEG-2 Standard is expressly prohibited without a license under applicable patents in the MPEG-2 patent portfolio, which license is available from MPEG LA, L.L.C., 250 Steele Street, Suite 300, Denver, Colorado 80206.

Dolby and Dolby AC-3 are registered trademarks of Dolby Laboratories. Other brands and product names are trademarks or registered trademarks of their respective owners.

© 2007 NXP B.V. All rights reserved. Reproduction in whole or in part is prohibited without the prior written consent of the copyright owner. The information presented in this document does not form part of any quotation or contract, is believed to be accurate and reliable and may be changed without notice. No liability will be accepted by the publisher for any consequence of its use. Publication thereof does not convey nor imply any license under patent- or other industrial or intellectual property rights.

Date of release: February 2007
Document order number: 9397 750 15926
Printed in the USA