Matrox Odyssey eCL/XCL Leading-Edge Vision Processor Board
Evolutionary architecture with leading-edge performance

Matrox Odyssey eCL/XCL is a fourth-generation vision processor board that optimally combines the latest off-the-shelf and custom technologies within an established architecture to deliver leading-edge performance and value. Designed with demanding semiconductor inspection, medical imaging, print inspection, surface inspection and signal processing applications in mind, the Matrox Odyssey eCL/XCL is the ideal choice for applications with data acquisition and processing rates on the order of hundreds of MB per second and/or where the PC is heavily loaded with other system activities.

The premier embedded microprocessor, state-of-the-art proprietary processor and router ASIC, DDR memory, and PCIe™/PCI-X® connectivity come together on the Matrox Odyssey eCL/XCL to provide unrivaled power for a single vision processor board. All this power and flexibility is accessed through an easy-to-learn programming environment compatible with Matrox Imaging’s previous generation vision processor and incorporating elaborate image processing and analysis algorithms.

Key features

- x4 PCIe™ (eCL) or PCI-X® (XCL) card
- G4 PowerPC™ and proprietary ASIC combine for over 130 BOPS¹
- over 5 GB per second of memory bandwidth
- 512 MB of DDR SDRAM memory
- integrated Camera Link® frame grabber acquires up to 680 MB per second
- up to 1 GB per second of I/O bandwidth to host PC
- available software is sold separately and includes Matrox Imaging Library (MIL)² and Matrox Odyssey Developer’s Toolkit
- host OS support for 32/64-bit Microsoft® Windows® XP/Vista®/7 and 32-bit Linux®
- royalty-free redistribution of MIL and ONL run-time environments²

State-of-the-art Matrox Oasis ASIC

The Oasis ASIC, designed by Matrox Imaging, is the pivotal component of the Matrox Odyssey eCL/XCL. A high-density chip, the Matrox Oasis integrates a CPU bridge, Links Controller, main memory controller and Pixel Accelerator.

[Block diagram: Matrox Odyssey eCL/XCL - dual Base version. Labeled blocks: two MDR26 Camera Link® connectors, each with a 28-bit ChannelLink™ receiver, LVDS buffers (serial Tx/Rx and camera control), a UART, a PSG and video LUTs (2 x 256 x 8-bit and 2 x 4K x 12-bit); Matrox Oasis; G4 PowerPC™ (MPC7447A), 64-bit, up to 1.3 GB/s; 512 MB DDR SDRAM, 128-bit, up to 5.3 GB/s; flash EEPROM; LINK 0 and LINK 1, with a 64-bit path labeled up to 1 GB/s; PCI-X® bridge, up to 800 MB/s; PCI-X® to PCIe™ (eCL) or PCI-X® (XCL) bridge; host interface x4 PCIe™ (eCL, up to 1 GB/sec) or 5V/3.3V 64-bit PCI/PCI-X® (XCL, up to 1 GB/sec); auxiliary signals on DB-44 and DB-9*: Aux. I/O (6), Clock (2), Hsync (2), Vsync (2), Aux. In (4), Aux. Out (4) and Aux. In (4), through TTL transceivers, LVDS transceivers and opto-couplers.]

[Block diagram: Matrox Odyssey eCL/XCL - single Full version. Same labeled blocks and bandwidth figures as the dual Base version, except that the two MDR26 connectors feed three 28-bit ChannelLink™ receivers and there is a single UART and a single PSG.]

* Present on a separate bracket
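The bandwidth figures quoted in the key features and block diagrams can be cross-checked from bus widths and clock rates. The following is a minimal Python sketch, not Matrox code. The 128-bit DDR interface at 167 MHz, the 64-bit PCI-X® bus at 133 MHz and the x4 PCIe™ 1.x link are taken from this datasheet (the memory controller and host bus interface sections). The 85 MHz Camera Link® pixel clock and the helper function names are assumptions made for illustration.

```python
# Peak-bandwidth arithmetic for the buses named in this datasheet.
# Values marked "datasheet" come from the text; values marked "assumption"
# are not stated in the datasheet and are only illustrative.


def parallel_bus_mb_per_s(width_bits, clock_mhz, transfers_per_clock=1):
    """Peak rate of a parallel bus in MB/s (1 MB = 10**6 bytes)."""
    return width_bits / 8 * clock_mhz * transfers_per_clock


def pcie_gen1_mb_per_s(lanes):
    """Peak rate per direction of a PCIe 1.x link in MB/s.

    Each lane signals at 2.5 GT/s; 8b/10b encoding leaves 8 payload bits
    out of every 10 bits transferred, i.e. 250 MB/s per lane.
    """
    return lanes * 2500 * 8 / 10 / 8


# DDR SDRAM: 128-bit interface at 167 MHz (datasheet); DDR moves data on
# both clock edges, hence two transfers per clock.
ddr = parallel_bus_mb_per_s(128, 167, transfers_per_clock=2)

# PCI-X 1.0a: 64-bit connection at up to 133 MHz (datasheet).
pcix = parallel_bus_mb_per_s(64, 133)

# Four-lane PCIe 1.x (datasheet).
pcie = pcie_gen1_mb_per_s(4)

# Camera Link Full configuration: 64 data bits per pixel clock.
# The 85 MHz pixel clock is an assumption, not stated in this datasheet.
camera_link_full = parallel_bus_mb_per_s(64, 85)

for name, rate in [
    ("DDR SDRAM", ddr),
    ("PCI-X 64-bit/133 MHz", pcix),
    ("PCIe 1.x x4", pcie),
    ("Camera Link Full", camera_link_full),
]:
    print(f"{name}: {rate:.0f} MB/s")
```

Under these assumptions the sketch gives 5344 MB/s for the DDR interface, 1064 MB/s for PCI-X®, 1000 MB/s for x4 PCIe™ and 680 MB/s for Camera Link® Full, which agrees with the "up to 5.3 GB/s", "up to 1 GB/sec" and "680 MB per second" figures quoted above. These are theoretical peaks; sustained rates depend on protocol overhead and traffic patterns.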
Pixel Accelerator

The Pixel Accelerator (PA) is a parallel processor core, which considerably accelerates neighborhood, point-to-point and LUT mapping operations. It consists of an array of 64 processing elements all working in parallel. Each processing element has a multiply-accumulate (MAC) unit and an arithmetic-logic unit (ALU).

The MAC unit is capable of performing a single 16-bit by 16-bit, two 8-bit by 16-bit or four 8-bit by 8-bit multiplies with 40-bit accumulation per cycle for convolution operations. The 40-bit accumulator guarantees that no overflow occurs for a 16 by 16 kernel with 16-bit coefficients and data. In addition, the PA architecture allows symmetrical kernels to be processed four times faster. The MAC unit is also able to perform up to four minimum or maximum operations per cycle for grayscale morphology operations.

The ALU can execute a wide variety of arithmetic and logical operations. It can be programmed to execute a sequence of 256 instructions per pixel at one instruction per cycle, reducing the number of memory accesses and further accelerating memory I/O-bound sequences. The PA can accept up to four source buffers³ and output to four destination buffers, allowing several operations to be performed at once or in a single pass (e.g., four images can be averaged in one pass). Operating at a core frequency of 167 MHz enables the PA to carry out up to 100 BOPS¹ (i.e., process over two billion pixels per second).

Memory controller

The Matrox Oasis includes a very efficient main memory controller for managing the 128-bit wide interface to DDR SDRAM memory. Operating at 167 MHz, the DDR SDRAM memory and controller combine to deliver a memory bandwidth in excess of 5 GB per second. Such ample memory bandwidth allows the Matrox Odyssey eCL/XCL to comfortably handle demanding video I/O while maintaining PA performance even for memory I/O-bound operations.

Links Controller

The Links Controller (LINX) is the router that manages all data movement inside and outside the processing node, which consists of the PA, CPU and main memory. It can handle several concurrent video and message streams.

Video streams are used to transfer image data from the integrated frame grabber to the processing node and from the processing node to the host PC, including display. The video streams have adjustable priority levels, either above or below message streams. Video streams can be subject to various formatting operations including plane separation on input and merging on output, input cropping, input and output sub-sampling (1 to 16), and independent control of horizontal and vertical scanning direction. The latter is particularly useful for reconstructing a proper image from a camera whose readout requires multiple taps, each with different scanning directions.

Message streams are for all types of inter-processor communications. The LINX handles message streams between the processing node and the host PC independently of video streams. Message passing relies on hardware-assisted mechanisms for low overheads and real-time operation. Together, the above capabilities off-load the CPU and PA from data management tasks so they can focus on image processing tasks.

Best-of-breed Freescale™ G4 PowerPC™ microprocessor

The CPU that controls activities on the Matrox Odyssey eCL/XCL and performs operations not accelerated by the PA is the Freescale™ G4 PowerPC™ microprocessor. The G4 combines the best features of a general purpose CPU and a DSP, and provides top performance at a given clock rate. The G4 is also backed by Freescale™’s solid migration path for increased performance, all the while maintaining code compatibility.

The G4 incorporates a powerful 32-bit superscalar RISC core and AltiVec™ technology’s 128-bit vector execution unit. 512 KB of internal L2 cache helps sustain maximum processor performance. A 64-bit MPX bus offers efficient access to main memory and provides a sustained bandwidth close to its theoretical maximum of 1.3 GB per second.

AltiVec™ technology is specifically designed to meet the heavy computational requirements of applications such as video and image processing. This technology consists of a high-performance parallel processing engine for vector data. It uses the SIMD (single instruction, multiple data) model to process, in parallel, up to 16 pixels per cycle. It delivers a peak processing power of 16 billion 8-bit MACs per second or 8 billion 32-bit floating point operations per second when running at 1 GHz. Additionally, AltiVec™ technology operates concurrently with other execution units within the G4.

Choice of high-performance host bus interfaces

Four lane (x4) PCIe™ and PCI-X® are the interfaces used to connect to the host PC on the Matrox Odyssey eCL and Matrox Odyssey XCL boards respectively. PCIe™ is the follow-on to conventional PCI and PCI-X®. Version 1.x of PCIe™ operates at 2.5 GHz to deliver a peak bandwidth of 1 GB/sec over a x4 implementation. PCI-X® is a high-performance backwards-compatible enhancement to conventional PCI. Version 1.0a of PCI-X® specifies a 64-bit physical connection running at speeds of up to 133 MHz, resulting in a peak bandwidth of up to 1 GB per second.

Flash EEPROM for full autonomy

Matrox Odyssey eCL/XCL has a flash EEPROM that stores the G4 PowerPC™ boot sequence, system initialization parameters and a debugging utility.
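The Pixel Accelerator section states that the 40-bit accumulator cannot overflow for a 16 by 16 kernel with 16-bit coefficients and data. That claim can be checked with worst-case arithmetic. The Python sketch below is illustrative only: the datasheet does not say whether operands are signed or unsigned, so both cases are evaluated, and the function names are hypothetical.

```python
# Worst-case accumulation for a 16 x 16 convolution kernel with 16-bit
# coefficients and 16-bit data, checked against a 40-bit accumulator.
# Signedness is an assumption: the datasheet does not specify it.

KERNEL_TAPS = 16 * 16   # products summed per output pixel
OPERAND_BITS = 16       # width of each coefficient and each data value
ACC_BITS = 40           # accumulator width stated in the datasheet


def worst_case_sum(taps, operand_bits, signed):
    """Largest magnitude that a sum of `taps` products can reach."""
    if signed:
        # The most negative value multiplied by itself is the largest product.
        largest_product = (1 << (operand_bits - 1)) ** 2
    else:
        largest_product = ((1 << operand_bits) - 1) ** 2
    return taps * largest_product


def fits(value, acc_bits, signed):
    """True if a non-negative `value` fits in an `acc_bits`-bit accumulator."""
    limit = (1 << (acc_bits - 1)) - 1 if signed else (1 << acc_bits) - 1
    return value <= limit


unsigned_sum = worst_case_sum(KERNEL_TAPS, OPERAND_BITS, signed=False)
signed_sum = worst_case_sum(KERNEL_TAPS, OPERAND_BITS, signed=True)

print("unsigned:", unsigned_sum.bit_length(), "bits,",
      "fits" if fits(unsigned_sum, ACC_BITS, signed=False) else "overflows")
print("signed:  ", signed_sum.bit_length(), "bits plus sign,",
      "fits" if fits(signed_sum, ACC_BITS, signed=True) else "overflows")
```

In the unsigned case the worst-case sum is 256 x 65535², which needs exactly 40 bits. In the signed case it is 2³⁸, which needs 39 bits plus a sign bit. Both fit, consistent with the datasheet's statement. A 17 by 17 kernel with unsigned operands would exceed 40 bits, so 16 by 16 is the largest square kernel with this guarantee under the unsigned assumption.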