ZZ0_Index.qxp 3/15/2005 3:45 PM Page 419

Index

A

ABEL, 21
  design file, 22
abstraction, 8
, 3
Advanced Boolean Expression Language. See ABEL
algorithm, 5
  acceleration, 25, 41
  concurrency, 5
  partitioning, 5
  validation, 36
, 3, 23
  SOPC Builder, 178
ANSI C, 94, 129
application
  bottleneck, 14
  characteristics, 42
  domains, 41
  monitoring, 14, 98, 158
  parallelism, 14
  prototyping, 11
  structure of, 56
  supercomputing, 301
Application Monitor, 95
arithmetic operation, 129
ARM, 7, 26
array data, 130
  splitting, 140-144, 199-201, 227
  two-dimensional, 131
Assisted Technology, 21
Avalon bus, 26, 83

B

bandwidth considerations, 6, 14, 40
Base System Builder, 178
basic blocks, 139
behavioral simulation, 12
behavioral synthesis, 13
Bekker, Scott, 325
Birkner, John, 19
bitmap image file, 213
block, C language, 121
BMP format, 285
buffer width, 64

C

C++, 4
C language, 4, 13
  analysis, 104
  constraints for hardware, 129
  debugger, 123
  expanded source, 121


  for hardware design, xv
  loop unrolling, 106
  optimizer, 127
  pointer, 130
  preprocessing, 104
  programming, 11, 195
  semantics, 135
Celoxica, 42
Center for Computational Biology, 326
Chameleon, 28
CHAR_TYPE, 66
Chua, H. T., 19
CISC, 25
clock
  connecting, 185
  dual, 177
  edge, 116
  generation, 109
  operating frequency, 124, 134
  rate, 109
  secondary, 185
  skew, 124
co.h, 51
co_architecture_create, 57, 342-343
co_bit_extract, 343-344
co_bit_extract_u, 344-345
co_bit_insert, 346-347
co_bit_insert_u, 347-348
co_err_already_open, 66
co_err_not_open, 66
co_execute, 52, 54, 348-349
co_initialize, 51, 104, 349-350
co_memory_blockwrite, 221
co_memory_create, 77, 350-351
co_memory_ptr, 351-352
co_memory_readblock, 78, 352-353
co_memory_writeblock, 353-354
co_parameter, 109
co_par_break, 127, 354
co_process_config, 57, 108, 355-356
co_process_create, 56, 62, 356-357
co_register_create, 357-358
co_register_get, 358
co_register_put, 359
co_register_read, 359-360
co_register_write, 360-361
co_signal_create, 361-362
co_signal_post, 362-363
co_signal_wait, 74, 115, 363
co_stream_close, 53, 113, 364
co_stream_create, 57, 64, 365-366
co_stream_eos, 68, 366
co_stream_open, 53, 367
co_stream_read, 368-369
co_stream_read_nb, 296, 369-370
co_stream_write, 53, 67, 370-371
coarse-grained, 37
column generator, 223
Common Universal tool for Programmable Logic. See CUPL
communicating sequential processes. See CSP
compiler
  Impulse C, 104-108
  processing flow, 103
complex datatypes, 130
concurrency, 5, 12, 38
configuration subroutine, 56, 62
consumer, 55
  process, 53, 96-97
cosim_logwindow_create, 97, 372
cosim_logwindow_fwrite, 96, 372-373
cosim_logwindow_init, 373-374
cosim_logwindow_write, 374
Cray, Inc., 33, 301
CSP, 39-41
  application, 40
  bandwidth considerations, 40
  idealized, 40
  programming model, 42
C-to-hardware, 27
CUPL, 21
cycle-accurate simulation, 14, 123

D

data
  dependencies, 139


  bandwidth, 14, 40
Data General Eclipse, 18
Data I/O Corporation, 21
data movement, analyzing, 147
dataflow, 14, 42, 65
datatype, 59
DDR, 317
deadlocks, 69
  and pipelining, 72-73
  avoiding, 69-73
debugging
  hardware, 119-125
DES encryption, 149
design
  methods, 7
  prototyping, 9
desktop simulation, 60, 97
Digital Equipment Corporation, 33
digital signal processing. See DSP
diode matrix, 18
direct memory access. See DMA
DMA
  as alternative to streaming, 222
  input process, 222-223
  transfer, 223
DSP
  applications, 8, 13
  programmers, 11
dual clock, 185
  generating, 109
dynamically reconfigurable, 4
  computing, 25, 28, 303

E

EDA, 12
edge detection, 210
EDIF netlist, 106
embedded processor, 5, 163
  advantages of, 311-312
  as a test generator, 165-167
  disadvantages of, 312-313
  MicroBlaze, 25, 165, 178
  Nios, 48, 231
  PowerPC, 25, 48, 167, 311, 317
  soft versus hard, 310
encryption, DES, 147
end-of-stream, 67
examples
  DES encryption, 147-161, 168-208
  FIR filter, 87-101
  fractal object generation, 280-299
  HelloFPGA, 50-57
  image filter, 209-255, 259-277
  SERDES interface, 326-332
  uClinux, 259-277
exporting
  from the Impulse tools, 183
  generated hardware and software, 262

F

Fast Simplex Link. See FSL
fetch-and-execute, 35
field upgrades, 23
field-programmable, 2
  gate array. See FPGA
  logic array. See FPLA
FIFO, 60, 64, 209, 331
FIR filter
  coefficients, 88
  consumer process, 97
  expanded source code, 121
  hardware generation, 104
  performance, 91
  producer process, 94
  source code, 89
  test bench, 90
  window size, 87
fixed-point, 42
  conversion, 285-286
fixed-width integer, 42
Fletcher, Bryan H., 309
floating point, 130
fMax, 129
FPGA, 2, 17
  as high-performance computer, 302-305
  as parallel computing machine, 27, 35


  bitmap, 4, 191, 267
  compiler tools, 35
  computing, 133-134, 301-307
  embedded processor, 310
    MicroBlaze, 25, 165, 178
    Nios, 48, 231
    PowerPC, 25, 48, 167, 311, 317
  design philosophy, xvii
  history of, 17
  netlist, 106
  operating frequency, 319
  place-and-route, 10, 106
  platforms, xix, 5
  synthesis, 127
FPLA, 18
fractal
  accuracy of, 282
  geometry, 280
  objects, 280
FSL, 85, 160, 165, 201, 223, 259
  connections, 266
function table, 19
fusible link, 18

G

gate array, 2
gate delays, 124, 127
gcc, 45
General Electric, 18
Generate Options dialog, 262
generic, VHDL, 109
geophysics, 41
Gokhale, Maya, 46
grid computing, 326, 332

H

hand-crafted HDL, 13
Handel-C, 42
hardware, 106
  acceleration, 311, 312, 321-322
  accelerator, 163
  analysis, 159
  compiler, 13, 129
  description, 14
  description language. See HDL
  development, 11
  engineer, 12
  generation, 125, 156-159
  platform, 4
  process, 41, 58, 59-63
    synchronization of, 59, 65
    communication with, 51
    constraints of, 58
  prototype, 160
  simulation, 91
  synthesis, 159
hardware/software
  applications, 15
  interface, 14
  partitioning, 12
  solution, 13
Harris Semiconductor, 17
HDL, 4
  generation from C, 108-109
  simulation, 116
  top-level module, 108
Hello FPGA, 50
heterogeneous parallelism, 37
high-performance computing, 2
history
  of programmable platforms, 17
Hoare, Anthony, 39
Hyperterminal, 269

I

IBM, 17
IDE, 156
image data
  reading and writing, 213
image filter, 210
  partitioning, 219
image processing, 41
Impulse C, 45
  and ANSI C, 42
  datatypes, 59
  for streaming applications, 42


  library, 49, 341-373
    co_architecture_create, 57, 342-343
    co_bit_extract, 343-344
    co_bit_extract_u, 344-345
    co_bit_insert, 346-347
    co_bit_insert_u, 347-348
    co_err_already_open, 66
    co_err_not_open, 66
    co_execute, 52, 54, 348-349
    co_initialize, 51, 104, 349-350
    co_memory_blockwrite, 221
    co_memory_create, 77, 350-351
    co_memory_ptr, 351-352
    co_memory_readblock, 78, 352-353
    co_memory_writeblock, 353-354
    co_parameter, 109
    co_par_break, 354
    co_process_config, 57, 108, 355-356
    co_process_create, 56, 62, 356-357
    co_register_create, 357-358
    co_register_get, 358-359
    co_register_put, 359
    co_register_read, 360
    co_register_write, 360-361
    co_signal_create, 361-362
    co_signal_post, 362-363
    co_signal_wait, 74, 115
    co_stream_close, 53, 113, 364
    co_stream_create, 57, 64, 365-366
    co_stream_eos, 68, 366
    co_stream_open, 53, 367-368
    co_stream_read, 368-369
    co_stream_read_nb, 296, 369-370
    co_stream_write, 53, 67, 370-371
  minimal program, 50-57
  motivation behind, 47
  origin, 46
  parameter, 63
  programming model, 42, 48-50
  simulation library, 60
inline function, 129
INMOS, 34
input rate, 126
input stream, 67
instruction scheduling, 47, 128, 135-136
instruction stages, 139
instruction-set simulators, 14
in-system debugging, 10
INT_TYPE, 66
integer
  datatypes, 129
  division, 129
integrated development environment. See IDE
, 17
  4004, 17
  8051, 25
Intersil Corporation, 18
IP core, 25
IPFlex, 28

J

Java, 4
JTAG, 108, 159

K

Kernighan, Brian, 50

L

language-based design, 12
latency, pipeline, 126
Lattice, 3
legacy
  algorithm, 36
  C code, 147
  programming, 14
Linux. See uClinux operating system
load balancing, 280
locality, 40
log window
  creating, 99
  initializing, 99
  writing to, 99
logic
  equation, 20
  synthesis, 23, 106


loop
  considerations, 199
  pipelining, 47, 106, 125, 204-207
  unrolling, 126, 141-142, 203-204
Los Alamos National Laboratories, 46

M

machine model, 40
  MIMD, 33
  shared memory, 34
  SIMD, 33
  SISD, 32
  von Neumann, 32
macro interface, stream, 174-175
main function, 51
Mandelbrot
  image generation, 279-299
  set, 279
Mandelbrot, Benoit, 280
Memec, 156, 169, 259, 309
  V2MB1000 board, 169, 259
memory, 47, 58. See also shared memory
  access, impacts of, 128
  accessing, 200
  block read and write, 139
  controller, 312-313
  for data communication, 78
  embedded, 81
  external, 81, 310, 317
  reducing accesses to, 139
  usage, 316
message passing interface. See MPI
MicroBlaze, 25, 165, 178-193, 201, 223, 315
  Development Kit, 156
  stream performance, 85
MIMD, 33
MinGW library, 94
mixed processor design, 7
model
  machine, 40
  programming, 31, 32, 47
    communicating processes, 41
    Impulse C, 41-43
    streams-oriented, 42
monitoring, 99
  functions, 63
Monolithic Memories, 19
Montana State University, 326
Moore's Law, 25
Morphics, 28
MPI, 38
multicomputer, 33
multitasking, 32
multithreaded, 32

N

National Institutes of Health, 326
National Science Foundation, 326
DS92LV16, 326
neural data, 326
Nios, 48
  stream performance, 83
Nios II, 25, 231

O

O_RDONLY, 66, 115
O_WRONLY, 67, 114
obsolescence mitigation, 311
Occam, 34, 36
on-chip interface, 26
On-chip Peripheral Bus. See OPB
OPB, 26, 84, 317
  timer, 187, 188
operating frequency, maximum, 124
operating system
  embedded, 257-277
  uClinux, 257-277
optimization, 137-139
  C code, 195
  expression-level, 137-139
  level, 315
  through experimentation, 323
  within basic blocks, 139
optimizer operation, 135-139


  Stage Master, 135
output streams, 66

P

PACT, 28
PAL, 19
  Assembler, 19
PALASM, 19
parallel
  computing, 27
  processes, 31
  programming, xvii, 15
parallelism
  coarse-grained, 36
  extreme levels of, 302
  programming for, 36, 38
  spatial, 303
  statement-level, 87, 133-145, 290
  system-level, 87, 219, 290
partitioning, 14
  system-level, 219, 290
PCI, 307
Pentium, 7
peripheral
  integration, 25
Photoshop, 307
PIC processor, 25
picoChip, 28
pipeline
  generation, 136-137, 206
  goal, 205-208
  hardware size, 207
  loop, 204
  performance, 145, 207
  rate, 144-145
  system-level, 113, 219-231
PIPELINE pragma, 106
pipelining, 106, 142-145
pixel stream, 210
place-and-route, 10, 106
platform, 4
  FPGA-based, xix, 4-5
  selection, 169-170
Platform Studio, 26, 178, 264
Platform Support Package, 157, 261
PLB, 84
Posix, 60
potassium channels, 326
PowerPC, 25, 48, 167, 311, 317
pragma
  PIPELINE, 106, 125, 136, 214
  StageDelay, 126, 139
  UNROLL, 125, 137, 203, 214
process, 41, 58-63
  synchronization of, 12, 59, 65
  understanding, 59-63
  run function, 53
processing
  elements (PEs), 38
  machine, 38
  model, 32
processor
  as test generator, 37
  benchmarks, 313
  considerations, 167
  core, 25
  embedded, 40, 309-323
  hard core, 167
  MicroBlaze, 25, 165, 178
  performance, 313
  peripherals, 312-313
  soft, 37, 310
Processor Local Bus. See PLB
producer, 51
  process, 51, 94-96
programmable
  array logic. See PAL
  hardware platform, 4, 26
  logic, 2, 17
  origins of, 18-23
programming
  abstraction, 12-16
  model, 31, 32, 47
    communicating processes, 41
    Impulse C, 41-43
    streams-oriented, 42
prototyping, 7, 9


  of hardware, 16
  rapid, 27

Q

Queensland, University of, 258
Quicklogic, 3
Quicksilver, 28

R

Radiation, Inc., 18
rapid prototyping, 27
rate
  introduction, 126
  pipeline, 126
, 25, 28, 303
recursive function call, 129
register
  overview of, 74-76
Register-Transfer-Logic. See RTL
reprogrammability, 23
reset, 109
RISC, 25, 26
Ritchie, Dennis, 50
RTL, 12, 49
  design, 9
  simulation, 168

S

SDRAM, 317
secondary clock, 185
SERDES
  for data streaming, 327-329
  handshaking, 328
  initializing, 328
  synchronization pattern, 328
  transceiver, 329
serial interface, 326-327
serializer/deserializer. See SERDES
shared memory, 49, 76
  performance considerations, 81-86
  using, 76-78
shift operand, 130
signal
  creating, 73-74
  interface, 113
  overview of, 73-74
  posting, 74
  value, 74
  wait mode, 115
  waiting for, 74
signed type, 59
Corporation, 18
Silicon Graphics, 301
SIMD, 33
simulation
  consumer process, 53
  cycle-accurate, 123
  DES encryption, 155-156
  desktop, 60
  hardware, 116
  library, 60
  producer process, 51
  software, 155-156
  source-level debugging, 121
  test bench, 50
  tools, 10
  VHDL, 116
simulator
  hardware, 168
SISD, 32
Snider, Dr. Ross, 326
soft processor, 37, 48
software
  acceleration, xix
  process, 106
  simulation, 155-156
  test bench, 50
software-based methods, 9
solution space, 12
SOPC Builder, 26, 178
SP box, 149, 200
Spartan-3, 317
spatial parallelism, 303
SRAM, 317
stage, 121
  delay, 126


Stage Master, 126, 127
StageDelay pragma, 126, 139, 142
standard processor, 8
state machine, 18
  generated from C, 120
  SERDES interface, 330
stdio.h, 51
stream, 63-66
  closing, 66
  custom interface, 325-332
  datatype, 67
  deadlocks, 69
  for input, 67
  for output, 66
  hardware, 64
  I/O, 65-68
  interface, 43, 109, 112
  macro interfaces, 174-175
  mode
    read, 115
    write, 114
  nonblocking, 71
  opening, 54
  overview of, 63-66
  parameters, 109
  performance, 201-202
    considerations, 81-86
  protocol, 113
  read mode, 115
  reading, 68
  write mode, 114
Streams-C, 46
struct, 130
structured ASIC, 7
SUIF, 46
supercomputing, 301
synchronization
  pattern, 328
  process, 12, 59, 65
synthesis, 58
system
  architect, 5
  integration, 5, 306
  on a programmable chip, 25-27, 37
SystemC, 15, 94
system-level pipeline, 113, 219-231

T

target platform
  limitations of, 103
technology mapping, 106
test bench, 16, 97, 155
  embedded, 163-194
test fixture, 97
test generator, 165
test vector, 20
, 17
TFTP, 272
Thinking Machines, 33
thread programming, 60, 63
threads, 38
TIFF format, 259
timed C, 61
tools, role of, 8-9
Transputer, 33, 34
TTL device, 19

U

UART, 25
uClinux operating system
  FTP client, 259, 274
  kernel image, 269
  overview, 257-259
  RAM disk, 259
UINT_TYPE, 66
union, 130
unit test, 165
UNROLL pragma, 106, 203
unsigned type, 59
untimed C, 61

V

V2MB1000 board, 169
value engineering, 7
VAX 11/750, 18
, xviii, 4, 15


VHDL, xviii, 4, 15, 49
video stream, 210
Virtex-4, 167
Virtex-II, 156, 326
Virtex-II Pro, 167
Visual Studio, 45, 149, 157
von Neumann, John, 32

W

Williams, John, 258
Windows
  bitmap format file, 282
  GDI, 213
wireless communications, 41

X

xil_printf, 315
, 23
  EDK tools, 178, 264
  MicroBlaze, 25, 48, 165, 178-193, 201, 223
  Platform Studio, 262
  Spartan-3, 317
  Virtex-4, 167
  Virtex-II, 156, 326
  Virtex-II Pro, 167
Xygwin shell, 271

Z

Zilog Z-80, 25