A Highly Optimized Arithmetic Software Library and Hardware Co-processor IP for Fixed-Point VLIW-SIMD Processor Architectures

TETRACOM: Technology Transfer in Computing Systems

FP7 Coordination and Support Action to fund 50 technology transfer projects (TTP) in computing systems.

This project has received funding from the European Union's Seventh Framework Programme for research, technological development and demonstration under grant agreement n° 609491.

A Highly Optimized Arithmetic Software Library and Hardware Co-processor IP for Fixed-Point VLIW-SIMD Processor Architectures

Lukas Gerlach, Stephan Nolting, Holger Blume and Guillermo Payá Vayá, Leibniz Universität Hannover, Germany
Hans-Joachim Stolberg and Carsten Reuter, videantis GmbH, Hannover, Germany

TTP Problem

Performance requirements are pushing the limits of embedded multimedia systems:
  • Often used non-linear complex mathematical functions require a lot of computational power: sin(), cos(), atan(), sqrt(), ln(), exp(), div(), pow().

Area and energy efficiency is restricted for embedded multimedia systems:
  • Area and energy optimized computation by using specific arithmetic evaluation software libraries or hardware accelerators.

TTP Solution

Software-based solution: mathematical software library (LibARITH)

Optimized for VLIW-SIMD processors:
  • Exploiting data and instruction level parallelism

Advantages:
  • High flexibility
  • High accuracy
  • Fast computation compared to other approximation algorithms
  • Reduced memory requirement compared to look-up-table interpolation

Hardware-based solution: CORDIC (Coordinate Rotation Digital Computer)

Scalable co-processor architecture:
  • M CORDIC modules in series are incorporated to process M CORDIC iterations per clock cycle.
  • Data level parallelism (SIMD) is supported.

[Figure: CORDIC processing element. Inputs: config, x, y, z, start, N. Blocks: optional pre-processing stage (scale), registers, iteration controller, M CORDIC modules, angle table, scale factor table, optional post-processing stage (scale).]

[Figure: Maximal relative error per number of CORDIC iterations for 32-bit fixed-point values. X axis: CORDIC iterations (1 to 32). Y axis: maximal relative error, logarithmic scale (10 down to 1E-08).]

[Figure: CORDIC co-processor area with and without SIMD support for different numbers of CORDIC modules. X axis: M CORDIC modules (M=1, M=2, M=4). Y axis: area [μm²] (0 to 30000). Series: w/o SIMD, SIMD. Synthesized with a 40 nm low-power technology for an operating frequency of 200 MHz.]

TTP Impact

Hyperbolic and trigonometric operations (32-bit), in cycles

Function  HW: VLIW-SIMD+CORDIC w/o SIMD (*)  SW: VLIW-SIMD (SW CORDIC) (*)  SW: TI TMS320C6748
sin()     55+30+32+18 = 135                  55+30+408+53 = 546             2474
cos()     55+30+32+18 = 135                  55+30+408+18 = 511             1423
atan()    55+10+32+5 = 102                   55+10+408+5 = 478              3759
div()     55+17+32+9 = 113                   55+17+408+9 = 489              152
exp()     55+28+32+17 = 121                  55+28+408+17 = 508             2557
ln()      55+15+32+12 = 114                  55+15+408+12 = 490             2721
sqrt()    55+18+32+9 = 114                   55+18+408+9 = 490              311
pow()     51+49+32+27 = 159                  51+49+408+27 = 535             4176

(*) Notation for VLIW: table-configuration + pre-processing + CORDIC-core-iterations + post-processing

  • The CORDIC co-processor with SIMD support computes multiple independent results within the same number of CORDIC-core-iterations.
  • The number of CORDIC modules linearly decreases the needed number of CORDIC-core-iteration cycles for the same precision.
  • The area of the CORDIC co-processor without SIMD accounts for about 7% of the area of a reference VLIW-SIMD processor architecture.

TTP Facts

Contact: Jun.-Prof. Dr.-Ing. Guillermo Payá Vayá
E-mail: [email protected]‐hannover.de
TETRACOM contribution: 25.000 €
Duration: 1/1/2016-31/7/2016
TETRACOM coordinator: Prof. Rainer Leupers, [email protected]‐aachen.de
http://www.tetracom.eu | @TetracomProject
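
To make the CORDIC iteration concrete, the following is a minimal Python sketch of a rotation-mode circular CORDIC that evaluates sin() and cos() with shift-and-add integer arithmetic. It is not the LibARITH code or the co-processor's datapath. The Q2.30 number format, the function names (`cordic_sin_cos`, `make_tables`, `core_cycles`), and the input range of [-pi/2, pi/2] are all assumptions chosen for illustration. The helper `core_cycles` only restates the poster's claim that M modules in series process M iterations per clock cycle.

```python
import math

FRAC_BITS = 30            # assumed Q2.30 fixed-point format (hypothetical choice)
ONE = 1 << FRAC_BITS


def to_fixed(x):
    """Convert a float to the assumed fixed-point integer representation."""
    return int(round(x * ONE))


def to_float(v):
    """Convert a fixed-point integer back to a float."""
    return v / ONE


def make_tables(n_iter):
    """Build the angle table atan(2^-i) and the inverse CORDIC gain."""
    angles = [to_fixed(math.atan(2.0 ** -i)) for i in range(n_iter)]
    gain = 1.0
    for i in range(n_iter):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))
    return angles, to_fixed(1.0 / gain)


def cordic_sin_cos(theta, n_iter=32):
    """Rotation-mode circular CORDIC for theta in [-pi/2, pi/2] radians.

    Python integers do not overflow, so this sketch does not model
    32-bit wrap-around or saturation.
    """
    angles, inv_gain = make_tables(n_iter)
    # Starting at (1/K, 0) pre-compensates the CORDIC gain K.
    x, y, z = inv_gain, 0, to_fixed(theta)
    for i in range(n_iter):
        # >> on negative Python ints is an arithmetic shift.
        dx, dy = x >> i, y >> i
        if z >= 0:
            x, y, z = x - dy, y + dx, z - angles[i]
        else:
            x, y, z = x + dy, y - dx, z + angles[i]
    return to_float(y), to_float(x)   # (sin, cos)


def core_cycles(n_iter, m_modules):
    """Core-iteration cycles if M modules perform M iterations per cycle."""
    return -(-n_iter // m_modules)    # ceiling division
```

For example, `cordic_sin_cos(math.pi / 6)` returns values close to 0.5 and 0.866, and lowering `n_iter` trades accuracy for fewer iterations, which is the trade-off the error plot describes. Under the stated assumption, `core_cycles(32, 4)` gives 8 cycles for 32 iterations.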
