info10

Network of Excellence on High Performance Embedded Architectures and Compilers | appears quarterly | April 2007

Message from the HiPEAC coordinator

Community News: Stamatis Vassiliadis, 1951-2007

HiPEAC Partner: The Netherlands

Message from the project officer

HiPEAC Conference Ghent

In the Spotlight:
- Modifying GCC to enable automatic tuning of optimization heuristic
- MiDataSets: Creating The Conditions For a More Realistic Evaluation of Iterative Optimization

Community News

PhD news

HiPEAC Journal

HiPEAC 2008 Conference

Upcoming events

www.HiPEAC.net

HiPEAC 2008, Göteborg, Sweden. Deadline Call for Papers: June 10, 2007

Message from the HiPEAC coordinator

Mateo Valero, Coordinator, UPC Barcelona ([email protected])

Dear colleagues,

This has been a hectic winter for HiPEAC. Apart from our ongoing research activity structured within the clusters, HiPEAC carried out its second Conference, published its first HiPEAC Roadmap, opened two new calls to offer new opportunities to our members and, of course, more events are coming up.

The HiPEAC Winter Conference (January 28-30, 2007, Ghent, Belgium) welcomed more than 200 participants. 19 papers were accepted and 6 parallel workshops were organized, with 150 participants. The Conference proceedings were published by Springer LNCS and the best papers will be gathered in the Transactions on HiPEAC.

After its second edition, the HiPEAC Conference begins to consolidate as a major event for our community. The call for papers for the HiPEAC 2008 Conference, scheduled for January 27-29 in Göteborg, Sweden, has already been launched.

In January, HiPEAC published the HiPEAC Roadmap on high-performance embedded architecture and compilation. The objective was to point far in advance at the most fruitful and high-impact research directions relevant to the needs of the European industry. A total of 55 key challenges, organized in 10 themes, are listed. The roadmap process has been a real community effort. The result is a straight-to-the-point paper, available on our website, which will periodically be updated. I encourage you to read it carefully and to send your feedback, which as always will be much appreciated. I wish to thank again all HiPEAC members and people from outside our community who contributed to this document.

The regular cluster meeting took place right after the Conference in Ghent. A new cluster call closed on February 28. Forty proposals were received, which must be evaluated by April.

HiPEAC keeps on fostering industry-academia relationships on high-performance embedded systems, by means of internships and industrial workshops.

The second internship call, closing in March, will fund several company internships. This mechanism will allow PhD students to directly target company research groups on HiPEAC topics. HiPEAC member companies (ARM, IBM, Infineon, NXP and ST) have produced a list of the research topics for which they are seeking interns. Visit our website if you want to be updated about this and other enticing opportunities!

In April, a new HiPEAC Industrial Workshop took place in Haifa, Israel, in parallel with a general cluster meeting. A call for papers opened at the end of January. Papers were selected by industry researchers.

Finally, HiPEAC has already announced ACACES 2007, our Third International Summer School on Advanced Computer Architecture and Compilation, again in L'Aquila, Italy.

We are so enthusiastic about our achievements at HiPEAC and about the challenges we still have to meet, that we want it to be continued. The European Commission FP7 programme may give us again the chance of running an improved network from 2008 on. Thus, we presented our new plans at the InfoDay in Brussels, on March 7. That may be our future adventure. Our name? HiPEAC, too!

My only regret is not to be able to share this new path with my dear friend Stamatis. I have plenty of good memories of our meetings and dinners during these few years we worked together, and I can say he continuously surprised me. A very industrious and clever guy, he loved his job and would never stop working, but liked to combine business and pleasure, and he certainly achieved it, helping everybody to have a good time. He made of the yearly Samos Conference an opportunity for many students to enjoy his beloved Mediterranean sea. I personally cannot think of any occasion in which we were not joking while working. He liked people and people liked him. Sometimes he switched from a kind of "enfant terrible" attitude to the most serious, collaborative one in a few seconds. I always suspected it was his particular way of dealing with long, tedious administrative issues that usually accompany large projects like HiPEAC or SARC. Stamatis was for many of our colleagues the "Happy Warrior" in our field. He was a most valuable colleague and friend, a very optimistic, positive kind of person who showed great courage until the end. Stamatis, you know that I will always carry you in my heart.

Mateo Valero
HiPEAC Coordinator

Community News

Stamatis Vassiliadis, 1951-2007

On Saturday, April 7, 2007, after a long illness, Stamatis Vassiliadis peacefully passed away, surrounded by his family. The HiPEAC network lost one of its best members and its ultimate discussion partner. His family lost a loving father and a devoted husband, the world of computing a great scientist, and our world at large a very kind and generous person.

Stamatis was born in 1951 in the village of Manolates, on the island of Samos, in Greece. After his studies at Politecnico di Milano, Stamatis moved to the USA and worked for IBM at the Advanced Workstations and Systems laboratory in Austin, Texas, the Mid-Hudson Valley laboratory in Poughkeepsie, New York, and the Glendale laboratory in Endicott, New York. At IBM, he was involved in a number of projects regarding computer design, organizations, and architectures and in the leadership of advanced research projects. In this period he was awarded 73 USA patents, ranking him as the top all-time IBM inventor. For his work he received numerous awards including 24 Publication Awards, 15 Invention Achievement Awards and an Outstanding Innovation Award for Engineering/Scientific Hardware Design in 1989. While working for IBM, Stamatis also served in the ECE faculties of Cornell University, Ithaca, NY and State University of New York (S.U.N.Y.), Binghamton, NY. In 1995 he accepted a position at TU Delft and moved back to Europe. He made the Computer Engineering laboratory one of the strongest groups in the field, with more than 50 PhD students from many different countries. Stamatis was an IEEE and ACM fellow and a member of the Royal Dutch Academy of Science (KNAW).

Stamatis was very proud of his island Samos, a small piece of Greek land that produced many great scientists. He loved this island very deeply and returned there every summer. In addition, he made of the Samos conference an opportunity to enjoy his beloved Mediterranean sea with so many students and colleagues, who will always remember it as a great experience, so different from any other event.

Stamatis always inspired all of his friends and students with his enthusiasm, momentum, courage, audacity, warmth, friendliness, assistance and total support. Being a warm-hearted person, he always inspired friendly feelings in all those around him.

Stamatis was one of the earliest pioneers of HiPEAC, always trying to strengthen our community in Europe. We mourn his death, but are determined to continue the work on building the best scientific and engineering community on computer architecture and compilers in Europe, one of Stamatis' dreams.

Stamatis, we will always remember you.

Stamatis Vassiliadis, 1951-2007

Integrity was his compass
Science his instrument
Advancement of humanity his final goal

HiPEAC Partner: The Netherlands

Scientists from the Netherlands continuously made significant contributions to the field of computing throughout the years. Edsger Dijkstra, Gerrit Blaauw and Willem van der Poel are three names that do not need any additional introduction. One very interesting fact is that the developments in this field were mainly driven by the needs of industrial development and not by military demands.

The real boom in computing in the Netherlands happened in the 1950s. There were three centers for building computers: the Dr. Neher Laboratory at the Post Telephone and Telegraph (PTT) in Rotterdam (now KPN research), the Mathematical Centre (MC) in Amsterdam (nowadays CWI, Centrum voor Wiskunde en Informatica / Center for Mathematics and Computer Science) and the Natuurkundig laboratorium (NATLAB) in Eindhoven (now Philips and NXP research).

At the Dr. Neher Lab, W. van der Poel and his team created many successful computers in close collaboration with TU Delft, where he built the Testudo (Turtle) in 1952 from parts donated by PTT. The PTT Testudo was used by TNO between 1952 and 1964, a lifetime as long as that of the animal whose name it borrows. Van der Poel's three machines, ZERO (1953), PTERA (1953) and ZEBRA (1956), introduced many novel concepts. For example, the ZEBRA (see the photo showing the computer and its inventor) was using microprogramming long before the term was even introduced. Willem van der Poel actually built his first relay-based computer already in 1944, as a student, in his hobby room.

The MC (see the photo) was founded in 1946 to promote pure and applied mathematics. The rekenafdeling (the calculation division) had two main tasks in 1947: performing advanced calculations and building a novel computing machine. The team involving Blaauw (also co-architect of the IBM 360), Scholten and Loopstra developed ARRA I (1952), ARRA II (1954) and ARMAC (1956). Those machines used proven methods and techniques and were designed with maintainability and high productivity in mind. Many standard replaceable units were used (probably a contribution of G. Blaauw). The programmer of those machines was E. Dijkstra.

At NATLAB in Eindhoven three different machines were built: PETER (1956), STEVIN (1960) and PASCAL (1960). They all were for internal Philips use only and were designed mainly to gain some experience with the circuits Philips was delivering to IBM.

Being very pragmatic, the programmers in the Netherlands in the 1950s were also very confident in their models. Two striking examples of that time are the following. In Delft, wind tunnel forces were calculated using a technique later known as the finite element method. As there was not enough processing power to calculate the whole structure, only small parts of it were computed. The scientists who did this job walked along the real structure under test when pressure above 3.5 bar was applied to it for the first time (at the same time the builders took cover behind a concrete wall). Programmers at MC modeled the water movement produced when a big ship was let into the water for the first time and stood very close on the pier watching the real situation, confident of keeping their feet dry. The programmers in the Netherlands in those days were living in the real world and not in any of the virtual ones, which are very popular now.

The first programmer in the Netherlands, Edsger Wybe Dijkstra, is considered one of the most influential members of computing science's founding generation. His scientific contributions are fundamental in the domains of algorithm design, programming languages, program design, operating systems, distributed processing, formal specification and verification, and design of mathematical arguments. During his forty-plus years as a computer scientist, in both academia and industry, Dijkstra's contributions brought him many prizes and awards, including computing science's highest honor, the ACM Turing Award, in 1972 (the "Nobel prize" of computer science).

Many national and international companies are located in the Netherlands. The two companies most relevant to HiPEAC topics, NXP and ACE, are members and active contributors to the development of high performance embedded architectures and compilers in the Netherlands and Europe.


The Computer Engineering (CE) laboratory, Microelectronics and Computer Engineering department at TU Delft, performs research and teaches the engineering discipline of how to determine, develop, and integrate software and hardware to build a computing system. The laboratory focuses on the definition of system requirements, from embedded to general purpose, their architecture and implementations, and the study and development of tools and software that allow improving the analysis and synthesis of computing systems. More precisely, the laboratory is actively involved in: computer architecture, machine organizations, and network processing, mapping of application and algorithm requirements to architectures of embedded systems (e.g. multimedia), compiler technology capable of directing system requirements to architectural definitions and improve implementations, architectural synthesis tools for semi-automatic implementation of architectures, computer arithmetic and logic design, algorithms and tools for testing memories, built in self-test of logic circuits, and automatic test pattern generation for combinational and sequential logic circuits, performance modelling and optimisation techniques and tools. CE is one of the founding HiPEAC partners and is a steering committee member. Currently 50 PhD students from many different countries are pursuing their research in various fields of Computer Engineering at CE. CE laboratory is the project coordinator of the Scalable ARChitecture (SARC) FP6 Integrated Project (contract number 027648) and the scientific coordinator of hArtes IP (035143).

People: Stamatis Vassiliadis, Georgi Gaydadjiev, Koen Bertels, Mladen Berekovic, Ben Juurlink and Sorin Cotofana
URL: http://ce.et.tudelft.nl

The Computer Systems Architecture (CSA) group from the University of Amsterdam performs research in the fields of scalable, instruction-level parallel architectures as well as system-level computer architecture modelling and simulation. In our MicroGrid project, we are developing a novel approach to micro-architecture that supports massive on-chip concurrency, which is scalable, flexible and amenable to analysis. It has the potential to provide for the management of on-chip resources (processors etc.) so as to autonomously configure a system for performance, power dissipation or fault tolerance. To this end, we have introduced the concept of microthreads. Microthreading is an execution model that breaks code down into fragments that can execute simultaneously, on a single microthreaded microprocessor or distributed over different processors. Within the Aether project, we extend the above concepts to develop computer systems that support self-adaptation in their software, architecture and implementation.

In addition, our group also investigates more generic methods and techniques for system-level design and analysis of (future) system-on-chip based computer architectures. This work focuses on architectural design space exploration (DSE) during the very early stages of design, where design decisions have great impact on (the success of) the final product. To this end, we study both analytical modeling methods as well as simulation methods for system-level (performance) analysis. These methods and techniques are incorporated in our Sesame simulation framework for system-level DSE.

People: Chris Jesshope, Andy Pimentel and Peter Knijnenburg
URL: http://www.science.uva.nl/research/csa/

NXP is a new independent semiconductor company (founded by Philips) with a fifty-year history of providing engineers and designers with semiconductors and software that deliver better sensory experiences for mobile communications, consumer electronics, security applications, contactless payment and connectivity, and in-car entertainment and networking.

Building on its heritage in consumer research, significant R&D investment and world-class industry partners, NXP's "vibrant media technologies" allow consumers to enjoy better sensory experiences: brilliant images, crisp clear sound and easy sharing of information in homes, cars and mobile devices.

NXP was established in 2006 (from some former Philips divisions) and inherits more than 50 years of experience in semiconductors. The company headquarters are in Eindhoven, the Netherlands. Net sales in 2005 amounted to €4.77 billion. The company employs approximately 37,000 people in more than 20 countries and has more than 24 R&D centers world-wide. NXP has 10 wafer fabs and 8 test and assembly sites spread across different countries. Company customers include Apple, Bosch, Dell, Ericsson, Flextronics, FoxConn, Nokia, Philips, Samsung, Siemens and Sony.

NXP holds worldwide number one positions in: 5-V CMOS logic products for the automotive industry, Automotive In-Vehicle Networks, Car radio Digital Signal Processors, Contactless identification for e-passports, Digital cordless chips, FM radio ICs for mobile, GSM/GPRS/EDGE system solutions, Interface products, Mobile speaker systems, Near Field Communication, PC TV chips, RF products for CATV and satellite tuners, RFID for electronic ticketing in public transport, System solutions for automotive immobilizers and keyless entry/go, TV chips and USB.

NXP's current focus markets are: Mobile & Personal, Home, Identification, Automotive, Multimarket Semiconductors and Software.

People: Marc Duranton, Jan Hoogerbrugge
URL: http://www.nxp.com


The Parallel and Distributed Systems group (PDS) in the Department of Software Technology of Delft University of Technology (TU Delft) focuses on the following research topics: grid computing, parallel programming models, peer-to-peer systems, and sensor networks.

The research in grid computing focuses on the problem of scheduling in multicluster systems and grids. The subsystems making up a grid are to a large extent autonomous, since grids are often heterogeneous and also may exhibit many failures. Therefore, scheduling in grids is highly non-trivial. In addition, when jobs only employ resources in a single grid subsystem, grid schedulers are not much more than load-balancing devices. For this reason, we focus on the problem of co-allocation, i.e., on allocating resources (processors) in multiple clusters or subsystems to single jobs. We study grid scheduling by actually designing, implementing, and deploying a working grid scheduler (called KOALA) in the Dutch National Research Grid system (the DAS).

High performance computing has as its focus research in parallel languages and programming environments, more specifically in languages and compilation techniques for distributed memory architectures such as multi-core systems. We focus on HPC-extensions to Java (SPAR) and compilers that semi-automatically generate code for distributed memory systems. Another area of focus is stream processing, which stems from consumer electronics applications and scientific applications where streams are generated by sensors (e.g. radio telescopes).

The research in peer-to-peer networks focuses on adding social features such as friends and taste buddies, and adding support for TV distribution (both live and recorded) to BitTorrent, which is one of the most popular p2p systems. Among the most important features we concentrate on are IP support for the notion of friends, an efficient gossip-based algorithm for doing recommendations for content, support for improved download performance and streaming videos across p2p systems, and decentralization of a number of mechanisms.

In the wireless sensor networks area, we focus on the development of new protocols and algorithms for the efficient management of the resource- and energy-limited sensor nodes making up such networks. Prototype deployments have to establish the feasibility of such application domains.

People: Henk Sips
URL: http://www.pds.ewi.tudelft.nl
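To make the co-allocation idea from the PDS grid scheduling research more concrete, here is a minimal sketch of how one job might be given processors from several clusters at once. It is only an illustration under simple assumptions: the function name `co_allocate`, the cluster names and the "largest cluster first" rule are hypothetical and do not describe the policies actually used by KOALA.

```python
def co_allocate(free, needed):
    """Spread one job over several clusters (illustrative policy only).

    free   -- dict mapping a cluster name to its number of idle processors
    needed -- total number of processors the job asks for
    Returns a dict {cluster: processors}, or None if the grid as a whole
    does not have enough idle processors (all-or-nothing placement).
    """
    if needed <= 0:
        return {}
    if sum(free.values()) < needed:
        return None

    allocation = {}
    remaining = needed
    # Visit the clusters with the most idle processors first, so that the
    # job ends up on as few clusters as possible; ties are broken by name.
    for cluster, idle in sorted(free.items(), key=lambda kv: (-kv[1], kv[0])):
        if remaining == 0:
            break
        take = min(idle, remaining)
        if take > 0:
            allocation[cluster] = take
            remaining -= take
    return allocation


# Hypothetical grid with three clusters and a job that fits in no single one.
grid = {"a": 4, "b": 8, "c": 2}
print(co_allocate(grid, 10))   # {'b': 8, 'a': 2}
print(co_allocate(grid, 20))   # None
```

A job asking for 10 processors cannot run in any single cluster of this made-up grid, which is exactly the situation in which a scheduler has to do more than load balancing.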

The mission of the section Electronic Systems at Eindhoven University of Technology (TU/e) is to provide a scientific basis for design trajectories of digital electronic circuits and systems 'from (generalized) algorithm to realization'. To identify the key problems, and verify the validity, robustness and completeness of our results, we develop, implement and maintain consistent and complete flows, and use them for realizing innovative multimedia hardware with emphasis on video processing and embedded architectures.

Implied in the mission statement is the question of how to convert the "art" of designing electronic systems into methodology. This is an absolute necessity because:
- the complexity of modern integrated circuits continues to increase,
- new physical phenomena at submicron feature dimensions are having more and more impact, not only on performance, but even on the functionality,
- and the heavy demand pull from signal processing applications, in particular multimedia and telecommunications, requires rigorous and robust answers.

Algorithms play a key role here, and with a dual nature:
- they still form the basis for effectively using a computer in design assistance, so in the first instance we want to support or develop algorithms for synthesis and verification of complex integrated systems where we do not stop at the level of point-wise solutions to specific problems, but integrate them into complete design environments: this is the methodological challenge.
- they are the core of signal processing, and since video processing is our major application area, we aim at discovering effective algorithms to treat video signals in multimedia systems.

We like to emphasize our generalized view on algorithms and more specifically our view towards computational networks, that is, graphs with computations at the nodes and transfer protocols on the arcs. Of course, groups at other universities and industries are facing these challenges, and we gladly adopt their results and tools to come to full trajectories and innovative processors. Our contributions come from tackling the fundamental problems and filling the essential gaps, revealed by careful analysis of the methodological scene, using our insight in video processing and our experience with design environments.

The chair's expertise is firmly rooted in industrial research of video processing, multiprocessor architectures and design automation for deep submicron VLSI, and complemented with a solid theoretical basis in combinatorics and process algebra. This makes a distinctively design-oriented group, which aims to push ideas so far that they are implemented and used in design programs and in major applications.

People: Ralph Otten, Henk Corporaal, Twan Basten, Bart Mesman, Jeroen Voeten
URL: http://www.es.ele.tue.nl/

Message from the project officer

Mercè Griera i Fisa

Computing Systems Call closes on 8 May at 17:00

In the first call of FP7, there are 25 M€ reserved for research on Computing Systems. Proposals in this area must address one of the following domains:
• Novel architectures, the corresponding system-level software and programming environments for on-chip multi-core computing systems.
• Reference designs/architectures for generic embedded platforms cutting across application domains, accompanied by appropriate tools and component libraries.
There are 5 M€ earmarked for a NoE covering the first domain and the rest is for STREPs.

The challenge in this area is to strengthen Europe's position as a leading supplier of computing systems in order to support the competitiveness of industrial strongholds such as consumer electronics, telecoms or medical systems. As a consequence:
• Appropriate industry participation is to be proposed by all consortia.
• All STREPs are to be built with a systems approach in mind and this is to be reflected in both the consortium composition and the workplan structure.
• Hardware and software are to be considered together in all proposals.

Details on the call are at http://cordis.europa.eu/fp7/ict/

In addition to the Work Programme, I would like to recommend that you read the "guide for applicants" in detail because, this being the first call of FP7, there is a need to understand the new rules and procedures. To facilitate this work we organized an Information Day on the 7th of March in Brussels at the Commission premises. A considerable number of people from the HiPEAC community attended the meeting. You will find the slides of the different presentations at: http://cordis.europa.eu/ist/embedded/infoday-070307.htm

If you have any questions related to the Call content or to the new rules and procedures, do not hesitate to contact me.

Mercè Griera i Fisa (Merce.Griera-i-[email protected])

HiPEAC Conference Ghent, January 28-30, 2007

In the Spotlight

Modifying GCC to enable automatic tuning of optimization heuristic

Current innovations in science and industry demand ever-increasing computing resources while placing strict requirements on system performance, power consumption, size, response, reliability, portability and design time. However, compilers often fail to deliver satisfactory levels of performance on modern processors, due to rapidly evolving hardware, lack/cost of expert resources, fixed and black-box optimization heuristics, simplistic hardware models, inability to fine-tune the application of transformations, and highly dynamic behavior of the system.

Recently, we started developing an Interactive Compilation Interface (ICI) to connect external optimization drivers to the GCC. This interface is meant to facilitate the prototyping and evaluation of iterative optimization, fine-grain customization and design-space exploration strategies. An early design, able to provide non-intrusive feature extraction and meddling with heuristic's decisions, was presented at the SMART'07 workshop. Currently, we are working on a more advanced design, incrementally modifying Tree-SSA to support dynamic pass reordering, structured split of analysis and optimization code, and a component model for passes to enable dynamic linking of external optimization plug-ins. We believe these modifications will simplify the tuning process of new optimization heuristics and will eventually simplify the whole compiler design, so that compiler heuristics will be learned automatically, continuously and transparently, aiding users using statistical and machine learning techniques.

More information will be available at the development site: http://gcc-ici.sourceforge.net

Grigori Fursin, INRIA Futurs, France

MiDataSets: Creating The Conditions For A More Realistic Evaluation of Iterative Optimization

Iterative optimization has become a popular technique to obtain improvements over the default settings in a compiler for performance-critical applications, such as embedded applications. An implicit assumption, however, is that the best configuration found for any arbitrary data set will work well with other data sets that a program uses.

We created 20 different datasets per program for the MiBench benchmark to evaluate this assumption and analyze the behavior of various programs with multiple datasets. This work has been presented at HiPEAC'07. After resolving some copyright issues, we plan to make our datasets publicly available to enable more realistic benchmarking and practical iterative optimization research.

More information will be available at the development site: http://midatasets.sourceforge.net

Grigori Fursin, INRIA Futurs, France
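The assumption questioned by MiDataSets can be illustrated with a few lines of code. The sketch below picks the best compiler configuration on one dataset and then checks how far it is from the best achievable on every other dataset. Everything in it is an assumption made for illustration: the helper names, the dataset labels `d1` and `d2`, and the runtimes, which are invented numbers and not measurements from MiBench or MiDataSets.

```python
def best_config(runtimes, dataset):
    """Configuration with the lowest runtime on a single dataset."""
    return min(runtimes, key=lambda cfg: runtimes[cfg][dataset])


def slowdown_elsewhere(runtimes, tuned_on):
    """How well does the configuration tuned on one dataset do on the others?

    Returns {dataset: ratio}, where ratio is the runtime of the tuned
    configuration divided by the best runtime any configuration reaches
    on that dataset (1.0 means the tuned configuration is also the best).
    """
    chosen = best_config(runtimes, tuned_on)
    return {
        dataset: runtimes[chosen][dataset]
        / min(per_cfg[dataset] for per_cfg in runtimes.values())
        for dataset in runtimes[chosen]
    }


# Invented runtimes in seconds: configuration -> dataset -> runtime.
runtimes = {
    "-O2": {"d1": 10.0, "d2": 12.0},
    "-O3": {"d1": 9.0, "d2": 15.0},
    "-O3 -funroll-loops": {"d1": 8.0, "d2": 16.0},
}

print(best_config(runtimes, "d1"))         # -O3 -funroll-loops
print(slowdown_elsewhere(runtimes, "d1"))  # d1 -> 1.0, d2 -> about 1.33
```

In this invented example the configuration tuned on `d1` is about a third slower than the best choice on `d2`, which is the kind of effect that can only be seen when several datasets per program are available.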

Community news

Jose Duato was awarded the research prize "Premio Rey Jaime I a las Nuevas Tecnologías 2006"
For his discoveries of high international impact regarding traffic optimization in interconnection networks, with special impact in the domain of supercomputing where they have been applied to the BlueGene/L supercomputer.

Four HiPEAC members (UPV, , FORTH ICS and University of Murcia) have joined the HyperTransport consortium
HyperTransport is the system area network communications standard that delivers the highest bandwidth and lowest latency in the market. (see http://www.hypertransport.org/consortium/cons_members.cfm?m=3)

PhD news

Software/Hardware Techniques for Mesh Compression in Computer Graphics

By Paula Novío ([email protected]), Prof. Javier Bruguera and Prof. Montserrat Bóo, University: Santiago de Compostela, Spain
June 1, 2006

The work presented in this dissertation is focused on hardware-oriented compression and decompression algorithms for primitives commonly used in computer graphics applications, such as triangles, points and tetrahedra. Hardware units for decompression are also proposed. The size of the objects employed in computer graphics is continuously growing, reaching billions of primitives. This produces bottlenecks caused by the lack of storage space and limited bandwidth between the CPU and the GPU. Therefore, hardware compression algorithms are currently of great interest.

In particular, we present a compression algorithm for triangle meshes based on concentric strips. This algorithm achieves a high compression ratio. Firstly, the hardware unit for decompression is presented. The algorithm is then extended to tetrahedral meshes. Finally, point meshes are considered, proposing two algorithms for the compression of the geometry, obtaining good compression ratios.

Locality optimization techniques for irregular codes on multiprocessor and multithreading architectures

By Juan C. Pichel ([email protected]), Prof. Dora Blanco, Prof. Jose C. Cabaleiro, University: Santiago de Compostela, Spain
September 9, 2006

In this work, several techniques for optimizing data locality in irregular codes of sparse matrix algebra are proposed. These approaches were applied to different parallel architectures. Sparse matrix algebra codes are present in many engineering problems. Their main characteristics are their poor data locality and the fact that their memory accesses are unpredictable. These issues explain why the memory hierarchy performs poorly for these applications.

The proposed techniques are based on the reordering of the data structures (sparse matrices) that determine the locality of the code under study. These reorderings are guided by a locality model previously developed by our group. In this model four distance functions are established. These functions are then evaluated between pairs of rows (or columns) of the matrix and measure the locality in the irregular accesses that these rows (or columns) address. The technique is general enough since, as is shown in this work, it can be successfully applied to any sparse matrix (without limitation in its pattern) on different multiprocessor systems (both shared memory and distributed memory) and multithreaded architectures.
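The idea of reordering rows guided by a distance between pairs of rows can be sketched as follows. This is only an illustration: the distance used here (the number of column indices that two rows do not share) and the greedy nearest-neighbour ordering are assumptions chosen for brevity. They are not the four distance functions or the locality model of the thesis, which are not described in this abstract.

```python
def row_distance(cols_a, cols_b):
    """Illustrative distance: column indices touched by only one of the rows.

    Rows that access the same columns reuse the same vector entries, so a
    small distance stands for good locality between two consecutive rows.
    """
    return len(set(cols_a) ^ set(cols_b))


def greedy_row_order(rows):
    """Order rows so that each row is followed by its closest unplaced row.

    rows -- list of lists, rows[i] holding the column indices of the
            nonzeros of sparse row i
    Returns a permutation of the row indices, starting from row 0.
    """
    if not rows:
        return []
    order = [0]
    remaining = set(range(1, len(rows)))
    while remaining:
        last = rows[order[-1]]
        # Ties are broken by the lower row index to keep the result stable.
        nxt = min(remaining, key=lambda r: (row_distance(last, rows[r]), r))
        order.append(nxt)
        remaining.remove(nxt)
    return order


# Made-up sparsity pattern: rows 0 and 2 share column 1, rows 1 and 3 column 6.
pattern = [[0, 1], [5, 6], [1, 2], [6, 7]]
print(greedy_row_order(pattern))   # [0, 2, 1, 3]
```

After the reordering, rows that touch neighbouring columns are processed one after another, which is the effect a locality-driven permutation aims for in a sparse matrix-vector product.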

Request-Grant Scheduling for Congestion Elimination in Multistage Networks

By Nikolaos Chrysos ([email protected]), Prof. Manolis Katevenis, University of Crete and FORTH
December 2006

Interconnection networks suffer from congestion any time multiple inputs, unaware of each other's decisions, inject into the network traffic in excess of the capacity of some output or link; this results in delays not only for the responsible packets, but for other unrelated flows as well. This thesis introduces a request-grant scheduling scheme that allows packet injection only after reserving all necessary buffer space, thus eliminating head-of-line (HOL) blocking. The scheme works as a hybrid between traditional scheduling of bufferless fabrics, which is too complex, and providing per-flow queues, which requires too much buffer space. We have tested the new architecture by simulating a specific design that sustains robust operation under any number of congested outputs in a 1024-port, 10 Tb/s, three-stage Clos/Benes network, built using just 96 buffered crossbar chips and 1 control chip.
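The principle of "inject only after buffer space has been reserved" can be shown with a very small model. The sketch below is a simplification made for illustration, assuming a single stage with one buffer per output; the class and method names are hypothetical, and it does not reproduce the three-stage architecture or the scheduler evaluated in the thesis.

```python
from collections import deque


class RequestGrantScheduler:
    """Toy request-grant scheduler: one buffer of fixed size per output."""

    def __init__(self, outputs, buffer_slots):
        # Free buffer slots and queued (not yet granted) requests per output.
        self.free = {out: buffer_slots for out in outputs}
        self.pending = {out: deque() for out in outputs}

    def request(self, source, output):
        """Ask to inject one packet for `output`.

        Returns True if a buffer slot was reserved and the packet may be
        injected now; otherwise the request waits and False is returned.
        """
        if self.free[output] > 0:
            self.free[output] -= 1
            return True
        self.pending[output].append(source)
        return False

    def release(self, output):
        """A packet left the buffer of `output`.

        The freed slot goes to the oldest waiting request, whose source is
        returned; if nobody is waiting the slot becomes free again.
        """
        if self.pending[output]:
            return self.pending[output].popleft()
        self.free[output] += 1
        return None


sched = RequestGrantScheduler(outputs=["out0"], buffer_slots=2)
print(sched.request("in_a", "out0"))   # True
print(sched.request("in_b", "out0"))   # True
print(sched.request("in_c", "out0"))   # False, the buffer is fully reserved
print(sched.release("out0"))           # in_c
```

Because a packet enters the fabric only once its slot is reserved, traffic for a congested output waits at the inputs instead of filling shared buffers and delaying unrelated flows.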

On-chip traffic statistical analysis

By Antoine Scherrer ([email protected]), Dr. Tanguy Risset and Dr. Antoine Fraboulet, Inria Compsys, December 11, 2006

This PhD deals with the analysis and synthesis of on-chip traffic, i.e. the traffic occurring in a system-on-chip (SoC). In these systems, the introduction of networks on chip (NoC) has brought up the interconnection system as a major issue of the design flow. In order to prototype these NoC rapidly, fast simulations need to be done, and replacing the components by traffic generators is a good way to achieve this purpose. We have set up and developed a complete and flexible on-chip traffic generation environment that is able to replay a previously recorded trace, to generate a random load on the network, to produce a stochastic traffic fitted to a reference trace and to take into account traffic phases.

Optimization of a parallel 3D simulator applied to the study of intrinsic parameters on HEMT devices

By Natalia Seoane ([email protected]), Prof. Antonio Garcia Loureiro, University: Santiago de Compostela, Spain, January 9, 2007

Three-dimensional numerical simulation of semiconductor devices is extremely demanding in terms of computational time because it involves complex numerical schemes. The large amount of memory and floating point operations needed necessitate the use of parallel machines and appropriate algorithms in order to obtain the maximum performance and reduce simulation time. The simulation study of the intrinsic parameter fluctuations in nanometer devices requires full scale 3D device simulations on a statistical scale and this is computationally expensive. Therefore, the simulation technique used to study intrinsic parameter fluctuations has to be fast and efficient, allowing the simulation of a large ensemble of devices in a relatively short period of time.

A 3D parallel device simulator, based on the drift-diffusion (D-D) approach to the semiconductor transport, was implemented to study HEMT (High Electron Mobility Transistors). This approach constitutes a system of coupled, nonlinear partial differential equations that have been discretized using finite element methods. Domain decomposition methods, implemented by the PSPARSLIB library, were used to solve the linear systems arising from the linearisation of these equations. The 3D simulator has been developed for multicomputers using a Multiple Instruction Multiple Data strategy (MIMD) under the Single Program Multiple Data paradigm (SPMD) and the Message Passing Interface (MPI) standard library.

Clustered VLIW Architectures: a Quantitative Approach

By Andrei Terechko ([email protected]), Prof. H. Corporaal, Prof. R.H.J.M. Otten, Dr. P. Stravers, Technical University of Eindhoven, The Netherlands, February 6, 2007

Achieving the best characteristics from clustering a VLIW processor requires a thorough selection of an Inter-Cluster Communication (ICC) model, which is the way clustering is exposed to the ISA. Our VLSI layouts and instruction scheduling show that performance of the popular copy operation model is severely limited by copies hampering scheduling of regular operations. The dedicated issue slots model deals with this limitation by dedicating extra issue slots for ICC, reaching at most a 1.74 speedup relative to the unicluster. Lowering the area and energy consumption by 55% and 57% relative to the unicluster, respectively, is achieved by the extended operands model.

Site: http://www.terechko.net/cgi-bin/moin.cgi/PhD_thesis

Processor Architecture Design for Smart Cameras

By Hamed Fatemi ([email protected]), Prof. Henk Corporaal, Prof. Twan Basten and Dr. Bart Mesman, Technische Universiteit Eindhoven, March 2007

In many networked embedded systems, sensing with cameras is combined with processing to achieve certain communication, measurement or control goals. The advent and subsequent popularity of low cost, low power CMOS vision sensors enable us to integrate processing logic (in a single package or board) on the camera, thereby creating the so-called smart sensors. They have a SIMD data processing array driven by a controller. On top of this, a separate powerful instruction level parallelism (ILP) or general-purpose processor (GPP) is usually needed in embedded applications for feature and object processing and control tasks. Integrating all this functionality (possibly in a single package) will have a positive effect on the cost, power consumption, latency and inter-processor bandwidth. The result is a low-cost Smart Camera (so-called SmartCam) solution. In this thesis, we investigate new opportunities and contribute to a better and more quantitatively guided design trajectory for an efficient SmartCam template by considering constraints such as power, performance, and cost.

Contributions to the design of reliable and programmable, high-performance systems: principles, interfaces, algorithms and tools

By Prof. Albert Cohen, INRIA, France, March 23, 2007

Moore's law on semiconductors is coming to an end. Scaling the von Neumann architecture over the 40 years of the microprocessor has led to unsustainable circuit complexity, very low compute-density, and high power consumption. On the other hand, parallel computing practices are nowhere close to the portability, accessibility, productivity and reliability levels of single-threaded software engineering. This dangerous gap translates into exciting challenges for compilation and programming language research in high-performance, general purpose and embedded computing. This habilitation thesis motivates our approach to these challenges, presents our main directions and results, and draws some research perspectives.

HiPEAC Journal

Transactions on High-performance Embedded Architectures and Compilers

The first issue of Volume 2 contains the following papers:

Introduction to Part 1 by Per Stenström and David Whalley
G. Keramidas, P. Xekalakis, S. Kaxiras, Recruiting Decay for Dynamic Power Reduction in Set-Associative Caches.
V. Nagarajan, R. Gupta, A. Krishnaswamy, Compiler-Assisted Memory Encryption for Embedded Processors.
S. Kluyskens, L. Eeckhout, Branch Predictor Warmup for Sampled Simulation through Branch History Matching.
M. Bhadauria, S. A. McKee, K. Singh, G.S. Tyson, Data Cache Techniques to Save Power and Deliver High Performance in Embedded Systems.
C. Hu, D.A. Jiménez, U. Kremer, Combining Edge Vector and Event Counter for Time-dependent Power Behavior Characterization.

HiPEAC 2008 Conference

The HiPEAC 2008 conference will take place in Göteborg on the west coast of Sweden on January 27-29, 2008. Göteborg is the second largest city in Sweden and it is an important academic as well as industrial center hosting Chalmers University of Technology and Göteborg University as well as Volvo, SKF, and Ericsson.

The general co-chairs of the conference are Per Stenström (Chalmers) and Michel Dubois (University of Southern California). The program co-chairs are Manolis Katevenis (University of Crete/FORTH) and Rajiv Gupta (University of Arizona). Deadline for paper submissions is June 10, 2007.

For more information: http://www.hipeac.net/conference

Topics of interest:
• Processor architectures
• Memory system optimization
• Power, performance and implementation efficient designs
• Interconnection networks, networks-on-chip, network interfaces and processors
• Security, dependability, and predictability support
• Application specific processors and accelerators
• Reconfigurable architectures
• Simulation and methodology
• Compiler techniques for embedded processors
• Feedback-directed optimization
• Program characterization and analysis techniques
• Dynamic compilation, adaptive execution, and continuous profiling/optimization
• Back-end code generation
• Binary translation/optimization
• Code size/memory footprint optimizations

Upcoming events

ISPASS-2007, 2007 IEEE International Symposium on Performance Analysis of Systems and Software, San Jose, California, USA, April 25-27, 2007, http://ispass.org/

ACM International Conference on Computing Frontiers, Ischia, Italy, May 7-9, 2007, http://www.computingfrontiers.org/

DAC'44: 44th Design Automation Conference, San Diego, California, June 4-8, 2007, http://www.dac.com/44th/index.html

ISCA: The 34th International Symposium on Computer Architecture, San Diego, CA, USA, June 9-13, 2007, http://www.cse.ucsd.edu/isca2007/

PLDI 2007, ACM SIGPLAN Conference on Programming Language Design and Implementation, San Diego, CA, USA, June 10-13, 2007, http://ties.ucsd.edu/PLDI/

ICS07: 21st ACM International Conference on Supercomputing, Crowne Plaza Seattle, Seattle, WA, USA, June 16-20, 2007, http://ics07.ac.upc.edu/

7th Int'l Workshop on Worst-Case Execution Time Analysis (WCET'07), Pisa, Italy, July 3, 2007, http://www.irit.fr/wcet2007

SIES'2007: IEEE Second Symposium on Industrial Embedded Systems, Hotel Costa da Caparica, Lisbon, Portugal, July 4-6, 2007, http://www.uninova.pt/sies2007/

ACACES 2007, Third HiPEAC Summer School, L'Aquila, Italy, July 15-20, 2007, http://www.hipeac.net/summerschool

SAMOS VII: International Symposium on Systems, Architectures, MOdeling and Simulation, Samos, Greece, July 16-19, 2007, http://samos.et.tudelft.nl/samos_vii/

Euro-Par 2007, Rennes, France, 28-31 August 2007, http://europar2007.irisa.fr/

HiPEAC 2008 Conference, Göteborg, Sweden, 27-29 January 2008, http://www.hipeac.net/conference

Contributions

If you are a HiPEAC member and you want to contribute to this newsletter, please contact Thomas Van Parys at [email protected]

HiPEAC Info is a quarterly newsletter published by the HiPEAC network of excellence. Funded by the 6th European Framework Programme (FP6), under contract no. IST-004408.
Website: http://www.HiPEAC.net
Subscriptions: http://www.HiPEAC.net/newsletter