Making Nested Parallel Transactions Practical Using Lightweight Hardware Support


Woongki Baek, Nathan Bronson, Christos Kozyrakis, Kunle Olukotun
Computer Systems Laboratory
Stanford University
{wkbaek,nbronson,kozyraki,kunle}@stanford.edu

ABSTRACT

Transactional Memory (TM) simplifies parallel programming by supporting parallel tasks that execute in an atomic and isolated way. To achieve the best possible performance, TM must support the nested parallelism available in real-world applications and supported by popular programming models. A few recent papers have proposed support for nested parallelism in software TM (STM) and hardware TM (HTM). However, the proposed designs are still impractical, as they either introduce excessive runtime overheads or require complex hardware structures.

This paper presents filter-accelerated, nested TM (FaNTM). We extend a hybrid TM based on hardware signatures to provide practical support for nested parallel transactions. In the FaNTM design, hardware filters provide continuous and nesting-aware conflict detection, which effectively eliminates the excessive overheads of software nested transactions. In contrast to a full HTM approach, FaNTM simplifies hardware by decoupling nested parallel transactions from caches using hardware filters. We also describe subtle correctness and liveness issues that do not exist in the non-nested baseline TM.

We quantify the performance of FaNTM using STAMP applications and microbenchmarks that use concurrent data structures. First, we demonstrate that the runtime overhead of FaNTM is small (2.3% on average) when applications use only single-level parallelism. Second, we show that the incremental performance overhead of FaNTM is reasonable when the available parallelism is used in deeper nesting levels. We also demonstrate that nested parallel transactions on FaNTM run significantly faster (e.g., 12.4×) than those on a nested STM. Finally, we show how nested parallelism is used to improve the overall performance of a transactional microbenchmark.

Categories and Subject Descriptors

D.1.3 [Programming Techniques]: Concurrent Programming – parallel programming; C.1.4 [Processor Architectures]: Parallel Architectures

General Terms

Algorithms, Design, Performance

Keywords

Transactional Memory, Nested Parallelism, Parallel Programming

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
ICS’10, June 2–4, 2010, Tsukuba, Ibaraki, Japan.
Copyright 2010 ACM 978-1-4503-0018-6/10/06 ...$10.00.

1. INTRODUCTION

Transactional Memory (TM) [11] has been proposed as a promising solution to simplify parallel programming. With TM, programmers can simply declare parallel tasks as transactions that appear to execute in an atomic and isolated way. TM manages all concurrency control among concurrent transactions. A large number of TM implementations have been proposed based on hardware [9, 13], software [8, 10, 17], and hybrid [6, 7, 16] techniques.

To date, most TM systems have assumed sequential execution of the code within transactions. However, real-world parallel applications often include nested parallelism in various forms including nested parallel loops, calls to parallel libraries, and recursive function calls [19]. To achieve the best possible performance with the increasing number of cores, it is critical to fully exploit the parallelism available at all levels. Several popular programming models that do not use transactions have already incorporated nested parallelism [1, 18]; TM should be extended to efficiently support the case of nested parallelism.

A few recent papers investigated the semantics of concurrent nesting and proposed prototype implementations in STM [2–4, 15, 21]. While compatible with existing multicore chips, most STM implementations already suffer from excessive runtime overheads of TM barriers even for single-level parallelism [6]. To make the problem worse, supporting nested parallelism solely in software may introduce additional performance overheads due to the use of complicated data structures [2, 4] or the use of an algorithm whose time complexity is proportional to the nesting depth [3]. For example, as shown in our performance evaluation, a single-threaded, transactional version of the red-black tree microbenchmark runs 6.2× slower with single-level transactions and 17.0× slower with nested transactions than a non-transactional, sequential version. Nested parallel transactions in STM will remain impractical unless these performance issues are successfully addressed.

A recent paper investigated how to support nested parallelism in HTM [20]. However, supporting nested parallelism solely in hardware may drastically increase hardware complexity, as it requires intrusive modifications to caches. For instance, apart from the additional transactional metadata bits in tags, the design proposed in [20] requires that caches are capable of maintaining multiple blocks with the same tag but different version IDs, and provide version-combining logic that merges speculative data from multiple ways. Given the current trend in which hardware companies are reluctant to introduce complicated hardware components to implement transactional functionality even for single-level parallelism, this hardware-only approach is unlikely to be adopted.

To address this problem, we propose filter-accelerated, nested transactional memory (FaNTM) that provides practical support for nested parallel transactions using hardware filters. FaNTM extends a baseline hybrid TM (SigTM) [6] to implement nesting-aware conflict detection and data versioning. Since hardware filters provide continuous, nesting-aware conflict detection, FaNTM effectively reduces the excessive runtime overheads of software nested transactions. In contrast to a full HTM approach, FaNTM simplifies hardware by decoupling nested transactions from caches. As a result, FaNTM makes nested parallel transactions practical in terms of both performance and implementation cost.

The specific contributions of this work are:

• We propose FaNTM, a hybrid TM system that supports nested parallel transactions with low overheads. FaNTM provides eager data versioning and conflict detection at cache-line granularity across nested parallel transactions.

• We describe subtle correctness and liveness issues such as a dirty-read problem that do not exist in the non-nested baseline TM. We also propose solutions to address the problems.

• We quantify the performance of FaNTM across multiple use scenarios. First, we demonstrate that the runtime overhead of FaNTM is small when applications use only single-level parallelism. Specifically, FaNTM is slower than the baseline hybrid TM by 2.3% on average when running STAMP applications. Second, we show that the incremental overhead of FaNTM for deeper nesting is reasonable. We also show that nested transactions on FaNTM run significantly faster (e.g., 12.4×) than those on a nested STM. Finally, we demonstrate how FaNTM improves the performance of a transactional microbenchmark using nested parallelism.

The rest of the paper is organized as follows. Section 2 reviews the semantics of concurrent nesting and TM systems. Section 3 presents FaNTM. Section 4 discusses subtle correctness and liveness issues. Section 5 quantifies the performance of FaNTM. Sec-

Field      Description
TID        T's TID
FV         A bit vector that encodes family(T). If a bit is set, the corresponding transaction belongs to family(T)
CTID       The TID of the transaction that conflicted with T
RSig       Read signature
WSig       Write signature
abt        If set, T has a pending abort.
act        If set, this TMB is the active TMB.
nackable   If set, the nackable bit in outgoing memory requests is set.

Table 1: State information stored in each TMB. T denotes the transaction that is mapped on the TMB.

• If T writes to l, it is a conflict if there exists T′ such that T′ ∈ readers(l) ∪ writers(l), T′ ≠ T and T′ ∉ ancestors(T).

If a committing transaction T is not a top-level transaction, its read- and write-sets are merged to its parent. Otherwise (i.e., top-level), the values written by T become visible to other transactions. If any transaction T aborts, all the changes made by T are discarded and previous state is restored [14].

2.2 NesTM

We use NesTM [3] as a proxy for a timestamp-based STM with support for concurrent nesting. While it is an open research issue to formally check the correctness and liveness guarantees of timestamp-based nested STMs, we use NesTM to investigate performance differences between software and hybrid nested TMs. We only provide a brief description and refer to [3] for additional information on NesTM.

NesTM [3] extends an eager variant of TL2 [8] to support concurrent nesting. NesTM uses a global version clock to establish serializability. Each memory word is associated with a version-owner lock that simultaneously encodes the version and owner information. Transactional metadata and barriers are extended to implement nesting-aware conflict detection and data versioning.

Since all the nesting-aware transactional functionality is solely implemented in software, NesTM introduces substantial runtime overheads to nested transactions.