Multi-Label Classification for Automatic Tag Prediction in the Context of Programming Challenges


Bianca Iancu ([email protected]), Gabriele Mazzola ([email protected]), Kyriakos Psarakis ([email protected]), Panagiotis Soilis ([email protected])
Delft University of Technology, Delft, South Holland

arXiv:1911.12224v1 [cs.LG] 27 Nov 2019

ABSTRACT
One of the best ways for developers to test and improve their skills in a fun and challenging way are programming challenges, offered by a plethora of websites. For the inexperienced ones, some of the problems might appear too challenging, requiring some suggestions to implement a solution. On the other hand, tagging problems can be a tedious task for problem creators. In this paper, we focus on automating the task of tagging a programming challenge description using machine and deep learning methods. We observe that the deep learning methods implemented outperform well-known IR approaches such as tf-idf, thus providing a starting point for further research on the task.

1 INTRODUCTION
In recent years, more and more people have started to show interest in competitive programming, either for purely educational purposes or for preparing for job interviews. Following this trend, we can also see an increase in the number of online platforms providing services such as programming challenges or programming tutorials. These problems are based on a limited number of strategies to be employed. Understanding which strategies to apply to which problem is the key to designing and implementing a solution. However, it is often difficult to directly infer from the textual problem statement the strategy to be used in the solution, especially with little or no programming experience. Thus, assisting the programmer by employing a system that can recommend possible strategies could prove very useful, efficient and educational.

To this end, we propose a system to automatically tag a programming problem given its textual description. These tags can be either generic, such as 'Math', or more specific, such as 'Dynamic Programming' or 'Brute Force'. Each problem can have multiple tags associated with it, thus this research is focusing on a multi-class multi-label classification problem. This is a more challenging problem than the usual multi-class classification, where each data point can only have one label attached. An example of a data sample for our problem is shown in Figure 1. A similar problem would be to predict these tags based on the source code of a solution, but this would restrict the application domain. To be more specific, we are interested in applying the system in the context of online programming challenges platforms where no solution or source code is available. Thus, we only consider the textual description of the problem statements as input to our system.

Figure 1: Data sample example

By gathering data from two of the main online programming challenges platforms, Topcoder¹ and CodeForces², we approach the problem through both machine learning and deep learning solutions, experimenting with different architectures and data representations. Considering the complexity of this problem, which is difficult even for humans, our hypothesis is that deep learning methods should be an effective way of approaching this problem, given enough data. Based on the aforementioned, the research question that we are trying to answer is the following: "Could deep learning models learn and understand what are the strategies to be employed in a programming challenge, given only the textual problem statement?"

¹ https://www.topcoder.com/
² https://codeforces.com

The rest of the paper is organized as follows: in Section 2 we describe the related work carried out in the literature, both regarding multi-label classification, as well as text representation methods. Subsequently, in Section 3 we describe the process of gathering, understanding and pre-processing the data, while in Section 4 we present the data representation methods and the models that we employ. Following that, in Section 5 we discuss the experimental setup and present the results, followed by a discussion and reflection regarding those in Section 6. Finally, we conclude our research in Section 7.
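To make the setting concrete, the following is a minimal sketch, not the pipeline used in this paper, of the multi-label formulation described above: problem statements as input, a set of tags per problem as output, with tf-idf features (the IR baseline mentioned in the abstract) and one binary classifier per tag. The statements, tags and model choices below are illustrative assumptions only.

```python
# Minimal multi-label tagging sketch (illustrative only, not the authors' pipeline).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import MultiLabelBinarizer

statements = [
    "Count the number of ways to climb n stairs taking 1 or 2 steps.",
    "Find the shortest path between two cities in a weighted road network.",
    "Given an array, answer queries for the sum of elements in a range.",
]
tags = [
    {"dp", "math"},               # each problem carries a *set* of tags
    {"graphs", "shortest paths"},
    {"data structures"},
]

# Turn tag sets into a binary indicator matrix (one column per distinct tag).
mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(tags)

# tf-idf features over the raw problem statements.
vectorizer = TfidfVectorizer(lowercase=True, stop_words="english")
X = vectorizer.fit_transform(statements)

# One independent binary classifier per tag (Binary Relevance / One-vs-Rest).
clf = OneVsRestClassifier(LogisticRegression(max_iter=1000))
clf.fit(X, Y)

# Predict tags for a new, unseen statement.
new = vectorizer.transform(
    ["Compute the minimum cost path in a grid using memoization."])
print(mlb.inverse_transform(clf.predict(new)))
```

Training one independent binary classifier per tag is the same Binary Relevance idea that reappears in the related work below.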
2 RELATED WORK
Multi-label classification
As far as we are aware, no previous work has been carried out regarding multi-label classification in the context of tagging programming problem statements. Additionally, most of the papers tackling this task employ traditional machine learning methods, rather than deep learning approaches. In [21] the authors propose a multi-label lazy learning approach called ML-kNN, based on the traditional k-nearest neighbor algorithm. Moreover, in [9] the text classification problem is approached, which consists of assigning a text document into one or more topic categories. The paper employed a Bayesian approach to multi-class, multi-label document classification in which the multiple classes of a document were represented by a mixture model. Additionally, in [6] the authors modeled the problem of automated tag suggestion as a multi-label text classification task in the context of the "Tag Recommendation in Social Bookmark Systems". The proposed method was built using Binary Relevance (BR) and a naive Bayes classifier as the base learner, which were then evaluated using the F-measure. Furthermore, a new method based on the nearest-neighbor technique was proposed in [5]. More specifically, the multi-label categorical K-nearest neighbor approach was proposed for classifying risk factors reported in SEC form 10-K. This is an annually filed report published by US companies 90 days after the end of the fiscal year.

A different strategy for approaching the multi-label classification problem proposed in the literature is active learning, examined in [2]. Furthermore, in [18] the authors proposed a multi-label active learning approach for text classification based on applying a Support Vector Machine, followed by a Logistic Regression.

Regarding deep learning approaches, in [11] the authors analyzed the limitations of BP-MLL, a neural network (NN) architecture aiming at minimizing the pairwise ranking error. Additionally, they proposed replacing the ranking loss minimization with the cross-entropy error function and demonstrated that neural networks can efficiently be applied to the multi-label text classification setting. By using simple neural network models and employing techniques such as rectified linear units, dropout and AdaGrad, their work outperformed state-of-the-art approaches for this task. Furthermore, the research carried out in [8] analyzed the task of extreme multi-label text classification (XMTC). This refers to assigning the most relevant labels to documents, where the labels can be chosen from an extremely large label collection that could even reach a size of millions. The authors in [8] presented the first deep learning approach to XMTC using a Convolutional Neural Network. The authors showed that the proposed CNN successfully scaled to the largest datasets used in the experiments, while consistently producing the best or the second-best results on all the datasets.

When it comes to evaluation metrics in the multi-label classification scenario, a widely employed metric in the literature is the Hamming loss [6][13][16][22]. Furthermore, more traditional metrics are also used, such as the F-measure [6][13][16], as well as the average precision [16][21].
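As a small illustration of how these metrics behave in the multi-label setting, the snippet below computes the Hamming loss, a micro-averaged F-measure and the average precision with scikit-learn on made-up tag matrices; the numbers are invented and are not results from this paper.

```python
# Multi-label evaluation metrics on toy data (rows = problems, columns = tags).
import numpy as np
from sklearn.metrics import hamming_loss, f1_score, average_precision_score

y_true = np.array([[1, 0, 1, 0],    # ground-truth tag assignments
                   [0, 1, 0, 0],
                   [1, 1, 0, 1]])
y_pred = np.array([[1, 0, 0, 0],    # hard predictions from some classifier
                   [0, 1, 0, 1],
                   [1, 1, 0, 0]])
y_score = np.array([[0.9, 0.2, 0.4, 0.1],   # per-tag confidence scores
                    [0.3, 0.8, 0.1, 0.6],
                    [0.7, 0.9, 0.2, 0.3]])

# Fraction of individual (problem, tag) decisions that are wrong.
print("Hamming loss:", hamming_loss(y_true, y_pred))

# Micro-averaged F-measure over all (problem, tag) decisions.
print("Micro F1:", f1_score(y_true, y_pred, average="micro"))

# Average precision computed from the ranking induced by the scores.
print("Average precision:", average_precision_score(y_true, y_score, average="micro"))
```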
Text representation
Apart from the aforementioned, a lot of literature work has gone into experimenting with different ways of representing text. According to [17], the word representation that has been traditionally used in the majority of supervised Natural Language Processing (NLP) applications is one-hot encoding. This term refers to the process of encoding each word into a binary vector representation where the dimensions of the vector represent all the unique words included in the corpus vocabulary. While this representation is simple and robust [10], it does not encode any notion of similarity between words. For instance, the word "airplane" is equally different from the word "university" as from the word "student". The proposal of Word2Vec [10] solves this issue by encoding words into a continuous lower-dimensional vector space where words with similar semantics and syntax are grouped close to each other. This led to the proposal of more types of text embeddings, such as Doc2Vec [7], which encodes larger sequences of text, such as paragraphs, sentences, and documents, into a single vector rather than having multiple vectors, i.e. one per word. One of the more recent language models which produced state-of-the-art results in several NLP tasks is the Bidirectional Encoder Representations from Transformers (BERT) [1]. Its main innovation comes from applying bidirectional training of an attention model, the Transformer, to language modeling.

3 DATA
In this section, we describe the steps carried out to obtain the dataset that we have worked with. More specifically, we focus on presenting the data source and the data collection process, followed by an overview of the preprocessing that we perform. Additionally, we show several descriptive statistics regarding the data and explain the steps we take for […]

[…] to increase classification performance on textual data, since they do not provide any additional information and only increase the dimensionality of the problem [15]. Following the collection process, we merge the two crawled datasets and preprocess them. At first, HTML tags (<.*?>) are removed, since they do not provide any descriptive information about the problem. Afterward, since mathematical definitions are in LaTeX format on CodeForces and as plain text on TopCoder, we decide to remove them from both to avoid introducing differences between the two sets of data.
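A rough sketch of this preprocessing step, assuming the (<.*?>) pattern for HTML tags and dollar-sign delimiters for the LaTeX math (the paper does not specify the exact math pattern), could look as follows:

```python
# Illustrative preprocessing sketch (not the authors' exact code).
import re

def preprocess_statement(text: str) -> str:
    text = re.sub(r"<.*?>", " ", text)          # remove HTML tags
    text = re.sub(r"\$\$?.*?\$\$?", " ", text)  # remove $...$ / $$...$$ math (assumed delimiters)
    return re.sub(r"\s+", " ", text).strip()    # collapse leftover whitespace

example = "<p>Given $n \\le 10^5$ integers, count the pairs whose sum is even.</p>"
print(preprocess_statement(example))
# -> "Given integers, count the pairs whose sum is even."
```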