Computational Thinking with the Web Crowd using CodeMapper∗

Patrick Vanvorce, Department of Computer Science, University of Idaho, USA
Hasan M. Jamil, Department of Computer Science, University of Idaho, USA

ABSTRACT
It has been argued that computational thinking should precede computer programming in the course of a career in computing. This argument is the basis for the slogan "logic first, syntax later" and the development of many programming languages that remove cryptic syntax, such as Scratch and Visual Logic. The goal is to focus on the structuring of the semantic relationships among the logical building blocks to yield solutions to computational problems. While this approach is helping novice programmers and early learners, the gap between computational thinking and professional programming using high level languages such as C++, Python and Java is quite wide. It is wide enough for about one third of students in first college computer science classes to drop out or fail. In this paper, we introduce a new programming platform, called CodeMapper, in which learners are able to build computational logic in independent modules and aggregate them to create complex modules. CodeMapper is an abstract development environment in which rapid visual prototyping of small to substantially large systems is possible by combining already developed independent modules in logical steps. The challenge we address involves supporting a visual development environment in which "annotated code snippets" authored by the masses in social computing sites such as SourceForge, StackOverflow or GitHub can also be used as is in prototypes and mapped to real executable programs. CodeMapper thus facilitates a soft transition from visual programming to syntax driven programming without having to practice syntax too heavily.

CCS CONCEPTS
• Theory of computation → Semantics and reasoning; Design and analysis of algorithms; • Human-centered computing → Social content sharing; Collaborative content creation; Blogs;

KEYWORDS
Computational thinking; rapid prototyping; crowd-computing; conceptual programming

ACM Reference format:
Patrick Vanvorce and Hasan M. Jamil. 2019. Computational Thinking with the Web Crowd using CodeMapper. In Proceedings of Conference, Date, Year, Location, 8 pages.
DOI: http://dx.doi.org/DOI

arXiv:1811.04162v1 [cs.CY] 9 Nov 2018

∗Research was supported in part by National Science Foundation Grant DRL 1515550.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Conference, Date, Year, Location.
© 2019 Copyright held by the owner/author(s). Publication rights licensed to ACM. ISBN ISBN...$15.00

1 INTRODUCTION
The differences between computing and computational thinking are significant and can be explained in a number of ways. According to Jeannette Wing [30], "computational thinking confronts the riddle of machine intelligence: What can humans do better than computers, and what can computers do better than humans? Most fundamentally, it addresses the question: What is computable? Today, we know only parts of the answers to such questions." In particular, computational thinking 1) is conceptualizing, not programming, 2) is a way that humans, not computers, think, 3) complements and combines mathematical and engineering thinking, and 4) is ideas, not artifacts. Our MindReader project [11] aims at implementing these ideas in a system for online autonomous learning.

Millions of K-12 students by now are conversant in the graphical language Scratch (developed at MIT). However, the language is incredibly clunky if one wants to go beyond what it is designed to do (make simple games). A derivative language, Snap! (developed at UC Berkeley), expands its horizons, making functions practical, enabling object oriented programming, and fixing other deficiencies in Scratch. Yet, many educators believe that it is a fantasy to expect the masses to transition from Scratch (or Snap!) to what we now think of as conventional programming languages such as C++, Python or Java. A more plausible future is for specialized languages to spring up that make use of the conventions of Scratch/Snap! but incorporate the knowledge and concepts of a field of interest to a target audience. It isn't difficult to see a world where almost everyone speaks Scratch (like almost everyone knows algebra), and when we go into, for example, molecular biology, we naturally adopt Scratch/Genome, or if one wants to go into accounting, she adopts Scratch/Accounting. In other words, a discipline specific toolbox or plug-in will tailor how the programming environment interfaces with its users.

While comparative student-learning behaviors in different STEM (Science, Technology, Engineering and Mathematics) areas are unknown, anecdotally it is believed that biologists usually prefer abstract graphical tools more than physicists or mathematicians, who prefer details. So, it is not surprising that computational physicists insist on a more hands-on computer programming experience than computational biologists, who advocate visual or NLP based programming. We believe that students in different disciplines and from different backgrounds have unique learning traits and learn differently. Thus, contrary to the idea that one language with discipline specific toolboxes or plugins is the way forward, we assemble a constellation of programming environments in MindReader that is discipline agnostic, but learning style specific, and that will allow a learner to move between the environments based on her comfort zone. We believe that by keeping in view what computational thinking is about, and learners' individuality, we are able to design a truly impactful system to support the vision of educational standards such as the "Common Core State Standards Initiative" (CCSS) [2] and the "Next Generation Science Standard" (NGSS) [23].

2 SOCIAL PROGRAMMING USING CROWD-SOURCED MODULES
Programming can be challenging for young programmers and STEM learners for a host of reasons, with abstract thinking, complex syntax, and mapping seemingly simple steps into algorithms being the leading ones. For K-12 and early college STEM learners, a substantial number of online teaching, tutoring and assessment systems are emerging with various degrees of impact. For the most part, these systems and tools have done little to reduce high drop-out rates [27], or to aid learning, and resource strapped institutions continue to struggle to find better ways to train a new-century workforce conversant in computing [26].

One of the critical decisions educators have been grappling with is how much exposure to programming is enough to make students interested and ultimately proficient in professional system building [25], and what languages to choose for this purpose [18]. While computational thinking is not specific about any language, it is imperative that one or a set of programming platforms be chosen, e.g., block based (e.g., Scratch, Snap!, Alice), declarative (e.g., Prolog, Clasp), imperative (e.g., C++, Java, Python), visual (e.g., Visual Logic, Cameleon, Snap!), programming by example (e.g., Foofah, BlinkFill) or natural language (e.g., AppleScript, Metafor, Flip), among many other possible alternatives. While the answers to these questions are still outstanding, most programs use an imperative language as the main platform even if they start off with a language such as Scratch.

It is worth noting that most programming languages capture the basic constructs of imperative programming: assignment and computation, decision, and iteration. The block and frame based languages primarily adopt a logic first, syntax later approach [17]. Block languages aim to emphasize conceptual clarity, ease of coding and simplicity, and thereby deemphasize programming rigor, computational power and other essential features that imperative languages such as C++, Java and Python have, a feature that made BASIC popular in the early age of microcomputers. Thus the adoption of block-based languages as an introductory language is premised upon the assumption that once proficient in computational thinking, learners will transition to a text based and more powerful language such as C++, Java or Python, accepting some transition cost: the cost of moving from a non-text to a text-based programming world [29]. The primary goal of the CodeMapper system is to reduce this transitional cost using a novel approach.

Novice and seasoned programmers alike often need help, and they seek it from social computing platforms such as StackOverflow, SourceForge or GitHub. The origin of the concept of crowd computing [1] as a distinct discipline can be traced back to its informal roots in sites such as these. While crowd computing and debugging [5, 28] opened up new ways for learners and practitioners to improve and sharpen their skills, they do not directly contribute to the learners' own program development processes. In CodeMapper, we aim to leverage the crowd to aid effortless abstract program development to support the goal of reducing the transitional cost from block-based to text-based coding for budding programmers.

2.1 The Main Idea Behind CodeMapper
CodeMapper's visual interface allows learners to stay in their conceptual world, similar to a block-based programming environment, and yet write computer programs in a text-based imperative language such as Java, C++ or Python. In this world, solutions to problems are still a series of logical steps with well defined meanings, and any arbitrary implementation of these steps in any programming language will eventually amount to an executable program.

CodeMapper consists of a database D of concepts C, a concept hierarchy H, a concept to program mapper µ, and a program aggregator α. A concept c ∈ C is an arbitrary code fragment in an imperative language in the concept hierarchy H, which is organized into a generalization-specialization hierarchy DAG, e.g., merge sort and insertion sort are sorts. But merge sort itself can be broken down into the following steps, and thus is an aggregation of three concepts: divide list, merge sort a list, and merge two sorted lists.

(1) Divide the list in half
(2) If the first half has only one element, return element, otherwise merge sort the first half
(3) If the second half has only one element, return element, otherwise merge sort the second half
(4) Merge both halves back together

Each concept c is either harvested from social computing sites such as StackOverflow, SourceForge or GitHub, or is written by learners themselves, and is an independent code segment. Concepts are written in modular forms such as formal procedures or functions, and are not necessarily written keeping variable name or type compatibility in mind. However, they are all annotated to describe their input-output behaviors and the associated variables. Given a set of concepts, and their precedence ordering relationships, the aggregator function α harmonizes the variables using a schema matcher such as S-Match [7]. The code mapper function µ selects an appropriate concept while mapping a set of programming steps such as the merge sort above, and aims to find the most specialized concept possible before assimilating the corresponding code segment in database D to aggregate. The ultimate aggregate, an executable program P, is a schema heterogeneity resolved tree of concepts T, i.e., P = α(µ(T)). The role of the crowd in CodeMapper, however, is the task of placing the concepts in the hierarchy H as a community, and annotating and curating the annotations and the aggregate concepts. In that sense, CodeMapper will be supported by the developer community through crowdsourcing similar to IMPACT [21] and Crowd Debugging [3].

While CodeMapper may have significant similarities with Scratch and Blockly, it is fundamentally different. In CodeMapper, students write or import code in text-based languages, and thus they must be somewhat familiar with the syntax, the degree of which could vary widely, from elementary to well-versed. The programming itself is still visual in conceptual steps. Users of CodeMapper are able to pick concepts from the hierarchy H, and sequence them to describe a solution, and then with the click of a button, the entire program is written for them. By executing and observing the computation, they are able to refine and improve the program to the highest level of perfection possible in the database D, as allowed by the learner's sophistication.

2.2 The MindReader System
In recent years, numerous conversational intelligent tutoring and assessment systems (CITS) have been developed. Numerous meta-analyses of the research findings suggest that the choice of tutoring system does not impact the learning outcome substantially and that the learners' psychological makeup, learning habits and anxieties often negated the technology choice [19], although the use of learning management systems (LMS) always helped. Keeping these in mind, we have developed a fully autonomous online CITS, called MindReader, that includes an array of functions and features to help K-12 and early college STEM learners develop computational thinking. Our system draws upon the combined experiences of the Python Tutor [8] in its role as a program execution visualizer, and feedback generation techniques proposed by Martin et al. [20] to aid comprehension through misconception elimination. Figure 1 shows the MindReader interface giving feedback to a student on her submitted assignment online. MindReader also supports and assembles an array of powerful features under one umbrella to offer a constellation of learning tools for both teachers and students. In particular, it currently supports tutoring and assessment of C++, a visual programming by example language called Patch [12], an automated and conversational instructional agent called vTutor [14], and a social peer support system called OpenSchool [13]. The CodeMapper system introduced in this paper is designed to be an integral part of the MindReader system as well, to help learners transition from block-based languages to C++.

Figure 1: MindReader front-end.

3 BACKGROUND AND RELATED RESEARCH
CodeMapper draws upon experiences of past research in teaching computer science (CS) and web-based teaching in general, programming with the crowd, and knowledge summarization and structuring. Past research on CS education is varied, but the one issue that is probably settled is that there are no simple shortcuts to CS education [9], and the jury is still out on what is truly effective [10, 24]. Apart from the content and platform debate, researchers are also experimenting with various pedagogies to make the language selection more effective using new learning environments such as immersive learning, game-based learning, blended learning, personalized, self-regulated and self-paced learning, social learning, peer learning, pair programming, etc.

While leveraging the crowd [1] for computing is gaining steady traction, it has not been a focal point for CS education yet except for a few notable tangential efforts [5, 16]. Our own recent effort in OpenSchool [13] and the current research on CodeMapper aim to leverage social computing platforms for the purpose of teaching and learning. Similarly, efforts in structuring knowledge to aid computer programming education are rare, although there has been some effort in using ontologies or structured knowledge to aid CS education [31]. However, the idea of a concept hierarchy that we are pursuing has not been directly researched in the computing literature. In bio-informatics, research on functional similarity of biological concepts such as protein functions, miRNA and genes has been ongoing for some time. For the current edition of CodeMapper, we plan to use the crowd for such determination and annotation instead of a more algorithmic approach. In an algorithmic approach, functionally summarizing graphs appears to be a logical choice [22], though we have borrowed ideas from summarizing ontologies based on RDF sentence graphs [32].

Figure 2: CodeMapper front-end.
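The formulation P = α(µ(T)) from Section 2.1 can be illustrated with a minimal Python sketch. This is not CodeMapper's implementation: the `Concept` class, the `mu` and `alpha` functions, the `roles` annotation format, and the rename-by-annotation harmonization (a crude stand-in for a schema matcher such as S-Match) are all hypothetical names and simplifications introduced here for illustration, and the tree T is reduced to a linear sequence of steps.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Concept:
    """A node in a hypothetical concept hierarchy H (all names are illustrative)."""
    name: str
    code: Optional[str] = None                            # code fragment, if the concept has one
    roles: Dict[str, str] = field(default_factory=dict)   # annotation: shared role -> local variable
    specializations: List["Concept"] = field(default_factory=list)


def mu(concept: Concept) -> Optional[Concept]:
    """Mapper sketch: prefer a specialization that carries code over the concept itself."""
    for child in concept.specializations:
        found = mu(child)
        if found is not None:
            return found
    return concept if concept.code is not None else None


def alpha(concepts: List[Concept]) -> str:
    """Aggregator sketch: rename annotated variables to shared role names, then concatenate.

    Assumes every concept passed in carries a code fragment.
    """
    parts = []
    for c in concepts:
        code = c.code
        for role, local in c.roles.items():
            code = re.sub(rf"\b{re.escape(local)}\b", role, code)
        parts.append(code)
    return "\n".join(parts)


# Toy hierarchy: "merge sort" specializes "sort"; sorted() stands in for a real merge sort.
sort = Concept("sort")
merge_sort = Concept("merge sort", code="arr = sorted(arr)", roles={"items": "arr"})
sort.specializations.append(merge_sort)
show = Concept("print list", code="print(xs)", roles={"items": "xs"})

steps = [sort, show]                        # precedence order chosen by the learner
program = alpha([mu(s) for s in steps])     # P = alpha(mu(T)) on a linear sequence
print(program)
```

In this toy setting the learner picks the general concept `sort`, the mapper substitutes its specialization, and the aggregator reconciles the two fragments' variable names (`arr`, `xs`) through their shared annotated role. A real aggregator would need type checking and a proper schema matcher rather than textual renaming.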

4 CODEMAPPER: APPROACH AND UNIQUENESS
Our main goal is to develop a programming platform that is visual, concept oriented, flexible and extensible. In this platform, concepts are searchable, using simple keyword queries, and selectable from a DAG-like concept hierarchy in the CodeMapper dashboard. While learners use an imperative language such as C++ or Java to program, they still use block-like icons that represent a concept and describe it. A concept can be simple or an aggregation of several concepts. Thus the expansion of a concept generally returns a code tree.

Our goal is for CodeMapper to allow novice programmers to tinker around with ideas and focus on thinking like a programmer, rather than concentrating on syntax. We achieve this by allowing users to string together concept icons in the DAG to create runnable programs in a programming language of choice. At the same time, we aim for the system to be powerful enough that users can create complex programs in the form of a graph-like structure and generate functional programs to solve challenging problems. As mentioned earlier, the annotation of nodes in the DAG by the crowd is designed to educate CodeMapper's users about the icon functionalities and properties so that the first two goals of the system can be met accurately.

4.1 Concept Based Programming
CodeMapper uses the idea of building a directed acyclic graph that directly maps to a functional program. Requiring a user program to be a DAG of icons, a concept DAG, ensures a definitive start and end point, all the while preventing a non-terminating loop of tasks from occurring. The front-end interface of CodeMapper is shown in figure 2, which is the gateway to all its functional features.

Users write programs in CodeMapper by selecting concepts from the concept hierarchy on the dashboard, dragging and dropping them on the canvas, and connecting them using arrows to create a precedence graph. They can then select a programming language for CodeMapper to map the concept graph into executable code. In figure 3, a C# mapping is shown. It is interesting to note how the flow of the node icons directly correlates to the structure of the generated program. Since CodeMapper is part of the MindReader system, users are able to pipe this code into the IDE for compilation and execution. Alternately, they can also copy the code directly from CodeMapper and paste it into applications such as Visual Studio or the Unity3D editor.

Figure 3: Program induced by a concept DAG.

Writing computer programs using an interface such as CodeMapper has several advantages. In such an environment, learners are able to think about the program from a high level, and focus more on the end goal while programming at an almost native language level. In this approach, users are also able to develop programs using close to everyday language rather than programming languages, so that they are more focused on problem solving as opposed to programming language idiosyncrasies. In CodeMapper, the user has the opportunity to develop the concept of thinking like a programmer as they piece together these complex concept graph structures, and focus on designing a logical and most optimized solution available. CodeMapper autonomously takes care of the optimization using the recursively defined concept hierarchy by selecting the most specific concept. Figure 4 shows the conceptual implementation in CodeMapper of a sorting program by a learner.

Figure 4: Concept-based sorting program implementation.

4.2 Database Powered by Crowd-Sourcing
Every code segment associated with a terminal concept is also represented using a graph, called the program dependence graph (PDG) [6]. Clustering of these graph structures helps organize them in the concept hierarchy into functionally similar nodes, as structurally similar graphs exhibit similar execution behaviors. However, structurally dissimilar graphs can also be functionally identical. For example, though the PDGs corresponding to insertion sort and quick sort are structurally different, they are functionally identical: both sort. It therefore becomes necessary that we engage the crowd to annotate each concept node, and manually establish their functional similarity. Figure 5 shows the annotated information available for each of the concepts entered using the form in figure 6. Furthermore, the granularity of concepts could also be varied. For example, a concept need not be a complete procedure. We could also have a concept called counter loop, specializations of which could be a for counter loop and a while counter loop with respective for and while statement incarnations.

Manual curation by the crowd is also necessary since concepts can be aggregations of simple concepts (called terminal), or complex concepts (concepts defined using other simple or complex concepts recursively). Most complex concepts that represent similar functions cannot be clustered using PDG clustering. For example, the node structure in figure 7 is created or curated by the crowd to show heap, merge and radix sorts each to be a special type of ascending sort. As discussed in section 2.1, merge sort is a complex concept, and so is quick sort. While it is conceivable that most merge sort and quick sort PDGs are group-wise similar, it is unlikely for the merge sort and quick sort PDGs to be similar, although they function similarly. The annotations and their position in the concept hierarchy thus become important in CodeMapper, allowing the system to use the specificity and referencing (aggregation) relationships among concepts to make smart decisions in synthesizing high quality programs. From this perspective, concepts and their realization in CodeMapper have strong similarity with those of knowledge representation of ontologies, and we envision CodeMapper to be similar to sites such as StackOverflow or GitHub, in which crowd sourcing fuels its success.

Figure 5: Detail database view of concepts.

Figure 6: Data entry form for crowd annotation.

Figure 7: Example concept hierarchy.

5 OPEN SOURCE IMPLEMENTATION
We have developed CodeMapper as an open source system so that it can be developed as a community project and leverage the combined ideas of all. The source code of the CodeMapper project is publicly available at https://github.com/maholeycowdevelopment/CodeMapper. It is therefore open to contributions and will soon be under the MIT license. We plan to create a suggestions page to solicit ideas about desired features and implementation strategies to facilitate its continuous development.

5.1 Software and Platform Tools
CodeMapper is built with Microsoft's ASP.NET Core 2.0 platform with Facebook's front-end library React. This choice was motivated by its flexibility and extensibility. This platform allows flexible addition and removal of functionality, and thus supports incremental design while focusing on the end goal. The choice of Facebook's React was motivated by the need for dynamic user experience and user interface operations, and for leveraging React's ability to allow building reusable components. Several command line tools such as node package manager, .NET command line tools, and yarn package manager were also used. For back-end system development, we have used either a yarn or npm development server, and IIS Express along with the .NET command line servers.

5.2 Hardware and Operating System
CodeMapper was implemented on a MacBook Pro running macOS High Sierra version 10.13.1, having a 2.8 GHz Intel Core i7 processor, 16 GB 2133 MHz LPDDR3 memory, and Radeon Pro 555 2 GB and Intel HD Graphics 630 1536 MB graphics. Much of the development was done on a virtual machine utilizing Parallels software. CodeMapper is compliant with most widely used browsers such as Google Chrome, Microsoft Edge, Internet Explorer, Safari, and Firefox. We tested the software on different machines of varying configurations and have not experienced any performance related glitches to date.

5.3 Online Code Harvesting
In order to make CodeMapper a truly powerful tool, users should be able to scrape the internet for code snippets to be used in node creation. It supports the search and find interface shown in figure 8 for this purpose. In this interface, a search for code can be initiated using a brief description of the target code. CodeMapper then initiates a search using the APIs provided by sites such as GitHub, SourceForge, GitBucket, etc. The APIs are then interrogated using NLP techniques to pick out specific terms. Once the results are returned in a list as shown, users pick and choose snippets of code as needed to create nodes. Users are able to copy the code snippets into the system database for cataloging in ways discussed in section 4.2.

Figure 8: Online code search and find interface.

5.4 Experimental Evaluations
We have tested CodeMapper positively for its functional correctness. In our lab test, programming. In our user experience tests, learners without significant coding experience found it useful and easier to understand. The most significant insight from this process was that students showed improvement in thinking about program design without the nuances of a language's syntax. We also noted that the majority of students stated they were able to solidify programming concepts and develop a deeper understanding of programming by looking at the produced code and the DAG they built. However, a more systematic evaluation is being planned.

Contingent upon its adoption, it would be a significant step for the viability and success of crowd-sourcing based rapid prototyping of programs, and a program development platform for novice and veteran programmers alike. We also envision its adoption as an intelligent and shared crowd enabled community software development platform where users can grow their skills and mature together. The success of its graph summarization could also be useful in the area of community debugging research.

5.5 Empowering Novice Programmers
Since CodeMapper focuses on incremental code development using concepts, in ways similar to blocks in Scratch and Blockly, learners are able to focus more on what to solve, and not how to solve it. From this viewpoint, CodeMapper is closer to the declarative approach. Learners have the opportunity to build and reuse concepts, and their implementations, themselves or leverage the crowd. Since the actual integration of the independently constructed code segments and resolution of their disparities are performed by CodeMapper using well established techniques, novice programmers need not invest too much time to construct new programs the way traditional programmers still do. The saved time can now be used to explore alternate implementations, or discover the pitfalls of the design at hand because a low quality (performance or construction wise) concept was chosen. Once a concept graph is built, the mapping to real code by CodeMapper and the exploration of the generated code will allow users to make connections between syntax and semantics.

Furthermore, since CodeMapper is powered by crowd sourcing, learners can observe the emergence of new concepts and their evolution, which is not possible in static and non-evolving systems such as Blockly or Scratch. The modularity inherent in the concepts and their implementation allows research and comprehension of algorithms in a structured fashion once they are expanded, and helps learners sharpen their abilities as programmers. Learners also have the opportunity to take part in developing some of the concepts and their annotations, introducing them to the process of open source system development. Such collaborative experience in system building will be beneficial for learners' abilities to contribute to school projects, internships, personal projects and so on.

5.6 Industrial Relevance
We believe that industrial developers can also utilize CodeMapper for serious system building. In many instances, engineers are looking for strategies to rapidly prototype concepts that they want to implement in an application, but do not want to spend the man hours needed to research how to do this. They can now turn to CodeMapper and create a quick prototype of the desired features, test it, and then make modifications as necessary. This will benefit companies, users, and the technology community. As the community database of CodeMapper grows, it is not far-fetched to imagine CodeMapper becoming a vehicle for professional system development in the foreseeable future.

5.7 Program Dependence Graph Summarization
One of the components of CodeMapper is the concept hierarchy H that organizes all the concepts learners use to develop conceptual solutions to problems. The accuracy of the hierarchy is important for two main reasons. First, the hierarchy is used by the mapping function µ in order to select the most up to date and appropriate code segments. Second, structuring the concepts in a specialization-generalization hierarchy helps learners select the most abstract (general) concept possible, which will ultimately map to the most specific code anyway. This independence allows learners to think in terms of solutions and not techniques. As mentioned, while technologies exist to cluster similarly functioning codes that have similar PDGs, no technology is suitable to cluster similarly functioning codes with divergent PDGs. We are thus unavoidably dependent on the power of the crowd for subjective and manual clustering toward the construction of the concept hierarchy.

To automate the process, we believe a deductive knowledge base can be created and curated by the crowd, the rules of which can then be used to identify similarly functioning divergent codes (thus vastly different PDGs) and cluster them as similar concepts. In fact, it is possible to express the definition of a merge sort, for example, as described in section 2.1, as a set of logic rules. Then these rules can be mapped to codes that can faithfully implement those rules, not necessarily the same way. An experimental implementation has been discussed and utilized in MindReader [11], and is currently undergoing evaluation. But this experimental system is manual and must be orchestrated by users, and thus is not scalable. Our hope is that by leveraging the power of the crowd, it will be possible to scale the approach and simplify the concept hierarchy generation for CodeMapper.

5.8 Limitations and Future Improvements
Being crowd reliant, CodeMapper inherits typical drawbacks of this paradigm, yet being a community system, it also has some serious responsibility. For example, the annotation accuracy is largely curator dependent, and a weak, incompetent or malicious curator may jeopardize the system, and thus a user's ability to construct meaningful and useful programs. One of two possible solutions appears appropriate. The standard approach practiced in similar circumstances such as GitHub is to accept only administrator curated annotations, and allow local overriding by any user on their own without affecting the community annotation. The second approach is to use an inaccurate model where the reliability of an annotation is dependent upon the combined credibility of all the curators who contributed to the annotation, an approach adopted in the CrowdCure system [15].

The current edition of CodeMapper does not allow hypothetical concepts, or their implementation, to be included in a concept graph for a solution. It also does not allow mixing implementations in mixed languages. But it may so happen that a learner is only familiar with Scratch and is able to implement a concept using it well, but not so much using C++. Allowing her to add a Scratch implementation, or choose an available Python implementation, alongside other components of her concept graph in C++ could offer a workable solution. For this to be executable, a translation from Scratch or Python to C++, or vice versa, will be necessary. A system such as Flowgorithm [4] actually allows cross language mapping from abstract flow charts, and a similar approach can be used in CodeMapper as well.

The current implementation also does not support partial evaluation of concepts within a larger concept graph to test ideas locally. This also means that the solution must be fully conceived and tested, and no part of it can be independently evaluated. We would like to devise a mechanism in the future to support such incremental evaluation and idea exploration, which we believe will be beneficial to novice and seasoned programmers alike. In that vein, we also believe that an auto suggestion of possible concepts can prove to be useful. In particular, given a problem, certain concepts being considered by the learner may not make sense, and rather than waiting for MindReader to catch the discrepancy and give feedback later, CodeMapper may suggest an alternative choice and avert an eventual failure. In this role, CodeMapper will essentially be acting as an online tutor for novice programmers.

6 CONCLUSION AND FUTURE RESEARCH
The focus of this paper has been to articulate a novel idea of soften-

Education, SIGCSE '13, Denver, CO, USA, March 6-9, 2013, pages 579–584, 2013.
[9] M. Hassinen and H. Mäyrä. Learning programming by programming: A case study. In Proceedings of the 6th Baltic Sea Conference on Computing Education Research: Koli Calling 2006, Baltic Sea '06, pages 117–119, New York, NY, USA, 2006. ACM.
[10] F. Hermans and E. Aivaloglou. To scratch or not to scratch?: A controlled experiment comparing plugged first and unplugged first programming lessons. In Proceedings of the 12th Workshop on Primary and Secondary Computing Education, WiPSCE 2017, Nijmegen, The Netherlands, November 08 - 10, 2017, pages 49–56, 2017.
[11] H. M. Jamil. Automated personalized assessment of computational thinking MOOC assignments. In Proceedings of The 17th IEEE International Conference on Advanced Learning Technologies, ICALT 2017, Timisoara, Romania, July 3-7, pages 261–263, 2017.
[12] H. M. Jamil. Visual computational thinking using Patch. In Proceedings of The 16th International Conference on Web-based Learning, ICWL 2017, Cape Town, South Africa, September 20-22, pages 208–214. Springer, LNCS 10473, 2017.
[13] H. M. Jamil. A free-choice social learning network for computational thinking. In Proceedings of The 18th IEEE International Conference on Advanced Learning Technologies, ICALT 2018, Mumbai, India, July 9-13, 2018. To appear.
[14] H. M. Jamil, X. Mou, R. B. Heckendorn, C. L. Jeffery, F. T. Sheldon, C. S. Hall, and N. M. Peterson. Authoring adaptive digital computational thinking lessons using vTutor for web-based learning. In Proceedings of The 16th International Conference on Web-based Learning, ICWL 2018, Chiang Mai, Thailand, August 22-24. Springer, 2018. To appear.
[15] H. M. Jamil and F. Sadri. Crowd enabled curation and querying of large and noisy text mined protein interaction data. Distributed and Parallel Databases, 36(1):9–45, 2018.
[16] M. Kaya and S. A. Özel. Integrating an online compiler and a plagiarism detection tool into the moodle distance education system for easy assessment of programming assignments. Comp. Applic. in Engineering Education, 23(3):363–373, 2015.
[17] M. Kölling, N. C. C. Brown, and A. AlTadmri. Frame-based editing: Easing the transition from blocks to text-based programming. In WiPSCE, pages 29–38. ACM, 2015.
[18] W. M. Kunkle and R. B. Allen. The impact of different teaching approaches and languages on student learning of introductory programming concepts. TOCE, 16(1):3:1–3:26, 2016.
[19] W. Ma, O. O. Adesope, J. C. Nesbit, and Q. Liu. Intelligent tutoring systems and learning outcomes: A meta-analysis. Journal of Educational Psychology, 106(4):901–918, 2014.
ing the transition cost of novice programmers and K-12 STEM learn- [20] V. J. Martin, T. Pereira, S. Sridharan, and C. R. Rivero. Automated personalized feedback in introductory java programming moocs. In Proceedings of the 33rd ers from block-based languages to text-based imperative languages IEEE International Conference on Data Engineering, San Diego, California, USA, and foster their computational thinking. While the CodeMapper April 19-22, pages 1259–1270, 2017. system is at the prototype stage, most of its core components have [21] T. MaŠauch. Innovate through crowd sourcing. In Proceedings of the 41st Annual ACM SIGUCCS Conference on User Services, SIGUCCS ’13, pages 39–42, New York, been developed and tested, but its integration with the sister system NY, USA, 2013. ACM. MindReader remains as a future research. We have also discussed [22] Y. Miao, J. Qin, and W. Wang. Graph summarization for entity relatedness some of the planned future improvements in section 5.8 and its visualization. In Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR ’17, pages 1161–1164, known limitations. We believe that though CodeMapper is a part of New York, NY, USA, 2017. ACM. the overall MindReader system, it stands on its own as an intelligent [23] NGSS. Next generation science standard. hŠp://www.nextgenscience.org/, 2013. Accessed on April 19, 2017. web application that has signi€cant potential to help impart quality [24] M. Noone and A. Mooney. First programming language: Visual or textual? CoRR, computational thinking education. abs/1710.11557, 2017. [25] L. Porter, M. Guzdial, C. McDowell, and B. Simon. Success in introductory programming: what works? Commun. ACM, 56(8):34–36, 2013. REFERENCES [26] A. Vihavainen, J. Airaksinen, and C. Watson. A systematic review of approaches [1] D. W. Barowy, E. D. Berger, D. G. Goldstein, and S. Suri. 
Voxpl: Programming for teaching introductory programming and their inƒuence on success. In ICER, with the wisdom of the crowd. In Proceedings of the 2017 CHI Conference on pages 19–26. ACM, 2014. Human Factors in Computing Systems, Denver, CO, USA, May 06-11, 2017., pages [27] C. Watson and F. W. B. Li. Failure rates in introductory programming revisited. 2347–2358, 2017. In ITiCSE, pages 39–44. ACM, 2014. [2] CCSS. Common core state standards initiative. hŠp://www.corestandards.org/, [28] C. Watson, F. W. B. Li, and J. L. Godwin. BlueFix: Using crowd-sourced feedback 2010. Accessed on April 19, 2017. to support programming students in error diagnosis and repair. In ICWL, volume [3] F. Chen and S. Kim. Crowd debugging. In Proceedings of the 2015 10th Joint 7558 of LNCS, pages 228–239, 2012. Meeting on Foundations of So‡ware Engineering, ESEC/FSE 2015, pages 320–332, [29] D. Weintrop and N. R. Holbert. From blocks to text and back: Programming New York, NY, USA, 2015. ACM. paŠerns in a dual-modality environment. In SIGCSE, pages 633–638. ACM, 2017. [4] D. Cook. Flowgorithm home page. hŠp://www.ƒowgorithm.org/. Accessed: [30] J. M. Wing. Computational thinking, 10 years later. hŠps://tinyurl.com/yapf5zas, June 17, 2018. Mar. 2016. Accessed: August 31, 2017. [5] S. Ferran,´ A. Beghelli, G. H. Canepa,´ and F. Jensen. Correctness assessment of a [31] C. Yen and T. Wang. Using self-explanation and ontology for providing proper crowdcoding project in a computer programming introductory course. Comp. feedbacks in a programming environment. In 6th IIAI International Congress on Applic. in Engineering Education, 26(1):162–170, 2018. Advanced Applied Informatics, IIAI-AAI 2017, Hamamatsu, Japan, July 9-13, 2017, [6] J. Ferrante, K. J. OŠenstein, and J. D. Warren. Œe program dependence graph pages 585–590, 2017. and its use in optimization. ACM Trans. Program. Lang. Syst., 9(3):319–349, 1987. [32] X. Zhang, G. Cheng, and Y. ‹. 
Ontology summarization based on rdf sentence [7] F. Giunchiglia, A. Autayeu, and J. Pane. S-match: An open source framework for graph. In Proceedings of the 16th International Conference on World Wide Web, matching lightweight ontologies. Semantic Web, 3(3):307–317, 2012. WWW ’07, pages 707–716, New York, NY, USA, 2007. ACM. [8] P. J. Guo. Online python tutor: embeddable web-based program visualization for cs education. In Še 44th ACM Technical Symposium on Computer Science