Learning to Program Using Online Forums: A Comparison of Links Posted on Reddit and Stack Overflow
Learning to program using online forums: A comparison of links posted on Reddit and Stack Overflow

by Caroline D. Hardin

A thesis submitted in partial fulfillment of the requirements for the degree of Master’s (Curriculum and Instruction) at the UNIVERSITY OF WISCONSIN-MADISON, 2015

Date of the final oral examination: 8/19/2015

Abstract:

The Internet offers a vast number of websites with content for learning programming. While the United States struggles with a mismatch between the number of people who want to learn computer programming and the formal educational opportunities to do so, online informal communities of learning are growing. Online educational resources also have the potential to help mitigate some of the diversity issues plaguing traditional CS education. While popular press articles and Google searches are one way to sift through the many online learning opportunities available, a more authentic measure may be to examine the links posted on technical forums. I wrote a program to scrape and analyze thousands of posts on two of the most popular Internet forums - Reddit and Stack Overflow - and compare how they differ in answering the question of ‘how to get started learning to program’. Understanding how these communities talk about the available online resources demonstrates the different interests and priorities each community has about what is desirable to learn. In addition, these findings have many practical implications for those interested in computer science education, whether as learners, teachers, or creators of educational content.

Introduction: high demand and low access in computer science education

If we look at the six-month average for Google searches for the phrase ‘learn programming’ from 2007 to the present day, we see an impressive 54% increase (Google Trends, n.d.). The number of high school students taking the AP Computer Science exam has also been increasing over the last 15 years (Ericson, 2014).
Figure 1: The number of students taking the AP CS exam has increased by more than three times since 1998.

At this time of rapidly increasing interest in computer science education, we are also facing a shortage of traditional, formal educational opportunities for learning it. In 2013, out of approximately 37,000 high schools (US Department of Ed, n.d.), only 2,246 participated in the AP CS exam (Ericson, 2014), which “is a reasonable proxy for the number of AP CS teachers in the country” (Guzdial, 2014b). Out of a national population of about 16 million high school students, only 29,555 took the AP CS exam (US Department of Ed, n.d.; Ericson, 2014). Cassidy (2013) put the share of high schools offering any computer science at less than 10%. Despite these limitations, the number of students taking the exam continues to increase.

Computers play an increasingly important role in most aspects of our society, including education, career development, civic participation, entertainment, and creative expression. It must be recognized, however, that how computers can be used depends largely on the computer programs available, and that in turn often depends on who is making the programs. Therefore, the best way to ensure the availability of computer programs which serve the widest possible range of people is to have a wide range of diversity among those who create the programs.

The availability of computer science education is often limited in ways which restrict diversity. For example, it is more widely available in areas of higher socioeconomic status, as shown in Figure 2. The fact that high school classes are not serving a diverse population can also be observed by analyzing which students take the AP CS test (Ericson, 2014; Margolis et al., 2012).
Limited access to CS education in primary and secondary school has an impact on who is likely to be successful in college CS courses, as these courses often function on a model which “presumes prior experience in order for the student to succeed” (Guzdial, 2014b). The expectation that students have prior experience is built into the classes partially because colleges and universities have more interested students than they are able to accommodate (some schools accept fewer than 20% of applicants (Lohr, 2015)), causing “students to characterize introductory courses as weed-out courses” (Lewis, Yasuhara, & Anderson, 2011, p. 10). Consequently, students are less likely to major in CS if they have no prior experience (Lewis et al., 2011, p. 10). Furthermore, for these students without prior experience, college CS courses can be intimidating, as the students face a classroom culture which places high status on demonstrating prior experience. In order to improve the chances for these students to persist and become successful in computer science, it is critical to expand prior opportunities for exposure to computer science concepts.

Figure 2: As the median income increases, the number of students who take the AP CS exam increases. Data from Ericson (2014) and US Census Bureau (n.d.).

In addition, the shortage of opportunities to learn CS from traditional, formal pathways contributes to a cultural friction between those who had those opportunities and those who are self-taught, as “problems can arise when students confuse the source of knowledge that can lead to high status: intelligence versus experience. This is especially problematic for those with less experience, a group to which most female CS students belong” (Barker, Garvin-Doxas, & Jackson, 2002, p. 45).
This is compounded by the tensions around ‘fixed mindset’ versus ‘growth mindset’ - the theory that people fall on a spectrum according to how much they believe their skills and talents are innate and immutable versus how much they believe they can grow and learn new skills (Dweck, 2006). In CS educational settings this often manifests as a belief in a ‘geek gene’ (Lee, Heeter, Magerko, & Medler, 2012).

These tensions can be seen in the online communities of people learning programming. As one user posted on Reddit, “Going to school for computer science is not going to help you, I have worked in net sec my whole life starting at 18 and I can tell you in full confidence that it may lower your chances of getting the job you seek, the hard core net sec guys grew up caring about this stuff and look at those who are going to school for it outside their circles. ..Going to school for CS tells those who live and breath this field, the ones making 6 figures that the person did not know this field at the start of college and 2-4 years is not going to fix that learning from a bad system. .everyone who is entering college now to learn this is going to end up working under those we grew up doing this...” [sic]

The question ‘how do I get started programming?’ is so common in the www.reddit.com/r/learnprogramming community that it has a special ‘frequently asked questions’ page to answer it, which calls it “by far the most asked question on this subreddit” (Reddit.com, n.d.-a). At the top of the FAQ page is a link to Cohen (2011), who argues that the very premise of the question is passive and suggests that the answer to ‘how do I learn to code’ is to, well, ‘learn to code’. The prevalence of the inquiry ‘how do I get started?’ on forums therefore raises questions, not yet addressed by other research, about how the millions of participants on these forums think about what it means to learn programming.
As such, the primary research question is:

• How do online programming forums function as learning resources?

More specifically, the thesis focuses on two sub-questions:

• How do the online resources suggested by the Reddit and Stack Overflow communities differ?
• What does that difference suggest about how people learn to program with these resources?

In order to answer the research questions, this work collected and counted the links which were posted in two forums for learning programming (Reddit and Stack Overflow), and then analyzed how often the site names were used in conversation, in order to find patterns of activity which demonstrate the learning strategies and values of these communities. With the proliferation of online resources for learning programming and a burgeoning audience for these resources, the results have implications for both students and teachers.

Why a focus on programming?

Although programming, or coding, is only a small part of what comprises the discipline of computer science, it plays a disproportionate role in what the public thinks computer science is, and in which computer science skills they think are useful and important to know (Lewis et al., 2011). Due to this perception, this work focuses on the proliferation of online resources designed specifically to teach programming, including blogs, podcasts, Wikipedia articles, YouTube tutorials, animations, MOOCs, games, interactive puzzles, competitions, video mentoring, forums for asking and answering questions, collaborative projects, and more. These resources complement the traditional avenues for learning programming: classes and textbooks.

What online sites exist for learning programming?

While it would be impossible to count the total number of sites which offer programming educational content, some have raised millions in funding and garnered significant press.
For example, Wortham (2012) states, “The sites and services catering to the learn-to-program market number in the dozens and have names like Code Racer, Women Who Code, Rails for Zombies and CoderDojo. But at the center of the recent frenzy in this field is Codecademy. Since the service was introduced [in 2011], more than a million people have signed up, and it has raised nearly $3 million in venture financing”. Other sites such as Code.org, which itself has raised over $17 million in funding (Code.org Team, 2015; GuideStar, 2015), have been promoted by no less a figure than the President of the United States (Mechaber, 2014).
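The link collection and counting step described earlier, in the paragraph on how the research questions were answered, can be illustrated with a minimal sketch. This is not the program written for the thesis, which is not reproduced in this section. It only assumes that post texts have already been retrieved as plain strings. The function names (`extract_domains`, `count_domains`, `compare_forums`), the URL pattern, and the sample posts are all hypothetical and chosen for illustration.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Assumed, deliberately simple pattern for http(s) links in free text.
# It stops at whitespace and at a few common closing delimiters.
URL_PATTERN = re.compile(r"https?://[^\s<>\"')\]]+")


def extract_domains(text):
    """Return the host name of every http(s) link found in one post."""
    domains = []
    for url in URL_PATTERN.findall(text):
        # Drop sentence punctuation that the pattern may have swallowed.
        url = url.rstrip(".,;:!?")
        host = (urlparse(url).hostname or "").rstrip(".")
        # Treat www.example.com and example.com as the same site.
        if host.startswith("www."):
            host = host[4:]
        if host:
            domains.append(host)
    return domains


def count_domains(posts):
    """Count how often each site is linked across a list of post texts."""
    counts = Counter()
    for post in posts:
        counts.update(extract_domains(post))
    return counts


def compare_forums(posts_by_forum):
    """Build one link-count table per forum so the tables can be compared."""
    return {forum: count_domains(posts) for forum, posts in posts_by_forum.items()}


# Hypothetical sample posts, not data from the thesis.
sample_posts = {
    "reddit": [
        "Start with https://www.codecademy.com/learn and then try http://code.org",
        "I liked https://www.codecademy.com/ a lot.",
    ],
    "stackoverflow": [
        "See https://docs.python.org/3/tutorial/ (the official tutorial).",
    ],
}

forum_counts = compare_forums(sample_posts)
for forum, counts in forum_counts.items():
    print(forum, counts.most_common(3))
```

Counting by host name rather than by full URL groups every page of a site together, which matches the goal of comparing which sites each community recommends. The second analysis step, measuring how often site names are mentioned in conversation, is not covered by this sketch.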