
THE TESTING EFFECT: APPLICATIONS IN COMPOSITION PEDAGOGY

By

Adam Danger Channel

A Project Presented to

The Faculty of Humboldt State University

In Partial Fulfillment of the Requirements for the Degree

Master of Arts in English: Teaching Writing

Committee Membership

Dr. Corey Lewis, Committee Chair

Dr. Suzanne Scott, Committee Member

Dr. Nikola Hobbel, Graduate Coordinator

May 2014

ABSTRACT

THE TESTING EFFECT: APPLICATIONS IN COMPOSITION PEDAGOGY

Adam Channel

This project advocates for the use of frequently administered, low-stakes tests to enhance student learning of the disciplinary content of composition. Though there is widespread disdain for the role of standardized tests in education today, not all forms of testing are the same, and some forms of testing can be very effective teaching tools. Tests should ideally be locally generated with relevance to class content, frequently administered, and low-stakes, with feedback provided shortly after testing. This project lays the groundwork for how testing can dovetail into the student-centered dialogic classroom, a common practice in composition today. Theories of human learning (including the testing effect and the spacing effect) show that active retrieval is the best way to ensure long-term retention and understanding. Frequent tests provide active-retrieval opportunities for students, which should enhance learning and retention. The project concludes with how to build tests specifically for the instruction of first-year college composition.


TABLE OF CONTENTS

ABSTRACT

TABLE OF CONTENTS

TABLE OF FIGURES

CHAPTER 1: INTRODUCING TEST-ENHANCED LEARNING

Retrieval-Based Learning

Composition History and the Move Away from Testing

The Banking Model of Education

Critiques of Standardized Tests and High-Stakes Assessment

CHAPTER 2: THE STUDENT-CENTERED COMPOSITION CLASSROOM

Moving towards a Dialogic Classroom

The Writing Process in a Dialogic Classroom

Building Writing Fluency with Journal Writing

The Dialogic Classroom and the Academic Discourse Community

CHAPTER 3: HUMAN LEARNING

The Physiology of Learning and

Cognitive Definitions of Learning and Memory

Three Theories of Learning

Social Constructivist Theories of Learning and Testing

CHAPTER 4: THE TESTING EFFECT AND THE SPACING EFFECT

What is the Testing Effect?

Studies Reporting a Testing Effect

The Spacing Effect

CHAPTER 5: TEST-ENHANCED LEARNING IN COMPOSITION

Indirect Benefits of Testing

Testing and Grades

The Benefits of Tests According to Three HSU Professors

Building, Administering, and Grading Tests

Recognition and Free-Response Questions

WORKS CITED


TABLE OF FIGURES

Figure 1: “Neuron”

Figure 2: “Neuronal Communication”

Figure 3: “Stage Theory of Memory”

Figure 4: “Performance on Immediate and Delayed Tests”

Figure 5: “Testing Schedule Shows a Curve”

Figure 6: “Study-Test-Study-Test (STST) Most Effective Learning Strategy”

Figure 7: “Proportion Correct in Immediate and Delayed ”

Figure 8: “Word Recall on Immediate and Delayed Tests”

Figure 9: “Student Performance Averaged across Unit Exams”

Figure 10: “Hypothetical Forgetting Curve 1”

Figure 11: “Hypothetical Forgetting Curve 2”

Figure 12: “Hypothetical Forgetting Curve 3”


CHAPTER 1: INTRODUCING TEST-ENHANCED LEARNING

Retrieval-Based Learning

This project argues that student learning will be enhanced if teachers frequently administer low-stakes tests. This chapter will first define what a “test” is. It will then give a brief history of testing in composition, and finally it will account for the current lack of testing seen in composition practice today. Primarily, this chapter argues that the lack of testing is due to three widely held misconceptions: (1) tests promote the “banking model of education,” (2) tests do not encourage critical thinking, and (3) all forms of tests are subject to the same deficiencies as standardized tests.

What we know about the cognition of learning has advanced considerably in recent decades. Today we have a research-based theory of learning that is grounded in physiological and empirical data gathered from brain-imaging and cognitive studies. Despite this, there remains a schism between the laboratory and the field, a gap between theory and practice. Recent advances in cognitive science and studies on memory and learning do not seem to have had a significant impact on composition pedagogy. This thesis argues that two theories of learning in particular are of the utmost importance but remain widely unknown: the testing effect and the spacing effect. Put simply, these theories hold that long-term retention is improved through repeated testing over time. Composition practice today relies on a varied set of skills and knowledge; students’ learning and retention of this course content can be enhanced through the introduction of frequently administered low-stakes tests.

It is important to define what is meant by the word “test” in this project, especially because for many people the term carries negative connotations. “Test” or “testing” makes many people think of summative high-stakes assessment or standardized tests administered from the top down. Both of these types of testing have serious deficiencies that will be elaborated on later in this chapter. Testing, however, can come in many different forms. For the purposes of understanding how tests operate cognitively, in this project the terms “tests” or “testing” are defined as “an induced act of retrieval”: any sort of material or question which necessitates a “retrieval” action on the part of the reader. Retrieval is the process of accessing information stored in memory and articulating it, generally in response to an inquiry. You are asked a question, and you provide a written answer: that is the meaning of a test in this project.

In order for us to understand how taking a test can enhance learning, it is important to distinguish between two types of learning: passive and active. Cognitive research would describe reading a textbook or reviewing class notes as “passive learning,” because reading is an input-only activity (Roediger and Karpicke 181; Knight and Wood 298). Learning through a lecture is defined as passive learning for the same reason. However, if students were to convert their notes and annotations into flash cards on which they could test themselves (with the possibility of failure to recall), that would be considered “active learning” because it would require an act of retrieval on the part of the learner. As another example, when we read the statement, “Black holes are the remnants of the gravitational collapse of a star,” our brain connects that to existing knowledge structures and schemata. We can read that same statement numerous times and recognize it each time; this type of learning is passive because there is no demand made on the brain to reproduce the information. On the other hand, active learning is demonstrated when that same statement is turned into a question by occluding keywords, like “______ are the remnants of the gravitational collapse of a star” or “Black holes are the remnants of ______.” In this case the brain must fill in the blank with the correct information, necessitating an act of retrieval. This act of retrieval will be successful if students remember the relevant information (as in the black hole example), but, as we all know, memory is not perfect.

There is a clear difference in difficulty between the two types of learning. Passive learning only requires recognition and comprehension; active learning, on the other hand, requires retrieval with an increased possibility of failure. The act of retrieval required to answer the fill-in-the-blank questions (also called “cloze deletion”) will produce better long-term retention than passive reading of the same statement. H. L. Roediger, a professor at Washington University in St. Louis and a veteran scholar of learning, describes it this way: “We are much more likely to remember something again if we actively retrieve it than if we are passively exposed to it in restudying” (“Advice from Cognitive Psychologist…”). This result has been found in many studies and is referred to in the literature as “the testing effect” (these claims will be referenced and further substantiated in Chapters 3 and 4).

Like all fields, composition has a set of specialized terms. One reason why retrieval practice is important is that when students reach into memory and recall a term or phrase from the class lectures or course readings, it is no longer just the term that was on the board or in the reading; it has moved toward becoming their own term. Once students have internalized that term or phrase, they will be able to reproduce it in their own writing and cognition. In composition, these terms help us communicate more specifically about writing (comma splices, topic sentences, thesis, dependent clause, etc.). Chapters 3 and 4 show that our ability to remember terms like these generally depends on the number of times we have retrieved them. Peer review is an integral part of modern composition practice, and internalizing these specialized terms will help students improve their peers’ writing, in addition to their own. Since it is necessary, or at least beneficial, to know these terms when discussing writing, a pedagogical method that is effective in helping students remember them would be of great benefit in composition. This project argues that teachers can best facilitate students’ successful learning and long-term retention of class content in composition by giving retrieval opportunities to students through frequently administered low-stakes tests.


Composition History and the Move Away from Testing

For most of human history, storytellers in oral cultures would verbally repeat the extensive myths and epics of their people. This level of memorization required great skill and tenacity, and could only be reinforced through continual testing and retrieval. In classical times, Memoria, or memorization, was one of the five canons of rhetoric. In his essay on memory, Aristotle wrote, “Exercise in repeatedly recalling a thing strengthens the memory” (202). Scholars have known about the power of testing, and written about it, for centuries. Francis Bacon wrote in The New Organon, published in 1620, “If you read a piece of text through twenty times, you will not learn it by heart so easily as if you read it ten times while attempting to recite from time to time and consulting the text when your memory fails” (143). In The Principles of Psychology (1890), William James also argued for the power of testing through active recitation, writing:

A curious peculiarity of our memory is that things are impressed better by active than by passive repetition. I mean that in learning (by heart, for example), when we almost know the piece, it pays better to wait and recollect by an effort from within, than to look at the book again. If we recover the words in the former way, we shall probably know them the next time; if in the latter way, we shall very likely need the book once more (646).

These famous authors and others have written about the power of retrieval. Retrieval practice has long been an acknowledged part of learning. The understanding that retrieval is an essential part of learning should be a guiding principle in our teaching practice.

This project urges retrieval practice through the use of frequent tests; however, the role of testing in education is currently a fiercely debated subject. The passage of the 2001 No Child Left Behind Act and the use of high-stakes standardized testing have pushed many people, including teachers, to reject testing. This rejection seems especially apparent among teachers in composition programs, many of whom follow Expressivist practices. With the Expressivist movement of the 1960s and 1970s, there came a rejection of testing and drilling in composition practice. Christopher Burnham, in his chapter in A Guide to Composition Pedagogies, describes the divergence of two composition pedagogies: Expressivism and Current-Traditional Rhetoric (CTR). He describes Expressivism as “The movement [that] originated […] as a set of values and practices opposing current-traditional rhetoric” (Tate 21). CTR, a school of composition instruction developed in the early nineteenth century, relied heavily on a prescriptive notion that there is a syntactically and stylistically correct and incorrect way to write. The goal of CTR was to teach students how to conform to those standards through drilling and regular testing. The Expressivist movement of the 1960s and 1970s took a sharp turn away from testing and focused on “writers writing” rather than “writer’s writing,” that is, on process rather than product.

Burnham describes two figureheads in the Expressivist movement, Donald Murray and Ken Macrorie, as opponents of rules and directive feedback. He writes of Murray's A Writer Teaches Writing (1968) that “Murray's use of non-directive feedback from both teacher and students turns the responsibility for writing back to the student” (23). Expressivism is characterized by a focus on language as a tool for personal rather than social expression. The Expressivists believed that the best teaching practice was simply getting students to write and write and write, uninhibited by top-down rules.

The movement recognized that good writing needs more than good mechanics and syntax, so the focus of instruction shifted away from the then-prevalent practices of drilling and testing on rules of grammar and style. Composition instruction today owes a great deal to Expressivism, and many of our current practices stem directly from the Expressivist school of thought. However, with the current lack of testing in composition, this project poses the question: “Did we throw the baby out with the bathwater?” Though some Expressivists would disagree, this project argues that there is value in teaching rules of writing. This disciplinary content is valuable because it allows us to engage in meta-discourse about the process and products of writing. Teaching the disciplinary content of composition through test-enhanced learning isn’t guaranteed to improve students’ writing abilities, but I argue that there should be positive transfer between the two activities. For example, if students can identify and correct a run-on sentence on a test, then they are more likely to be able to find run-on sentences and prevent them in their own writing. As another example, composition classes teach that writing is a process with different stages, and each of those stages has certain activities that a writer can use. During pre-writing we can use free-writing, brainstorming activities, mind mapping, and outlining to help our writing. We teach these activities, but if students don’t have retrieval opportunities (i.e., if they do not practice outside class or on tests), it is unlikely they will remember all of these activities. Teachers use retrieval practice for multiplication tables or for learning a new language, but the idea that it can be used for more complex ideas is not widely appreciated. For example, when students learn words and verb conjugations for another language, Spanish for instance, they often use flash cards with the English word on one side and the Spanish equivalent on the other, and teachers say, “Practice until you really know it. Practice until it’s completely automatic.” In composition practice we want the same type of automaticity with writing terms and practices. This project argues that this type of automaticity will be produced with frequently administered low-stakes tests.

The Banking Model of Education

In the 1970s, influential author Paulo Freire critiqued the common practice of lecturing, arguing that contemporary academic systems follow “the banking model of education.” In the banking model, the teacher is the exclusive authority who stands at the front of the class while students sit at desks, all facing forward, positioned as empty vessels waiting to be filled by the teacher with knowledge. Freire critiques such authoritative models of education, writing:

Instead of communicating, the teacher issues communiques and makes deposits which the students patiently receive, memorize, and repeat. This is the "banking" concept of education, in which the scope of action allowed to students extends only as far as receiving, filing, and storing the deposits (58).

Freire argues that this model does not foster critical thinking skills in students and conditions students to unquestioningly trust teachers and classroom content. Unfortunately, many consider teaching through tests to be a variation on the banking model, arguing that it leads to convergence on a single answer given by an authority figure, or that it does not lead to the development of critical thinking. The banking model of education positions students as the passive recipients of education, and is therefore not structured in a manner that encourages the development of critical thinking skills. This history of valid critiques of the banking model has unfortunately led many to dismiss all forms of testing as falling within it. This blanket dismissal is problematic because it discards forms of testing that can improve critical thinking and enhance student authority.

Our educational systems today, and the role of tests as a teaching tool, are influenced by history. Gordon Wells of the University of Toronto, writing in the anthology Vygotskian Perspectives on Literacy Research, explains that universal public education through mandatory attendance at school is a “historically and culturally localized activity system that owes more to models of industrial mass production than to that of development through assisted participation in social activity” (Lee 59). In his widely recognized 2010 TED talk, “Changing Education Paradigms” (with over 13 million views to date on YouTube and TED.com), Ken Robinson draws a parallel between the school and the factory line: both have hyper-specialization, ringing bells, and separate facilities. In this model, testing is seen primarily as a means of assessment. Like a quality control stamp in a factory, tests are commonly used as a means to categorize and rank students by their performance. The factory and the banking models of education, which many students today have experienced, position students as the inheritors of knowledge, rather than the co-producers of understanding.

However, teaching through tests is not necessarily a banking model practice. If done correctly, test-enhanced learning can encourage creativity and facilitate the student-directed classroom that modern composition instruction relies on. Teaching with tests may seem to be opposed to the development of critical thinking skills; perhaps this in part explains the current lack of testing in composition. Indeed, tests should not be considered a complete solution, and instead should be seen as a supplement to a wide variety of other instructional methods. Frequent tests will enhance learning, which will increase student authority. When used alongside a variety of other teaching practices, this should enhance the development of critical thinking.


Critiques of Standardized Tests and High-Stakes Assessment

In the abundant critiques of testing today, it appears that many are conflating the problems of standardized tests with all forms of testing. There are many different forms and uses of tests, however, and some of them can eliminate the myriad disadvantages of top-down standardized tests. The type of testing this project advocates avoids these problems because every test is designed by the local teacher specifically for his or her students and curriculum. In addition to being locally generated, tests should also be frequently administered. This section will consider some of the most significant deficiencies of standardized tests and the changes that standardized testing has brought to the education system. Such a consideration is necessary because it will show us by example what not to do, and help account for the current lack of testing seen in composition.

Federally administered standardized tests have been a compulsory part of K-12 education since the passage of the federal Elementary and Secondary Education Act of 1965. The 2001 revision, No Child Left Behind, increased the importance of standardized-test results in determining allocation of federal funding for schools, and by extension, the importance of receiving federal funding has driven schools to pressure teachers to emphasize these standardized tests in their classrooms. According to Diane Ravitch, former U.S. Assistant Secretary of Education, “So now we have schools being closed and people getting bonuses all around the student test scores. It’s made testing, somehow, the central activity of American public schools today, which is just so wrong” (Kastenbaum “The high stakes of standardized testing”). Ravitch is a former political figure and a frequently seen pundit on the subject of testing and education in the media today. She is representative of the many people who feel that standardized tests are an encroachment on the rights of local teachers.

Focusing instruction around top-down imposed standardized tests limits teachers’ choices for class materials and methods of instruction and encourages “teaching to the test.” Because schools are held accountable for their students’ performance on standardized tests, if the students perform poorly, then schools blame their teachers. Andre Perry, in an article titled “Education Reform Starts with Community Reform,” offers a succinct critique of standardized tests that is worth quoting in full:

We currently use standardized tests well beyond what they were designed to do, which is measure a few areas of academic achievement. Achievement tests were not designed for the purposes of promoting or grading students, evaluating teachers or evaluating schools. In fact, connecting these social functions to achievement test data corrupts what the tests are measuring. In statistics this is called Campbell's Law. In other words, what does a score measure after it has been connected to a teacher's pay or job status? In education talk, this is called teaching to the test, hiring to the test, and getting paid to the test (“Education Reform”).

Many of us are critical of standardized tests because not only do top-down imposed tests infringe on local school boards’ ability to design curriculum, they also modify the very nature of public schooling.

In addition, according to a 2012 New York Times article, the increased emphasis on standardized test scores has played a large contributing role in the 20-year low in teacher morale:

The slump in the economy, coupled with the acrimonious discourse over how much weight test results and seniority should be given in determining a teacher’s worth, have conspired to bring morale among the nation’s teachers to its lowest point in more than 20 years, according to a survey of teachers, parents and students released on Wednesday (Santos “Teacher Survey Shows Morale Is at a Low Point”).

The article argues that this low morale is due to unprecedented economic hardship for teachers. Tied in with this is the heated debate on the role of testing in education and how students’ test results reflect on their teachers’ performance. Given these facts, low teacher morale does not seem so surprising. It is perhaps, then, also not so surprising that there is a widespread disdain for testing among the general populace and teachers today.

This chapter has introduced the wider context of testing and its contentious role in education today. Though many teachers see testing as another example of the banking model, there is no intrinsic deficiency in the practice of testing. By analyzing the role of standardized tests in education today, we can see that tests are frequently misused and too often play too large a role in assessment. Though tests can be a useful form of assessment, their greater purpose is as a means of retrieval practice. Learning through retrieval has been an accepted practice since antiquity. Rather than using a few high-stakes exams as an exclusive means of assessment, the best practice is to administer low-stakes tests frequently, providing numerous retrieval opportunities for students. This will encourage frequent studying rather than cramming on the part of students. This process of frequent studying and frequent testing will reinforce and enhance student learning of course content.


CHAPTER 2: THE STUDENT-CENTERED COMPOSITION CLASSROOM

Moving towards a Dialogic Classroom

“What kind of learning do we want?” This is a question that every teacher should thoughtfully consider. To be more specific, this chapter is guided by the question “What kind of learning do we want in a first-year college composition course?” A review of the practices and ideologies of modern composition will help us understand how testing can dovetail into that practice. Many composition courses use a student-centered model of instruction where the classroom is a community in which students collaboratively work and learn. Though classes in other disciplines are frequently lecture based, in composition the dialogic class generates peer engagement, which helps to establish a spirit of collaborative inquiry, which should foster creativity. Writing abilities are developed gradually through successive drafts of major essay assignments with peer review between drafts, and through in-class writing exercises. Student-centered instruction relies on peer work and collaborative learning. As John-Steiner and Meehan, writing in the anthology Vygotskian Perspectives, conclude, “Social interaction and mutual support lead to creativity in a multi-directional dynamic exchange” (Lee 40). This creativity is fostered in the composition classroom, which uses collaborative learning methods such as peer workshops, group and partner work, and group discussion. Collaborative engagement is already supported in scholarly literature as an effective learning method. For example, drawing on the studies of Dr. M. L. J. Abercrombie, Kenneth Bruffee, writing in “Collaboration and the Conversation of Mankind,” shows the benefits of collaborative learning. Abercrombie's Anatomy of Judgment (1964) synthesized ten years of observation and research on medical interns studying diagnosis. Typically, the interns worked as individuals, but Abercrombie asked the interns to examine the patient together, discuss the symptoms as a group, and arrive at a consensus on which all could agree. After observing the interns diagnose both individually and as a group, Abercrombie concluded that the collaborative model resulted in a more accurate diagnosis and faster acquisition of diagnostic skills on the part of the interns. Bruffee argues that this same model of collaboration can also extend to the writing classroom, there too bringing about better learning and, in this case, better writing. In addition, Bruffee argues that writing is a conversation, not only with those around us, but also with ourselves. In our daily lives, "we internalize conversation as thought; and then by writing, we re-immerse conversation in its external, social medium" (88). Many forms of thinking are acts of conversation in which our thoughts are revealed through either the written or the spoken word. The conversation in our minds is enhanced by input from outside conversations and collaboration; following this maxim, composition practice today favors group work and discussion over lecture as a means of instruction.


One of the great advantages of these forms of collaborative learning is that they are active rather than passive forms of learning. As noted earlier, active forms of learning, such as when students take a test or participate in a discussion, are more effective than passive forms. Using frequent tests in composition, then, can be seen to accord with current theory and practice in the field.

Discussion with the class at large, and in small groups, leads to better learning than lecturing. This is the position espoused by Donald Finkel in his book Teaching with Your Mouth Shut; he quotes a summary of research by the National Council on Education, the NCE, as follows:

Research clearly favors discussion over the lecture as an instructional method when the variables studied are retention of information after a course is over, transfer of knowledge to novel situations, development of skill in thinking or problem solving, or achievement in affective outcomes, such as motivation for additional learning [...]—in other words, the kinds of learning we most care about (qtd. in Finkel 3).

The NCE’s research provides a persuasive argument in favor of a dialogic classroom, a classroom in which dialogue rather than lecture is the primary means of instruction. In the dialogic composition classroom, readings are discussed rather than lectured on, and peer review serves as a means of engaging in conversations about writing strategies.

Gordon Wells, of the University of Toronto, in his chapter in the anthology Vygotskian Perspectives, describes a dialogic class as one where “the classroom is seen as a collaborative community. The community works towards shared goals of achievement and its success is dependent on the group rather than individuals” (Lee 65). In this collaborative community, Wells argues that “the teacher should be involved as a co-inquirer with the students” rather than an exclusive authority in the class. The goal in such a class is to create what Wells calls “an ethos of collaborative inquiry” (ibid.).

The notion of decentralizing power challenges long-entrenched beliefs about teaching; our cultural expectation of a teacher is somewhat akin to a performer: a brilliant lecturer who captivates an audience of students. However, modern composition recognizes that this cultural construct is limiting. The authority of the professor, in large part, is derived from their extensive study and knowledge. Decentralizing authority in the classroom requires that students have some understanding of course material. The dialogic class provides more opportunities for students than a lecture-based class, which is a great step in the right direction, but the teacher can further enhance student learning by providing a structure for recall of the key terms used in class through frequently administered low-stakes tests. Frequent tests engage students in active learning, and since students need to have an understanding of disciplinary terms in order to participate effectively in these discussions, testing can be used to enhance student understanding, which will in turn increase student authority.

Creating a successful dialogic classroom requires the teacher to take certain steps to facilitate collaboration. For example, the first day of class is extremely important: this is when first impressions are formed, so it is important to make these impressions positive. An excellent way to immediately foster the spirit of collaborative inquiry is by setting ground rules for discussion as a group. The instructor can solicit feedback from the group, write suggestions on the board, and then, through a discussion of the purpose and objectives of the class, the group as a whole can come to an agreement about which ground rules are most important. Some instructors even take the process a step further and involve students in syllabus formation. I know one composition instructor who includes her students in the decision-making process for how heavily assignments are weighted and what reading material the class should cover. Students’ involvement in the formulation of the rules should also provide motivation and incentive to abide by these rules. In these dialogic classrooms, then, students are not only active learners, but also active participants in creating classroom rules. For example, one method to improve the diversity of tests, increase student engagement with tests, and encourage students to thoughtfully consider the test questions would be to have students devise their own test questions as homework. Instructors can have each student write four or five multiple-choice questions as a homework assignment. Writing items will help students learn the material and get involved in the test-making process. The instructor could select among this set of questions and, if appropriate, present them to the class as a type of student-generated exam. In that way students would remain exposed to a variety of authorities, and the class would move closer towards student-centered practice while also providing frequent opportunities to retrieve course content.

In his book Teaching with Your Mouth Shut, Donald Finkel shares some of his teaching practices that create a dialogic classroom. Though he teaches literature, not composition, there are many qualities of his dialogic class that instructors can directly apply in their composition courses. Finkel’s literature class contains twenty-five students and a teacher who meet around a single round table. There is no clear locus of authority in the class (no lectern or podium), and the teacher sits at a different seat every day. A typical day begins with a student soliciting the discussion questions that the other students have brought to class and writing them on a whiteboard. The class then decides in what order they would like to discuss the different questions. Once the decision is reached, the facilitating student sits down and the discussion begins. The benefit of having a student solicit questions rather than the teacher is that it helps direct focus away from the teacher and onto the questions the class will be discussing. Like frequent tests, these strategies require students to be active, not passive, learners. Finkel argues that there are four additional advantages of this process of beginning the class by soliciting questions. The first is that it prevents the first student willing to talk from taking the reins for the rest of the class. This is a crucial step towards creating a space for individuals with different communicative styles, from shy students who only chime in after lingering pauses to boisterous students who do not tolerate the slightest conversational gap. The second advantage is that it allows students to hear a number of questions about texts before discussing any one of them. Some students’

questions will invariably be well-formed and others will not, which leads to the third

advantage: it emphasizes the importance of bringing thoughtful questions to class.

Hearing the multiple interpretations of the reading brought in by different students is a

good practice to develop divergent thinking—the ability to see multiple solutions to a problem. Hearing a single interpretation, on the other hand, given to the students by the authority of the teacher, may encourage the development of convergent thinking—

convergence on a single possible answer. Additionally, when students know that their

discussion questions will be evaluated by their peers, it provides motivation to write well-thought-out and well-formed questions to bring to class. The fourth advantage Finkel describes is that the ritual of writing questions on the board helps transition students from the hustle and bustle of wherever they were before class into an atmosphere of thoughtful inquiry. With all of this focus on the students, one may ask what the teacher’s role is in this dialogic class. If the class functions seamlessly without direction from the teacher, then is there no need for one? Finkel answers this question as follows: “‘Teaching with your mouth shut’ does not entail teacher passivity; it requires different kinds of activities from teachers” (17). Finkel further argues that “Good teaching is the creating of those circumstances that lead to significant learning in others” (8). To create those circumstances during class discussion, the teacher should serve more as a facilitator than as a director. A facilitator coaxes students towards learning and has an idea of how that

will happen, but is not overly attached to a single course of action so long as learning is taking place. A facilitator is open to “teachable moments” in class, and is willing to cede authority when it is appropriate. A director, on the other hand, is at the top of a clear hierarchy and seeks to impose his or her own particular order. A facilitator can benefit the class in several ways. If class discussion begins to die down or becomes unfocused, the teacher can recommend that the class discuss a different topic. If an insightful comment is ignored, the teacher can function as a spotlight to redirect attention towards it. Additionally, the teacher is involved in the conversation and can ask his or her own well-formed questions, which can serve as a model for students. Finally, at the end of the day, the instructor can summarize the results of the discussion and emphasize some key points that were made.

The dialogic classroom de-centers focus from the instructor and creates a student-centered class. Since group interaction is a focus in composition, students frequently interact with their peers without the teacher being directly involved. For these interactions to be successful, students must have some competency with the disciplinary content of composition. As teachers, we want to prepare our students for peer review and group discussion by teaching them the terms that allow them to critically evaluate and discuss writing. The dialogic classroom puts students “at the wheel,” so to speak, which requires disciplinary fluency from students. To facilitate the most productive conversations in composition, teachers want to ensure that students have a solid

understanding of course content, and are able to reproduce it in their own words. Students

may initially be exposed to terms of critical analysis through a mini-lecture, or through

readings, or class exercises, and ideally they will continue to use those terms during

group discussion and peer review. However, in addition to these teaching techniques, the

teacher can provide frequently-administered, low-stakes tests to help ensure students’ long-term retention of course content. As can be seen, then, the practice of teaching through frequent low-stakes tests accords with current theory and practice in composition. It is an active learning strategy that will enhance student knowledge and authority and support the student-centered classroom.

The Writing Process in a Dialogic Classroom

Lad Tobin, writing in A Guide to Composition Pedagogies, describes two modern

approaches to writing instruction and defines them as “process” and “post-process”

writing pedagogy. Process-based pedagogy urges instructors to devote much class time to

peer review of student works-in-progress. The writing process is also explicitly taught,

and the methods that successful writers use for invention, focus, and organization are

taught as models for students to follow. Much time is devoted to in-class writing

exercises to help students practice free-form writing on demand. The process-oriented

class is guided by an emphasis on pre-writing strategies, and revision after feedback.

Many instructors teach the process by requiring multiple drafts and conducting peer

review between successive drafts, which models the process as recursive. Tobin

distinguishes the post-process model of instruction, which differs in some key aspects.

Post-process teachers typically assign more reading and devote class time to group

discussion rather than peer review. Readings are analyzed in class to identify the effective

characteristics of each piece, and class time is also devoted to the teaching of rhetorical

conventions and genre analysis in a post-process model. Regardless of whether one

focuses on process or post-process pedagogies, we can, in addition to testing students on

grammatical rules and writing vocabulary, also use tests to ensure students understand the

different stages of the writing process and the kinds of rhetorical moves writers use in

different genres.

In addition to being dialogic, the modern composition classroom tends to include

a large amount of active writing time, and most composition instructors teach the writing

process as a recursive process. Whereas the linear model of (1) Outline, (2) Draft, (3)

Revise, (4) Proofread, (5) Publish, has historically been the prevalent model taught in

schools, it is today accepted that this linear process does not reflect the reality of writing

for most people. The linear model encourages a “once-and-done” method of writing.

Rather than this linear model, composition instructors model the writing process as recursive by requiring multiple drafts of a single paper. Requiring multiple drafts ensures

that students actually engage in the writing process rather than wait for “inspiration,” or

the night before the deadline.

Requiring student papers to be submitted in multiple drafts encourages

incremental improvement over time and helps students manage their time and prevent

procrastination. Ralph Keyes, author of The Writer’s Book of Hope (2003), advises us to

strive towards consistency in our writing practice: “Serious writers write, inspired or not.

Over time they discover that routine is a better friend to them than inspiration” (49). I argue that this concept can be extended to testing: just as writers benefit from schedule and routine, so too will students’ learning be enhanced through frequent tests. Over the course of a semester, as students learn more and work with feedback on their writing, their writing will gradually improve. The essay becomes a living thing that grows over time. The linear model of writing leads to the common reality of students writing their essays during all-night sessions the evening before they are due. If an instructor does not require further submission, then students take that to mean that no further revision is necessary.

Finkel's multi-centered or dialogic form of discussion is a natural outgrowth of

the peer-review workshop famously known as the Iowa Writers' Workshop. The

workshop is a common instructional approach in composition, and, like the dialogic

classroom, it relies on the teacher as a facilitator and on student-generated knowledge.

Classrooms structured around a community of equivalent peers benefit students' learning

because they allow for students to have conversations about writing. Like other

specializations, composition has specialized terms and knowledge, and facility with these

terms is critical if workshops are to be effective. By testing on these terms and the writing

process itself, we can ensure that our students get the greatest possible benefit from our workshops.

This external conversation also benefits our internal conversations and enhances

our cognition. Mastery of jargon allows meta-cognition, “the process of reflecting on and directing one’s own thinking” (National Research Council 78), a key skill which enhances learning. The language which students practice during peer review also allows them to engage in meta-cognition about their own writing. Peer review between successive drafts helps students improve their writing gradually. The practice of peer review acknowledges that fellow students, as well as teachers, can be sources of knowledge in a classroom. Students are only in composition classes for a semester or two, and presumably we want them to grow past needing the teacher; that will only happen if students are allowed to retrieve and reproduce class material by expressing it in their own words. Conducting peer review and maintaining a student-centered class should help produce better writers and encourage creative and critical thinking. Frequently administered tests, given alongside writing assignments over the course of the semester, will supplement the peer-review process by enhancing student learning and retention of new terms and writing techniques, making students more capable at peer review.

Building Writing Fluency with Journal Writing

Gerald Fleming and Meredith Pike-Baky together authored Rain, Steam, and

Speed, a book devoted to improving writing skills through journal writing. In the book

they define fluency as “the ease with which one communicates in each of the language

skills” (14). For an orator, fluency is the ability to deliver a good speech, whereas for a

reader, fluency is the ability to read steadily and understand what is read. Writing fluency

is described in the book as “practiced, prolific writing [that] keeps language and

perceptions flowing past the fidgets, self-distractions, and bogeys that the mind occasionally throws out when it doesn’t care to work” (14). Practicing sustained writing is an important step to improving written communication skills; it is something like endurance training with words. The training metaphor helps us envision writing as a developmental process. In the same way that an athlete does strength training to build muscles, it seems obvious that practice with sustained, focused writing is a necessary step to improve writing fluency. In addition to journals, students can be directed to do in-class writing through short-answer or essay questions on frequently administered tests. In this way, the same training metaphor applies to tests, which shows that testing dovetails with our current practice. In-class writing through prompts and journal writing is a valuable method of instruction—this is in line with Expressivist theory. Expressivist pedagogy maintains that all students have interesting things to say and only need the teacher to “get out of the

way,” so to speak, so they can write with their own authentic voice. For the expressivist, top-down rules serve as blocks to progress. An Expressivist might say that teachers tell students that they cannot write with every stroke of the red pen, and then teachers complain, “students don't want to write. How can we motivate them?” Lad Tobin, writing in A Guide to Composition Pedagogies, argues,

Children want to write. They want to write the first day they attend school. This is no accident. Before they went to school they marked up walls, pavements, newspapers with crayons, chalk, pens or pencils . . . anything that makes a mark. The child’s mark says, "I am." "No, you aren't," say most school approaches to the teaching of writing (Tate 19).

Expressivist pedagogy is about rekindling this passion seen in children and getting students excited about writing. Expressivists maintain that teaching writing is not about teaching new strategies or rules, but facilitating the unbound and unrestricted production of copious amounts of text. Peter Elbow, a leading figure of the Expressivist movement, writes in his book Writing Without Teachers,

I try for two things: (1) to help you actually generate words better--more freely, lucidly, and powerfully: not make judgments about words but generate them better; (2) to help you improve your ability to make your own judgment about which parts of your own writing to keep and which parts to throw away (vii-viii).

Tobin summarizes Elbow’s suggestions for how to accomplish these goals as follows:

“Elbow suggests that writers free-write (write non-stop without worrying about correctness, form, logic, etc.); play with words and ideas; form writing groups; and rely

less on doubting and more on believing, less on criticism, more on imagination” (Tate 3).

This “believing game,” as Elbow calls it, begins first with students believing in

themselves, and with teachers encouraging that belief.

Journal writing and timed writing exercises are common practices in modern

composition. The seminar-based class is a great place for students to develop critical

thinking skills and come up with innovative ideas to write about. In-class free-writes and

journal writing, in which students simply practice writing without any assessment, make

students comfortable with the process of writing. Frequent repetitive writing as in journal

writing can also be done in class with directed writing prompts and short or long answer

questions on quizzes. In a similar manner, frequently used tests can be seen as a way to formalize or supplement some of our current practices in composition, such as the use of journals or in-class writing.

The Dialogic Classroom and the Academic Discourse Community

The student-centered dialogic classroom requires that students have enough

knowledge to work together without the single authority of a professor. In order to

collaborate, students need a common vocabulary and basic understanding of course

content. This project argues that teaching through frequently-administered low-stakes tests can help ensure fluency with these concepts through active retrieval. In a similar manner, I contend that composition instructors need to teach the conventions of Standard

American English (SAE) and can use tests to do this.

The explicit instruction of SAE has been the subject of much debate in

composition. Some expressivist theorists argue that we should have students focus solely on generating content, without constraints and rules. I argue that we do students a disservice by neglecting to teach the rules and conventions of academia and SAE. In our own composition program at Humboldt State University, students need to submit a writing portfolio, which is assessed to determine whether they pass the course.

Adherence to style guidelines like MLA and to the conventions of SAE is a necessary component for passing the portfolio. Additionally, students will no doubt be expected to write fluently in SAE in the rest of the academic world and in the business world outside the composition course. Part of the goal of a first-year composition course therefore should be to integrate students into this aspect of the larger academic community as well.

Teachers should view fluency with SAE as a valuable ingredient of good writing, though not the only one.

For many students, mastery of these academic arts can be challenging. The authors of They Say, I Say, Gerald Graff and Cathy Birkenstein, consider academic writing as a process that can be learned in stages. The book serves to, as the title page describes, "demystify the moves that matter in academic writing." Each chapter discusses different rhetorical strategies that authors can use and offers templates that utilize those

strategies. Writing templates are a great way to translate your own internally persuasive

dialogue (developed through practice with journal writing or free-writes) into academic language. The templates in this book help students learn how to make logical transitions and make their sentences work together cohesively to convey a larger idea. Many first-year

college composition classes now use sentence templates as models. Students’

development as writers is assisted by these templates, which show them how to turn their own language into academic writing. They Say, I Say is only one popular example, but there are many composition textbooks containing sentence templates that could easily be adapted for use in tests.

Connors and Lunsford (1988) published a study of the historical frequency of errors made in freshman papers from colleges around the United States. They compared these results with a review of the most common errors from 1917 to 1988. The study found that the frequency of errors has remained consistent historically at an average of

2.2 errors per 100 words. The study reports that the most common contemporary errors

(at publication) include spelling, missing inflections, apostrophes and commas, and misused homophones. Though we can use this list of frequent errors, and the many others like it, as guidelines, good grammar instruction is effective only when it is individualized and adapted to context. For example, it is standard practice that if a teacher sees that a large portion of the class has difficulties with comma splices, then he or she runs a mini-lesson on sentence boundary rules. After the mini-lesson, students then workshop to search for

errors related to that rule during peer review. Later the teacher could administer tests with

questions on sentence boundaries and comma splices and with incorrect examples that

need correction. This additional use of tests could enhance student learning and dovetail

easily with current teaching practices.

Like spoken language, writing requires continuous adjustment to the customs,

constraints and expectations of different discourse communities. The language that is

effective in the classroom is not the same as the language that is used in the courtroom, the home, or the sports bar, and will not be effective in those contexts. As speakers, we intuitively switch between these various registers and modify our speech to meet the needs of the situation. In writing, these conventions and formal expectations are described as genres. Critical analysis of different genres requires a specialized language

(e.g. audience, purpose, use of logos, ethos, pathos, etc.). Internalization of that language is a key step for adequate performance within a given genre.

Michael Carter and colleagues, in a white paper published by North Carolina

State University, argue that students learn best from genre models when instruction includes explicit analysis of the features of a genre. They write that “students may learn these genres through repeated exposure and trial and error, but explicit instruction can help them negotiate a variety of genres much more quickly and effectively” (6). They conclude that “there is little evidence to suggest that students will notice relevant features and apply them to their own writing situations without such intervention” (9). To support the

successful transfer of skills between different genres, composition courses focus on

analyzing the conventions of these diverse discourses and replicating them in written

assignments. However, as Carter and his colleagues note, this may not be enough. Each

genre is made up of multiple components, some of which include audience and purpose.

Fluency in one genre does not necessarily translate to fluency in another genre. Therefore

it is important to teach students that there are different expectations for writing and

thinking across different genres and disciplinary and professional cultures. To aid a

student’s transfer of skills from one writing context to another, many composition

instructors have adopted a genre-based approach to writing instruction. If we want to prepare students as best we can to meet the diverse writing needs of the university in the short course of a college semester, then we should focus instruction on genres that tend to cross disciplinary boundaries—such as the research paper, the summary of assigned reading material, the professional letter, or the persuasive essay. Time spent on instruction of these types of writing will be well spent because these skills can easily be adapted for transfer across disciplines, and, of course, the conventions of these genres can be explicitly taught through lecture and student retention can be reinforced through frequent tests. Some degree of fluency with SAE is a basic expectation for published works, along with the ability to conform to research style guidelines (MLA, APA, etc.) and the conventions of different genres of academic writing (a formal research paper, an

annotated bibliography, a business letter, etc.). A student-centered classroom can help foster creative exchange, and frequent writing will help build familiarity with the writing process. However, these ingredients are not enough. To become capable of effective peer review and to meet the needs of the university, students also have to internalize a great deal of disciplinary knowledge. The research in the following chapters will show that our ability to learn and retain the disciplinary knowledge described relies in large

part on our active retrieval and usage of these terms. Learning and retention of the arts of

logic, rhetoric, and grammar, and an understanding of the various stages of the writing

process can all be facilitated through the administration of frequent, low-stakes tests, and,

as I’ve argued in this chapter, the use of such tests accords well with current composition

theory and practice and can easily be adopted in today’s composition classroom.

CHAPTER 3: HUMAN LEARNING

The Physiology of Learning and Memory

Any conversation about effective teaching must include a consideration of how

students learn. This chapter provides an interdisciplinary overview of human learning

with an up-to-date description of how we understand learning to occur from a neurological perspective, as well as from cognitive and social-constructivist approaches.

Knowing how the brain captures, retains, and retrieves information will help teachers to design assignments and activities that are instructionally effective. The focus of this chapter will be on those aspects of human learning that help us understand why retrieval practice (which tests facilitate) is essential for learning. Neurology studies the physical structure of the brain and nervous system, and cognitive neuroscience studies our thought processes. The brain is the center of learning and memory, so it seems obvious that educators, who are primarily concerned with learning and memory, would want to stay abreast of whatever discoveries have been made in this burgeoning field. Of course wanting to and having time to keep up with the deluge of new research that is constantly being generated are two very different things. The sheer volume of research available can be overwhelming, so this chapter will give an overview of some of this new research in an accessible format.

Neuronal communication is the basis for learning and memory. The brain is made up of billions of cells called neurons. Neurons can be split into two distinct parts: the cell body and the axon. In Figure 1, the cell body is on the far left, with multiple dendrites protruding from it. The axon is connected to the cell body at the axon hillock and ends at the axon terminals, which web out into a network of up to 10,000 other neurons.

The axon terminals connect to other neurons’ dendrites. Axon terminals send out neurotransmitters and the dendrites listen for them, so the dendrites could be considered the ears of the neuron and the axon terminals the mouth.

Figure 1: “Neuron”

Source: (Sapolsky 10)

In order to communicate, neurons expend a great deal of energy redistributing ions to maintain what is called a “resting potential,” which contrasts with the brief electrical spike called an “action potential.” The difference between these two states is what allows neurons to communicate. This is similar to binary logic: like a light switch that is either on or off, a neuron is either communicating or not communicating. When a neuron has something to say, so to speak, it goes into action potential, sending out an electrical signal to the axon terminals, which is then relayed to the neural network.

There are two methods that neurons use to trigger an action potential: temporal summation and spatial summation ("Synaptic Transmission in the Central Nervous

System"). Temporal summation occurs when the same input is triggered over and over.

Spatial summation occurs when numerous dendrites are stimulated at once. Either method can build up enough charge at the axon hillock to trigger an action potential. When signals are received by the dendrites, channels open and ions begin to move, causing a change in the electrical state of the neuron. When an action potential reaches the axon terminals, the neuron sends out a flood of chemical messengers called neurotransmitters. These neurotransmitters are picked up by the dendrites of the surrounding neurons and form what is called a synapse, a connection between the axon of the previous neuron and the dendrite of the receiving neuron. So long as a neuron is in action potential, it will continue to send out neurotransmitters and form synapses. If the original event that sparked the first action potential is strong enough, then a chain reaction is sustained.
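The two triggering routes can be illustrated with a toy threshold model. The sketch below is a deliberately simplified "leaky" accumulator, not a physiological simulation: the function name `fires`, the threshold of 1.0, the decay rate of 0.8, and the input size of 0.5 are all arbitrary assumptions chosen only to make the contrast visible.

```python
def fires(inputs_per_step, threshold=1.0, decay=0.8):
    """Return True if accumulated input ever reaches the threshold.

    inputs_per_step: a list of lists; each inner list holds the inputs
    arriving on different dendrites during one time step.
    threshold and decay are illustrative values, not measured quantities.
    """
    charge = 0.0
    for step_inputs in inputs_per_step:
        # Charge left over from earlier steps partly leaks away,
        # then the inputs arriving in this step are added together.
        charge = charge * decay + sum(step_inputs)
        if charge >= threshold:
            return True
    return False


# One weak input on its own stays below the threshold.
single = fires([[0.5]])

# Temporal summation: the same weak input arrives again and again.
temporal = fires([[0.5], [0.5], [0.5]])

# Spatial summation: several weak inputs arrive at once on different dendrites.
spatial = fires([[0.5, 0.5, 0.5]])
```

In this toy model a single weak input does not trigger firing, while either repeating the input over successive steps or delivering several inputs in the same step does.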

Figure 2: “Neuronal Communication”

Source: ("The Basics: Ion Channels Underlie Neuron Communication")

The neurons which sustain these chain reactions are referred to as a neural network. A single neuron can connect to 10,000 neurons through the axon terminals, and can receive transmissions from up to 10,000 neurons through the dendrites in the cell body. Thus, neural networks are capable of an enormous degree of complexity. Neural

networks could be thought of as the “screenshots” of a particular moment of cognition or perception (Sapolsky 10). Patterns of neural activity are thought to correspond to particular mental states or mental representations. Under this model, learning can broadly be defined as changes in connectivity, either via changes in potentiation at the synapse or via the strengthening or pruning of connections in a neural network. Neural networks are constantly assembled and disassembled in our brains as we learn and forget. Neural networks can be made long-lasting through repeated stimulation. Every time a neuron has an action potential, it causes a physical change in the neuron, making it more excitable in a given network, meaning that less excitation is required to induce a later action potential. Repeated stimulation results in a neuron becoming hypersensitive, so that a weaker stimulation in the future will activate the associated neural network. When less excitation is required, it is easier to recall something. The state of the synapse becoming hyper-responsive or potentiated for long periods is called Long-Term

Potentiation (LTP). Increasing the strength of synaptic communication through LTP is the physiological basis for learning and memory as we understand it today (Sapolsky 15). Unless they undergo the physical process of LTP, neural networks will gradually dissipate. However, networks that do undergo LTP decay very little over time, and, with sufficient cues, can be retrieved many years later (“Remembering and Forgetting”).

Forgetting occurs in long-term memory when the formerly strengthened synaptic connections among the neurons in a neural network become weakened, or when the activation of a new network is superimposed over an older one, thus causing interference in the older memory (ibid).

Earlier I described two ways to initiate action potential: spatial and temporal summation. You’ll remember that action potential is a necessary step towards LTP, which is the process that makes memories stick. Considering these two forms of summation from a teacher’s perspective, spatial summation may be caused by the student’s interest, motivation, and comprehension. We can influence these variables to some degree, but they are always dependent on the student. Activating temporal summation, on the other hand, is entirely within a teacher’s control through the administration of tests. If tests are given frequently to students so that temporal summation is regularly stimulated through repetition, then LTP can be achieved. This framework from neuroscience helps us understand that test-enhanced learning can physically change the structure of the neural networks associated with course material, making it more likely that students will permanently retain course information. Test-enhanced learning can be seen as a way of providing retrieval opportunities for students in the classroom. Therefore, frequently-administered low-stakes tests which are spaced out over time can be a beneficial teaching practice that will help students retain course information.

Cognitive Definitions of Learning and Memory

Writing in the textbook Learning and Memory: An Integrated Approach, John Anderson explains that “Learning refers to the process of adaptation of behavior to experience, and memory refers to the permanent records that underlie this adaptation” (6). Learning is a process, and memory is a record. Memory is divided into three types: sensory, short-term, and long-term. A dominant paradigm for describing these different types of memory is called “stage theory” and comes to us from Atkinson and Shiffrin (1968). The relationship between these three stages is shown in the flow chart below, which begins with an external stimulus activating sensory memory. There it is either forgotten, or it goes through initial processing and enters short-term memory (STM). With repetition it will stay in short-term memory, and with elaboration and coding it will enter long-term memory (LTM); otherwise it will be forgotten. After a short duration that memory will leave STM and can only be brought back from LTM with retrieval. The image also shows that our response to a situation can only be aided by information in our short-term memory; it cannot jump immediately from LTM into current use.


Figure 3: “Stage Theory of Memory”

Source: (“Three Stages of Memory”)

The following analogy will help illustrate these three types of memory. Imagine three types of writing: one is drawn directly on the surface of water, the next is written on wet sand at the beach, and the third is chiseled into the stone of a mountain. When we draw directly on water, the surface will immediately change and erase whatever we drew. The message will disappear in a moment. This is similar to sensory memory. We take in vast amounts of information every moment through our senses, and a great deal of it we don’t pay attention to after the moment passes. Sensory memory is soon forgotten, so these “recordings” have no retrieval strength and no storage strength. Short-term memory is like drawing in the wet sand at the beach. The record is clear and accessible, but it will fade over time when the tide washes it away: it has high retrieval strength, but low


storage strength. Finally, long-term memories that have undergone significant and retrieval are deeply etched in our mind, like words chiseled in stone.

Bjork and Bjork (1992) distinguish two qualities of long-term memory which together determine the likelihood of successful retrieval (e.g. answering a test question or remembering relevant information when cued): (1) retrieval strength and (2) storage strength. Retrieval strength is the accessibility of a memory at a given moment. Storage strength is how deeply a memory is embedded in the mind. Imagine you just learned a stranger’s name. If you clearly heard it and repeated it, then at that moment it is very fresh in your mind, so it has high retrieval strength. However, if you do not meet that person again or have a chance to retrieve their name again, then it is likely that you will not remember their name because the storage strength is weak. On the other hand, consider the name of a close relative who died long ago. As time passes, it is likely you will not think about them as frequently, and so their name will have low retrieval strength. However, their name is deeply embedded and you are unlikely to forget it even though you have not used it recently because it has high storage strength.

As described earlier with the physiological description of memory, the storage strength of memories is improved through repeated retrieval actions, which bring about LTP. The retrieval strength of memories is contingent on the retrieval cues that bring that knowledge to the forefront (such as being asked the question on a test, or a sensory experience that triggers remembering). A third quality we have not yet considered is the encoding strength of memories, which is best understood through schema theory.

The term “schema” was first used by Piaget in 1926. R. C. Anderson, a respected educational psychologist, expanded the meaning and developed schema theory. This learning theory views organized knowledge as an elaborate network of abstract mental structures which represent one’s understanding of the world. Schemata are prior knowledge linkages, and they influence the amount and proficiency of our learning.

Schemata can be added to, and, as an individual gains experience, schemata develop to include more variables and more specificity. Each schema is embedded in other schemata and itself contains subschemata. Schemata change moment by moment as information is received. They may also be reorganized when incoming data reveal a need to restructure the concept. Schema theory shows us that the encoding strength of memories should be increased when meaningful connections are made between various schemata, and when information is retrieved in new situations and transferred to new circumstances.

How students organize knowledge influences how they learn and apply what they know. As humans, when we are paying attention, we naturally make connections between new knowledge and existing schemata in our minds. When those connections form knowledge structures that are accurately and meaningfully organized, we are better able to retrieve and apply that knowledge effectively and efficiently. In contrast, when knowledge is connected in inaccurate or random ways, we can fail to retrieve or apply it appropriately. In this way, our prior knowledge can help or hinder learning (Ambrose et al. 4). Teachers work to help students create meaningful connections and connect new knowledge with prior learning; teachers also strive to ensure that students create accurate and efficient schemata, interconnecting disciplinary content and writing practice. Teachers can enhance these practices and can help structure student learning through tests.

Writing in How Learning Works: Seven Research-Based Principles for Smart Teaching, Ambrose et al. define learning as a process that leads to change as a result of experience and that increases the potential for improved performance and future learning (3). Learning is the result of how students interpret and respond to their experiences, and therefore learning can bring about changes in knowledge, behaviors, beliefs, and attitudes. Herbert Simon, Nobel Laureate and one of the founders of the field of cognitive science, argues that “Learning results from what the student does and thinks and only from what the student does and thinks. The teacher can advance learning only by influencing what the student does to learn” (qtd. in Ambrose et al. 3). Learning is a process, not a product, and because it happens within each student, instructors can only infer that learning has taken place from students’ products or performance. Learning is not something that an instructor can do to students; rather, it is a process that students themselves do.

David Ausubel (1968) coined the term “meaningful learning.” In Ausubel’s view, to learn meaningfully, students must relate new knowledge (concepts and propositions) to what they already know. Under this model new knowledge must be internalized in relation to what is already understood. Undesirable learning, on the other hand, consists


of repetition of an item without full understanding of its meaning or how it connects to other knowledge. According to Ausubel, when meaningful learning occurs, disparate facts are understood in relation to each other and therefore recollection of any single fact will prime the mind for recollection of the related facts. This is similar to schema theory described earlier. In practical terms in the classroom, meaningful learning occurs when learners construct their knowledge in their own words. It requires that teachers give students the opportunity to engage in personally meaningful written and verbal expression. Simple recollection without real comprehension, also referred to as “parroting,” is an example of undesirable learning. For example, with enough cramming any person could memorize the questions and answers to a test administered in a language he or she does not understand. This type of mimicry does not indicate legitimate understanding and internalization, because if the questions were re-phrased, or given in a different order, or the parameters of the test changed in any other way, then regardless of the amount of cramming one has done, he or she wouldn’t be able to complete the test. This is because that person never actually understood the questions or answers—rather he or she had just learned to provide a particular response to a specific stimulus. True learning, on the other hand, is demonstrated by the usage of learned material in new and meaningful contexts.

This model of meaningful learning is helpful for our purposes because it provides a framework for how instructors should model their teaching. It is easy to think of tests as simple things that encourage rote memorization out of context; however, with proper care, teachers can make tests that are meaningful and relevant to course content and present them to students in ways that encourage meaningful learning. Concerns that testing merely requires parroting can be addressed by designing tests that require more depth than just recognition, like short answer or essay responses. Test-enhanced learning will ensure students have a grasp of disciplinary content; this will set the foundation for meaningful learning, as students will be able to competently use course content during group engagement and during the writing process.

Three Theories of Learning

Three theories of learning will further help us understand the learning process and how testing can be a beneficial practice: the encoding-specificity principle, transfer-appropriate processing, and desirable difficulty. The encoding-specificity principle holds that a retrieval cue (i.e. an external stimulus that induces retrieval) will be effective if it overlaps with features in the original memory trace. The encoding-specificity principle theorizes that the best memory performance is generally found when the processes engaged and cues given at retrieval are similar to those engaged in during encoding (see Fisher & Craik 1977 and Moscovitch & Craik 1976). The following familiar story illustrates the principle: you may be unable to remember the name of a neighbor’s dog


until the moment when you are watching a television program about show-dogs, which reminds you that your neighbor’s dog is a show-dog named Tess. In other words, we cannot always predict what type of external stimulus may spark retrieval in the future. As another example, it may be more difficult to remember a classmate or a co-worker’s face if you see them in an unfamiliar context like walking on the street or at a park. However, once you meet this person a few times in different settings, it is more likely you will remember them. Where and how information is encoded affects our ability to retrieve it. The implication of the encoding-specificity principle is that encoding variability

(i.e. encoding under numerous circumstances) should produce better retention because it increases the number of potential retrieval routes, thereby increasing the probability of a match with whatever cue is presented at retrieval. According to the theory, encoding variability of any sort should also increase the probability of successful retrieval. As an example, I know a professor who brings citrus to class on test days for the aroma to help put students at ease. The aroma should also increase encoding variability for the students and hypothetically create a connection between the smell of citrus and the content of that class. It could be years later for one of her students that the smell of citrus triggers memories of that exam and the information on it. For the purposes of analogy, let us consider the mind as a labyrinth, with successful retrieval representing successfully navigating the labyrinth from beginning to end. Every memory trace enters and leaves the mind through a vast network, like a labyrinth. Imagine that as it does so it leaves a mark


behind on the walls, and the mark gets clearer every time the same route is followed; eventually a clear path is laid through the labyrinth. This “mark” was described physiologically as long-term potentiation (LTP). To follow the analogy, you could be dropped at any point in the vast labyrinth, and by establishing multiple routes that all converge on the same memory, or connected schema, the likelihood of successful retrieval will be increased. This is in line with the practice of frequently administering low-stakes tests. Repeated tests which use different questions and test on different parts of the question stem should promote encoding variability, because every time students take a test there will be differences in mood, activated schemata, and a variety of other external factors. This variety of encoding circumstances, when compounded by the variation in test questions, will produce multiple retrieval routes, which should improve memory performance.

Transfer-appropriate processing theorizes that successful retrieval is dependent on the overlap between the cognition engaged in during encoding and the cognition engaged in during retrieval (see Kolers & Roediger 1984; Morris, Bransford, & Franks 1977; Roediger 1990; Roediger et al. 2002). In terms of students learning on tests, Roediger and Karpicke (2006) argue that retrieval practice through active testing is more effective than passive review because the cognition engaged in during testing more closely matches the necessary cognition for later retrieval than passive learning does. For example, if someone wanted to learn how to swim, they could conceivably do so without ever getting


in the water by suspending the body with ropes and practicing the arm strokes and foot motions. This practice and the act of swimming are transfer-appropriate to some degree; however, the training would be more effective if it took place while actually swimming.

In the same way, listening to a lecture on writing effective transitions may prepare some students to apply that material to their own writing, but the best way to train for long-term retention and personal reproduction of course material is through active reproduction on tests. Performance on a test requires cognition similar to that required for remembering the same information in a different context, such as in conversation or while writing. Because the two actions are transfer-appropriate, testing can be seen as a more effective training method for later retrieval than learning through lecture. In the case of composition instruction, the material that students learn through the course will be more readily understood and internalized if there is a strongly embedded schema of related terms and concepts. Frequently administered tests provide retrieval opportunities which increase storage strength and also improve encoding strength. When students retrieve information as the course progresses, it allows them to relate old information to the new schemata that they are developing through internalization of the new course content.

One theory of learning holds that more challenging retrieval produces greater benefits for long-term retention; Bjork (1992) refers to this principle as “desirable difficulty” (see Bjork 1999; Karpicke & Roediger 2007; McDaniel, et al. 2007; Roediger


& Karpicke 2006). This theory holds that when retrieval strength is high and information is easily accessible, the retrieval of that information produces small gains in storage strength. In contrast, more difficult retrieval actions, such as remembering in a different environment, or with fewer cues, or after a long period of time, all produce greater increments in storage strength. To follow the analogy used earlier with the three types of writing, imagine that time passes in the mountains of our minds and the stones we chiseled words into are covered by moss and leaves. The words are still there etched in stone, but they are difficult to access: they have high storage strength but low retrieval strength. Desirable difficulty theorizes that the greatest gains in storage strength will be made when retrieval strength is low. By analogy, that means that every time you clear off the moss and leaves and retrieve the words, they also become more deeply etched in the stone beneath.

The theory of desirable difficulty compels teachers to form tests which require active reproduction from students. The predictions of desirable difficulty are confirmed by differing rates of long-term retention as a result of taking recognition or recall tests. In a recognition test like multiple choice, the right answer is presented among others, which provides a strong cue for recollection and greatly increases the likelihood of giving a correct answer. Alternatively, short-answer questions require the taker to provide the answer in their own words, which requires more effort. This means it typically takes longer to answer a short-answer question, and it means that the taker has to produce the


answer without having the cue for recollection. Although both tests benefit memory, the more effortful recall produces better long-term retention. Many studies show that recall tests promote better long-term retention than recognition tests (see Jacoby 1978; Butler & Roediger 2007; Glover 1989; Kang, McDermott, & Roediger 2007; McDaniel, Anderson, Derbish, & Morrisette 2007). Further support for desirable difficulty comes from Agarwal et al. (2008), who studied student learning comparing open-book and closed-book tests. Closed-book tests require more difficult, challenging processing than restudying a passage, yet difficult processing benefits long-term retention according to the theory of desirable difficulty. The study found that open-book tests increase retrieval strength, as evidenced by high initial performance, but produce small increments in storage strength. In contrast, the more difficult closed-book tests produced greater long-term retention. Conditions that require more difficult and challenging processing may slow initial learning but ultimately enhance long-term retention relative to less-challenging learning conditions that produce rapid initial learning but poor retention.

These three theories of learning help us understand the testing effect and converge on the notion that we should administer frequent low-stakes tests. With regard to composition instruction, we can achieve desirable difficulty through administering short-answer questions and essay questions. Short-answer questions are well suited to ensuring that students are retaining key course concepts that will be useful in peer collaboration, class discussion, and the writing process.


Social Constructivist Theories of Learning and Testing

One way to make learning more meaningful and memorable is with the use of social constructivist theories of learning. This section will explain learning from a socio-cultural perspective and further emphasize the importance of collaborative conversation in the classroom, and tests will be discussed as a tool for helping students join that conversation and the academic discourse community. While behaviorists theorize learning as a series of stimulus-response pairs, and cognitivists theorize behavior as a complex formula resulting from each individual’s cognition, social constructivists expand upon these models by focusing on interaction between groups rather than just focusing on individual behavior. This theory holds that language is at its core a social act, and in order to communicate with various “discourse communities,” we adopt jargon and communicative patterns that are appropriate to the community we are a part of (Lee 2). In order to be a member of any discourse community (be it academic, professional, or personal), the individual must learn the conventions and jargon of that particular community; thus any instructional tool, like tests, that can help students join the academic discourse community will be of benefit.

Social constructivism is distinguished by the belief that language and the mind are inseparable, because any individual needs language in order to think and encode their long-term memories. Language, in Vygotskian terms, is a psychological tool that humans utilize uniquely among all other animals. Further corroborating the key position of


language to development, linguist Michael Halliday has argued that “language is the essential condition of knowing, the process by which experience becomes knowledge” (57). It is language, and other psychological tools (such as mathematical symbols, the alphabet, and scientific diagrams), that allow humans to perform the unique activities that we do, from building rockets to writing sonnets.

In his work The Dialogic Imagination (English translation 1986), Mikhail Bakhtin describes a socio-cultural model in which the individual and society interact to influence personal development. Every individual exists in a society with a history of complex interaction between language and power. From our youth we are influenced by social discourses, and by our parents and role models, for example, whose views we interact with and selectively internalize. Through our interaction with society, Bakhtin maintains that “not only are the meanings of words and expressions ‘borrowed’ from the speech of others, but each utterance is a link in a very complexly organized chain of other utterances” (337). Within this “chain of utterances,” every individual is involved in a dynamic process of self-discovery and creation of identity in relation to their larger discourse community. Bakhtin describes this social enculturation as “ideological becoming,”—“the process of selectively assimilating the words of others” (341). This process of ideological becoming involves redefining the words of others into our own

“internally persuasive discourse.” In Bakhtin’s words: “internally persuasive discourse, as opposed to one that is externally authoritarian is…tightly interwoven with ‘one’s own


word’” (346). Bakhtin explains how these dialogic interrelations preexist and shape each individual utterance: "The living utterance, having taken meaning and shape at a particular historical moment in a socially specific environment, cannot fail to brush up against thousands of living dialogic threads, woven by socio-ideological consciousness around the given object of an utterance; it cannot fail to become an active participant in social dialogue" (276). Through our interaction with the external social world, we mediate outside influences through language into ideologies that are personally meaningful.

One model that has been useful for describing collaborative learning is Vygotsky’s Zone of Proximal Development, or ZPD. While studying children’s learning, Vygotsky found that each child had an achievement potential that they could realize unassisted, and one that was higher if they were aided by someone more knowledgeable (Vygotsky 84). He calls the space in which this learning happens the Zone of Proximal Development. Wells, writing in Vygotskian Perspectives, describes the ZPD as the “use of language between novices and more expert others as a tool for mediating misconceptions and consolidating understandings” (5). He argues that rather than only viewing the ZPD as existing between a single expert and a less knowledgeable peer, we should also consider the ZPD as a collaborative model between a group of peers. According to Wells, effective learning is not unidirectional (as is assumed by a lecture model), but rather understanding is both mutually constructed and reciprocal. By


grouping students with their peers, the ZPD is changed from a unidirectional exchange into a multi-dimensional one, where every student has something to contribute, and learning happens as a group. Through collaboration, concepts that were only vaguely understood before can coalesce into coherent thoughts; individuals can work to an agreement on core meaning, and develop their own informed opinions through critical engagement with class material.

In composition and creative writing, if we follow the idea of ideological becoming, then students, in part, form themselves and their intellectual development through what they write, and through discussing their writing. This process of ideological becoming through learning and writing is one of the very reasons that we value higher education. As educators we seek to guide this process of subject formation and self-expression by teaching the disciplinary content that we argue will enhance writing abilities. Based on social constructivist theory, it stands to reason that our students will be more capable peers (a term coined by Vygotsky) if they know the vocabulary and conventions of the academic discourse community. Because composition teachers don’t use lectures as a primary means of instruction, but instead use student-directed methods and group work, students need to have done the reading and internalized the course material so they have the vocabulary and knowledge to engage as capable peers. One way we can ensure students have this level of understanding is by testing for content. This will lay the foundation for students to engage in group work without the teacher.


The practice of administering frequent, low-stakes tests as a method of teaching is well in line with the theoretical stance of social constructivism. With the rise of social constructivist theory as a guiding paradigm in composition, there should also come a recognition that tests can be used to help students gain fluency in the academic discourse community. Bringing students into the field of academic discourse requires that they learn certain disciplinary knowledge. In order to participate effectively in this discourse community, students must internalize genre-specific conventions, communication styles such as Standard American English, and research guidelines such as MLA or APA.

Testing can help students to do this effectively. This is especially true for first-year composition students who tend to be new to the discourse community of academia. The conventions of academia will gradually be internalized by most students, but their integration can be assisted via testing.

Some people view testing as a prescriptive method of instruction that stifles creativity. They might argue that testing encourages convergent thinking, impressing on the student that there is only one right answer, which comes from an authority. Similarly, many would question the purpose of memorizing information in our digital age when answers are a mere Google search away. What is the significance of remembering? Some would argue that we don’t need to memorize details; after all, that’s what computers are for. They might argue that what really matters is how things fit together. However, Robert Bjork, a prominent researcher in the field of memory, asks “the


people who criticize memorization—how happy would they be to spell out every letter of every word they read?” (qtd in Wolf’s “Want to Remember Everything You’ll Ever

Learn?”). It is an inescapable fact that to participate in new fields we must learn new things. For example, children can only learn to read whole words through dedicated practice. Every time we enter a new field we have to go through the same process—we become children again. Every field has its own language and conventions that must be internalized. The process of learning requires repetition and verbal encoding of new concepts into one’s own words, and testing can help with this process.

But let’s return to the question of creativity. Creativity is the ability to view the world through a variety of models (or paradigms) and since testing can increase the amount of memorized information we have to work with, it can enhance rather than detract from the number of ways we can apply what we know. Creativity is not the opposite of memorization; it is the useful application of memorized information. The human brain is a marvel of associative processing, but in order to make associations, data must be loaded into memory. We need to internalize disciplinary information through encoding and repetition in order to utilize it creatively. The goal of the composition classroom should be to facilitate both learning and memorization of content, and the application of that knowledge towards creative goals. However, my research leads me to believe that currently there is a lack of emphasis on memorization in composition instruction.


In both Bakhtinian and Vygotskian theory, language is at the heart of development, and the social world is the arena where language is exercised and developed. Between the individual and the group, new ground can be broken. When engaged in collaboration, the exchange of multiple interlocutors improves understanding and retention of class material through retrieval and repetition. Additionally, dynamic collaborative dialogue can lead to the co-creation of new meaning—a creation that would not have been possible individually. John-Steiner and Meehan put it this way:

“knowledge therefore is both re-constructed and co-constructed in the course of dialogic interaction… [members in a dialogue] actively restructure their knowledge both with each other and within themselves” (35). The separation of memorization and creativity into distinct categories represents a false dichotomy. Creativity is the application of what you know to new effect. Because tests enhance knowledge, they can also be used to enhance creativity.

This chapter has given an overview of human learning from the theoretical perspectives of brain physiology, cognitivism, and social constructivism. We have seen that we can increase encoding strength of long-term memories through meaningful learning and by connecting the things we learn with our pre-existing schemata. Also important is the role of collaboration and conversation. Though some may argue that creativity springs from an independent emergence in the mind of a genius, I have argued that creativity is enhanced through internalization of disciplinary content, and emerges


from collaborative dialogue between groups of capable peers. Teachers can help ensure that students have internalized the course material necessary to engage as a group of capable peers through giving frequently administered low-stakes tests. An exclusive emphasis on testing cannot fulfill all of the needs of composition. I argue that tests are an essential but insufficient method of instruction. Our current practice in composition instruction, which includes a variety of group activities and collaborative discussions, is also a necessary but insufficient method of instruction. By combining the two methods we can both ensure that students are adequately internalizing course content and creatively using it to new effect.


CHAPTER 4: THE TESTING EFFECT AND THE SPACING EFFECT

What is the Testing Effect?

Although many people associate tests with the assessment or measuring of knowledge rather than with learning, research shows that a test can serve a far greater purpose than mere assessment. The studies discussed in this chapter show that tests can

also enhance learning and improve long-term retention, and they show that the act of retrieving information from memory on tests increases the probability of successful retrieval in the future. This phenomenon of enhanced learning as a result of testing has come to be known as “the testing effect.” H. L. Roediger writes that

The testing effect represents a conundrum, a small version of the Heisenberg uncertainty principle in psychology: Just as measuring the position of an electron changes that position, so the act of retrieving information from memory changes the representation underlying retrieval, and enhances later retention of the tested information (“The Power of Testing Memory” 182).

This is in line with the physiological description of memory through neural networks given earlier. Every act of retrieval brings about a physical and structural change to the network, which results in long-term potentiation (LTP), making future retrieval actions easier.

The testing effect has considerable implications for composition instruction and the field of education at large. As discussed in Chapter 2, our current practice in

62

composition instruction uses collaborative, student-directed methods of learning.

However, in order for students to effectively collaborate, they need to have adequately

internalized the disciplinary content of composition, including the conventions of

Standard Academic English, research conventions like those put forth by the MLA, and a

wide variety of terms and definitions (“thesis”, “topic sentence”, “analysis”, etc.). What the testing effect and the spacing effect show us is that, without retrieval opportunities spaced throughout the semester, it is unlikely that students will retain this content.

Therefore, this project argues for the use of frequent, spaced, low-stakes tests to help students internalize the knowledge of composition. As argued in Chapter 2, this will make students more capable of group work and gradually improve their writing.

A brief review of this phenomenon and the contexts in which it has been found is given by Mark McDaniel and colleagues in the article “Testing the Testing Effect in the Classroom”:

Testing effects are observed with word lists (Hogan & Kintsch, 1971; McDaniel & Masson, 1985), paired associate lists (Allen, Mahler, & Estes, 1969; Carrier & Pashler, 1992), pictures (Wheeler & Roediger, 1992), and prose material (Glover, 1989; Roediger & Karpicke, 2006b). Testing effects surface when the intervening tests are different from the final tests: intervening recall tests improve subsequent recognition (Glover, 1989; Lockhart, 1975; Wenger, Thompson & Bartling, 1980) and intervening recognition tests improve subsequent recall (Runquist, 1983). Taking a test is almost always a more potent learning device than additional study of the target material (see Carrier & Pashler, 1992, for recent experimental tests, and Roediger & Karpicke, 2006a, for a review). (495)

Recent studies have examined the testing effect in middle school (McDaniel 2007, 2011) and college (Butler 2012) contexts. As we will see in this chapter, the testing effect is consistently found in diverse studies. The literature reviewed in this section shows that testing reduces forgetting, especially if administered shortly after learning, and multiple tests produce a greater effect in slowing forgetting than a single test. The studies reviewed in this chapter also show that taking a test has a greater positive effect on future retention than spending an equivalent amount of time restudying the material.

Many of the studies reviewed below test learning and retention of paired associates; these are A-B connections, and when presented with the cue of A or B, the test taker must recall its associate. Paired associates can represent diverse information, such as names to go with faces, a phone number for a friend, translations of words from L1 to L2, or the fact that 8 x 9 = 72. The difficulty of paired associates is in part dictated by how logically they associate. For example, “chair-table” is easier to remember than “chair-donkey,” which is in turn easier than “VFU-734.” At its core, this form of memorization, or paired-associate learning, is identical to the memorization of the key vocabulary terms and definitions that might be required to successfully function and collaborate in a composition class. Any method that increases long-term retention of paired associates would be a beneficial instructional technique in a composition class.

While these studies are not measuring skill formation (i.e. development of writing ability over the course of a semester), they are examining the ability to recall key memorized information similar to the key concepts and terms taught in composition.

Studies Reporting a Testing Effect

The first large-scale study of the testing effect was conducted by Arthur Gates and published in 1917. This study compared the effectiveness of active recall (what Gates called “reciting”) with passive review, and found that the active recall required by testing improved retention more than additional study did. Gates tested children in grades 1, 3, 4, 5, 6, and 8, using two types of materials: nonsense syllables and facts taken from prose passages in the book Who’s Who in America. The nonsense syllables were simply three-letter groupings that do not form a word in English, such as DAK, YRK, or CTR. The children studied the materials in two phases, first reading to themselves, then looking away from the materials and recalling (reciting) whatever answers they could, with researchers recording the students’ performance during free recall. Researchers instructed students to read or recite for different amounts of time, and different groups of children, separated by age level, spent 20, 40, 60, 80, or 90% of the time self-testing. At the end of the period Gates administered a test to the children on the material they had learned, and after a delay of 3 to 4 hours he retested them.

Presumably because of their early level of cognitive development, first graders, children six to seven years old, were not able to perform very well in the study, and they were not tested on the prose passage because of poor reading abilities. Their performance alongside the other students’ on nonsense syllables can be seen in Figure 4, which shows the proportion of test items recalled on the X-axis and the proportion of time spent reciting on the Y-axis. The top two graphs show performance on immediate tests of nonsense syllables (left) and biographical facts (right). The bottom two graphs show performance on delayed tests, with the same materials in the same positions as the top two. The increase in performance can be seen more clearly on the delayed tests than on the immediate tests. With the prose passages, the optimal amount of recitation seemed to be about 60% of the total learning period, with the rest spent re-reading; beyond that point the effect leveled off and test scores began to drop as the ratio of recitation to re-reading rose. The benefits of recitation do not level off for nonsense syllables, because they are nonsensical and re-reading would not help encode the information. These data suggest that a balance between studying and testing is best.

Figure 4: “Performance on Immediate and Delayed Tests”

Source: (“The Power of Testing Memory” 184)

A second landmark study showing positive effects of testing was carried out by H. F. Spitzer (1939). Spitzer's study demonstrated not only that testing improved retention, but also that a shorter delay between initial learning and testing is of greater benefit than a longer one. Spitzer and colleagues conducted a large-scale experiment involving the entire population of sixth-grade students in 91 elementary schools in nine Iowa cities, for a total of 3,605 subjects. Students studied one of two 600-word articles containing information about either peanuts or bamboo. The students were then split into eight groups and given a 25-item multiple-choice test on the material over the course of the next 63 days, with each group tested at a different retention interval.

Spitzer also manipulated the number of tests taken by different groups, and the delay between studying and testing. After studying the passage, each of the eight groups of subjects was given one, two, or three tests on various schedules across the next 63 days. Some students took a single test 63 days after initial exposure to the material, while others took earlier tests. Group 1 and Group 2 took an immediate test. All other groups took their first test after a delay of days or weeks. For example, Group 6 did not take an initial test until day 21. Figure 5 shows the proportion correct on multiple-choice tests taken at various delays. The solid lines show results for repeated tests for particular groups. The dashed line (a visual aid connecting each group by the day of their initial test) shows that the longer the first test was delayed, the worse the students performed on that test. In all cases, giving a test at some point either slowed or stopped forgetting. Groups 4, 5, and 6 all showed an increase in proportion correct after their first test. The students who took a test sooner after learning the material demonstrated much greater recollection than the students in those groups who took their first test after a longer interval. By day 21, forgetting had already reached its peak, and Groups 6, 7, and 8 all show similar performance to each other.

Figure 5: “Testing Schedule Shows a Forgetting Curve”

Source: “The Power of Testing Memory” 185

This figure shows that the longer the interval between initial exposure and the first test, the worse the subjects’ performance was. Group 2, which had the best recollection after a 63-day delay, was tested immediately after study and then tested again on the same material one week later. This study reveals the importance of spacing tests out in order to improve retention of material. Together, the landmark studies of Gates and Spitzer seem to suggest that we administer a series of low-stakes tests throughout the semester, thereby requiring students to actively recall key information a number of times with increasingly spaced intervals between subsequent tests. This notion of an expanded test schedule will be described in more detail in the next section, “The Spacing Effect.”

While the Gates and Spitzer studies provide support for the testing and spacing effects, they were performed with elementary school students, a different demographic than we find in composition classes at the university level. Other studies, however, have worked with undergraduates. Tulving, for example, examined the recall ability of undergraduates at the University of Toronto. Tulving (1967) had subjects, three groups of 18 students, learn a list of 36 nouns presented in a random order on each study trial. The purpose of this study was to compare retention between groups with various studying and testing schedules. If S stands for a study trial and T stands for a test trial, the three groups were compared in the following manner: Group 1 went through a process of STST, Group 2 followed a process of STTT, and Group 3 did SSST. During a study trial, subjects looked at the word list and tried to memorize it, and during a test trial subjects verbally free-recalled as many items as they could in any order, which the experimenter recorded (see Figure 6).

Figure 6: “Study-Test-Study-Test (STST) Most Effective Learning Strategy”

Source: “The Power of Testing Memory” 185

Tulving showed that testing and studying can produce the same amount of learning; however, the subjects were tested at an immediate retention interval (directly after studying). Later research shows that if long-term retention is measured after a delay, repeated testing actually produces better recall than repeated studying. For example, Karpicke and Roediger (2006) replicated Tulving’s basic result that learning curves for the three conditions are similar. However, unlike Tulving, they repeated the test after a 1-week delay. Their results can be seen on the right side of Figure 6. In the 2006 study, subjects returned one week later and were given 10 minutes to recall as many words as they could. Their performance was recorded at the end of each minute. The comparison between the three conditions reveals a benefit for the STST learning condition when long-term retention is the variable studied. Despite the fact that the subjects who studied repeatedly had studied the words 15 times a week earlier and those who were tested repeatedly had studied them only 5 times, recall was greater for the STTT condition than for the SSST condition. These results show that, in this study at least, a balanced mixture of studying and testing is the best method to ensure long-term retention. Again, we find a study supporting the notion of incorporating frequent tests into the semester. As instructors we can reproduce the STST condition for our students by having a number of tests in the class spaced throughout the semester so students are engaged in the process of repetitive studying and testing.

Roediger and Karpicke (2006b) are not the only researchers to show a clear benefit for testing over studying as the retention interval increases. Thompson, Wenger, and Bartling (1978) further confirm these findings. However, they demonstrate that selectively studying only the material missed on a previous test is more efficient than restudying all of the material in general. This study also used 40-word lists, but with different learning conditions: a four-study trial (SSSS), a three-test trial (STTT), and a condition which personalized the study schedule for each subject (STrTrTr). The study also included a final test 5 minutes after the learning phase and again 2 days later.

Subjects in the third (STrTrTr) condition studied the word list once, recalled it, studied only those words they failed to recall, and then recalled the entire list again, and so on for three more study-test episodes, with the study lists becoming shorter as they performed better on the tests. Though each study session was personalized, during each test trial the subjects in this group recalled the entire list of items, not just the items they had restudied. The results of Thompson et al. (1978) are printed below in Figure 7. For both the five-minute and the two-day retention intervals, the group with selective restudying performed best. At a five-minute interval, the SSSS group scored 50%, but fared far worse after a 48-hour delay, scoring only 22%. The STTT group scored lower (28%) on the initial test than both other groups. However, the STTT group also showed very little forgetting, and 48 hours later the group scored 22% on the test. With a retention interval of 5 minutes, the STTT group had the poorest performance, but with a retention interval of 2 days, the SSSS group had the poorest performance. The percentage forgetting was calculated as follows: [(recall at 5 min – recall at 48 hours)/recall at 5 min] x 100.
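As a rough illustration of this formula, the short Python sketch below plugs in the recall percentages quoted above for the SSSS and STTT groups. The function name `percent_forgetting` is a hypothetical helper introduced here for illustration, not part of Thompson et al.'s procedure, and the selective-restudy group is left out because its scores are not quoted in the text.

```python
def percent_forgetting(recall_5_min, recall_48_hr):
    """Percentage forgetting: [(recall at 5 min - recall at 48 hours) / recall at 5 min] x 100."""
    return (recall_5_min - recall_48_hr) / recall_5_min * 100


# Recall percentages as quoted in the text for two of the three groups.
ssss = percent_forgetting(50, 22)  # repeated study
sttt = percent_forgetting(28, 22)  # repeated testing

print(f"SSSS forgetting: {ssss:.1f}%")  # prints "SSSS forgetting: 56.0%"
print(f"STTT forgetting: {sttt:.1f}%")  # prints "STTT forgetting: 21.4%"
```

On these figures the repeated-study group loses more than half of what it could initially recall, while the repeated-test group loses roughly a fifth.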

Applying this formula to each group (see Figure 7) shows that the repeated study condition resulted in much greater forgetting as time passed. In line with the other studies in this chapter, these results show that massed study helps immediate recall, but performance declines as the retention interval increases. In addition, this study shows that selective restudying combined with repeated testing is the most effective combination for ensuring long-term retention. Once information has been learned and successfully recalled on a test, it is best for students to spend their time studying the material that they failed to recall. These findings further support incorporating a number of low-stakes tests throughout the semester in our composition classes. Additionally, this study shows that the most efficient means of learning seems to be to personalize study based on the answers missed on the previous test, but to continue testing on all items every time. As composition instructors, we can best help our students learn by administering frequent tests and directing each student to personalize his or her study by restudying what he or she missed on the last test.

Figure 7: “Proportion Correct in Immediate and Delayed Recall”

Source: “The Power of Testing Memory” 187

Thompson et al.’s finding that massed study improves initial recall but loses effectiveness as the retention interval increases was replicated by Wheeler, Ewers, and Buonanno (2003). In their study, subjects studied a 40-word list in a repeated study condition (SSSSS) or with one study session followed by four consecutive recall tests (STTTT). Consistent with previous research, in a final free-recall test given 5 minutes later or 1 week later, the researchers found an advantage for massed study on an immediate test, but the massed-study group performed poorly with a retention interval of one week (see Figure 8). By comparison, the subjects in the study-only condition were re-exposed to the material five times more than the repeated-test subjects, who were re-exposed only to those words that they were able to recall after one study session (about 11 out of 50 words in the experiment). Figure 8 shows the proportion of words recalled on immediate (5-min) and delayed (7-day) tests after repeated studying or repeated testing. On the delayed test, the repeated test condition produced better retention than the repeated study condition. What is most noteworthy is the difference in forgetting between the two groups. Though the repeated study group performed better initially, they had a much greater rate of forgetting and, after a 7-day delay, had forgotten most of what they had learned. These results confirm the power of testing for long-term retention.

Figure 8: “Word Recall on Immediate and Delayed Tests”

Source: “The Power of Testing Memory” 188

Though there has been a great deal of research on the testing effect in laboratory settings, in these studies the testing intervals and the amount of time and conditions of testing are carefully controlled or manipulated. In a class, by contrast, there is great variability between students’ retention intervals and across students’ study time and effort. In the laboratory, long retention intervals are typically 1 or 2 days (e.g. Carrier & Pashler 1992; Hogan & Kintsch 1971; Masson & McDaniel 1981; McDaniel & Masson 1985); in much research, the intervals are on the order of minutes or hours (e.g. Bartlett 1977). However, in a class, the delays between quizzes and exams are typically weeks or months. According to the article “Testing the Testing Effect in the Classroom,” as of 2007, very few experiments had studied the testing effect at 1-week or longer intervals. McDaniel cites Roediger & Karpicke 2006 and Wheeler, Ewers, & Buonanno 2003 for a 1-week delay, and Butler & Roediger 2007 for a 1-month delay; additionally, Spitzer (1939) tested students at an interval of 63 days.

In their 2007 study, McDaniel and colleagues tried to create experimental conditions which would test the applicability of the testing effect in a practical setting outside of a laboratory. This study was conducted during six weeks of a web-based Brain and Behavior course at the University of New Mexico, with 35 participants. Each week all students in the class were assigned approximately 40 pages of textbook reading. All participants completed weekly quizzes, two unit exams, and a final exam that were constructed for the experiment. Weekly quizzes included 10 items that were generated from the content of the reading. Each week, participants received their “quiz” in a different format: multiple choice (MC), short answer (SA), or read only (RO).

In the week when participants received the RO condition, they simply read facts and clicked a button marked “I have read the above statement.” Participants were allowed 10 minutes to complete each quiz; immediately after finishing they were provided access to feedback. Because the quizzes were online, whether or not the students used this feedback was up to them. After 3 weeks of quizzes (one of each format), participants were instructed to take the first unit test, with all participants asked the same questions. The same method was repeated for the following three weeks.

Several weeks after completing the second unit test, participants were instructed to take the final cumulative exam, which combined material from units one and two.

Students’ performance on the unit exams is compared and summarized below in Figure 9, which shows performance of quizzed versus not-quizzed items collapsed across units. These results show that testing, but not additional reading, improved performance on the unit exams for the material which was targeted during previous tests.

Figure 9: “Student Performance Averaged across Unit Exams”

Source: (“Testing the Testing Effect” 508)

These findings demonstrate that tests enhance learning and retention even in the face of the variable conditions found in a college course setting. This experiment and those experiments conducted in social studies classes (Roediger, et al. 2010) are, according to the authors, the first to show the effectiveness of low-stakes quizzing in promoting retention of course content on summative assessments used in actual classrooms. The present research shows that the benefits of the testing effect can clearly transfer to the classroom.

A common concern about testing is whether students are learning the conceptual relations among facts or merely parroting a particular answer to a particular question. To address this concern and assess whether students had a deeper understanding of the material, questions from the course readings were varied between the weekly quizzes and unit tests so that a different portion of the fact was required for the answer. In the present study, short-answer and multiple-choice quizzes improved performance more than recognition quizzes did on a subsequent test in which the retrieval cues had been altered (i.e. a different question stem was provided than during previous tests). The increase in performance despite the different wordings is in line with the theory of desirable difficulty, which posits that effortful retrieval is more beneficial for long-term retention; recognition is a less demanding retrieval task than recall.

Another issue addressed in the basic memory literature is the relative benefit of cued recall tests (e.g. short answer, essay) over recognition (e.g. multiple choice) tests.

McDaniel (2007) writes

Studies with simple laboratory materials (word or paired associate lists) have found that retrieval through recall benefits subsequent test performance more so than retrieval processes associated with recognition (Cooper & Monk, 1976; Darley & Murdock, 1971; Mandler & Rabinowitz, 1981; McDaniel & Masson, 1985; Wenger, Thompson, & Bartling, 1980). (201)

In fact, in McDaniel (2007), multiple-choice quizzes produced results that were only slightly better than repeated reading without quizzes. An initial test consisting of multiple-choice questions often fails to produce a testing effect, presumably because such questions require little or no retrieval (e.g., Kang et al., 2007). In the present study, the greatest testing effect is demonstrated with short-answer questions rather than multiple-choice questions. In a previous study using word lists, McDaniel and Masson (1985) found that cued recall produced significantly better performance on a subsequent cued recall test than did recognition, but importantly, half of the time the cues that prompted recall on the final test were different from those that were provided for earlier study and testing. This pattern prompted McDaniel and Masson to suggest that retrieval through recall produces enriched, variable encoding of the target information, more so than retrieval through recognition. McDaniel cites further studies showing positive transfer between testing and studying when the wording in the question stem differs between studying and testing (Glover 1989; Lockhart 1975; Wenger, Thompson & Bartling 1980) and argues that testing on multiple aspects of a question should produce a deeper relational understanding of the question. The findings in this study fit with a larger body of research, including the studies reviewed in this chapter, showing that recall tests are more beneficial than recognition tests for subsequent memory performance.

This section has reviewed the results of studies demonstrating a testing effect in multiple learning conditions. While the studies reviewed here are small in number, they are uniform in suggesting that a balance of studying and testing appears to be the best method to ensure long-term retention. The purpose of testing should be to gradually shape production of the desired response so that it can be retrieved out of context, after a long delay. The landmark studies of Gates and Spitzer established the testing effect as a paradigm of learning, which has remained consistent in the later studies reviewed in this chapter. The implications of this research for composition are that student learning and retention of course material can be facilitated through frequently-administered low-stakes tests.

The Spacing Effect

The spacing effect is the principle that spacing study sessions is better for retention than massed study. The spacing effect has two components: 1) spaced study is better than massed study, and 2) the most efficient method to ensure long-term retention is through increasingly spaced repetitions of the original material. German psychologist Hermann Ebbinghaus, who used himself as the sole subject, conducted a landmark study on human memory in the late 1800s which laid the empirical underpinnings for the spacing effect. This study is widely recognized and extensively cited. He memorized thirteen sets of nonsense syllables, and then tested himself at various retention intervals and measured how long it took to forget and then relearn them. These nonsensical three-letter consonant-vowel-consonant syllables (such as YOP, SEP, XAP, etc.) were chosen to avoid contaminating the experiment through prior learned associations. In experiments of astonishing rigor and tedium, Ebbinghaus practiced and recited from memory 2.5 nonsense syllables a second, then rested for a bit and started again. Ebbinghaus trained this way for more than a year. He then repeated the entire set of experiments three years later to further confirm his findings. Finally, in 1885, he published a monograph called Memory: A Contribution to Experimental Psychology.

Ebbinghaus’s findings and his book established the theoretical foundation for the study of memory that psychology has relied on since. His results have been replicated in numerous studies and serve as a foundational precept for our understanding of human memory and learning. Ebbinghaus identified some important empirical relationships in memory, such as the retention and learning curves. He studied the amount of time it took to learn the list initially and then how long it took to relearn the list, with “learning” defined as the ability to perfectly recall the list twice. In one study he found that it took 1156 seconds to initially learn the set, but later it took only 467 seconds to relearn the list. He found that initial forgetting was rapid but that the rate of forgetting slowed over time. This was the first expression of what has been found in virtually all studies of human learning since: the negatively accelerated forgetting curve.
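The relearning figures above can be expressed as a single percentage, the “savings” that Ebbinghaus's relearning method is generally credited with measuring: the share of the original learning time that did not have to be spent again. The Python sketch below is a minimal illustration using only the two times quoted above; the function name `savings_percent` is a hypothetical label introduced here, not a term from the sources cited in this chapter.

```python
def savings_percent(initial_seconds, relearn_seconds):
    """Share of the original learning time saved when relearning, as a percentage."""
    return (initial_seconds - relearn_seconds) / initial_seconds * 100


# Times quoted in the text: 1156 seconds to learn the set, 467 seconds to relearn it.
savings = savings_percent(1156, 467)
print(f"Savings: {savings:.1f}%")  # prints "Savings: 59.6%"
```

In other words, relearning took roughly 40% of the original effort, which suggests that much of the material was still retained even though it could no longer be recalled perfectly.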

Figure 10: “Hypothetical Forgetting Curve 1”

Source: (“Learning by ”)

Robert Bjork, working with Thomas Landauer of Bell Labs, published (1978) the results of two experiments involving nearly 700 undergraduate students learning names. Each student was given a rearranged deck of cards bearing first and last names of fictitious people (for initial presentation trials) or first names only (for test trials). Subjects turned through the cards at a rate of one every 9 seconds in time to a signal, studying and writing last-name answers as appropriate. Next came a 30-minute retention interval filled with a distracting lecture, followed by a final retention test. Landauer and Bjork were looking for the optimal moment to rehearse something so that it would later be remembered. To determine this, they studied the effectiveness of an expanding retrieval schedule (i.e. an increase in the retention interval after every act of retrieval) compared to an equally-spaced retrieval schedule. Landauer and Bjork found that the expanding-interval schedule produced recall similar to that of equal-interval testing on a final test at the end of the session, and both produced better recall than did initial massed testing. Their results led them to theorize that the best time to study something is at the moment you are about to forget it: retrieval right on the threshold of forgetting produces the greatest gains in retention. In their words: “Successful tests are more effective than repetitions. This could either be because tests induce greater encoding effort, or because they are more similar to the performance required at eventual recall” (631). This is in line with the theory of desirable difficulty described earlier. They found that the expanding retrieval schedule produced a 10% increase in retention over the equal-interval schedule. In practical terms, this finding suggests that tests should be administered shortly after learning to ensure initial encoding, and then repeatedly administered at increasingly spaced intervals.

An implication of the spacing effect is that spaced study will be more effective than massed study. Massed study is cramming a lot of material into a short amount of time.

As we know, we can pass tests by cramming if the test is taken shortly after cramming,

but that kind of knowledge has very little storage strength. Dempster (1987) conducted a study in which subjects were showed paired-associate English vocabulary words and their definitions three times. One control group did immediate massed study (cramming), while the other group did spaced practice with other items in between. The second condition resulted in much greater retention than the massed schedule. Carpenter and

DeLosh (2005) also found in their article “Application of the Testing and Spacing Effects to Name Learning,” where subjects learned paired-associates of names and faces, that

84

spaced study resulted in better retention than massed study. Both experiments in that study showed that final retention was better for the spaced conditions than the massed conditions, and this held true for different spacing intervals and for both studied and tested items. All of these studies are in line with a much larger body of literature which reports that retention is better for spaced study than massed study (e.g. Hintzman 1974;

Melton 1970; Dempster 1987) and better for spaced than massed testing (e.g. Cull et al.

1996; Cull 2000; Glover, 1989; Izawa, 1992; Landauer & Eldridge, 1967; Modigliani &

Hedges, 1987). Ruch (1928) published a review of dozens of studies of the spacing effect; for a more recent review, see Cepeda, Pashler, Vul, Wixted, & Rohrer (2006). H.L. Reodiger summarized these findings in the field when he was asked in a

2012 interview “How many times should one get people to retrieve things, and how soon after learning?”: F. Mary Pyc and Katherine Rawson at Kent State University showed that for simple things like foreign language vocabulary, retrieving about five to seven times is about right —if you test people a week later you wouldn’t see much difference between having tested people seven times or ten times, but you do see gains going up to the range of five to seven times. After that it just levels off. But most people would only practice once or twice, so the idea of going up to five or seven retrievals seems like too much to many people. Of course, to keep knowledge at your mental fingertips, you would need continued spaced retrieval practice, too. (Kleeman “Professor Roddy Roediger…”) Five to seven repetitions over the course of a sixteen week semester should be enough. If teachers administered tests once a week, this would provide ample opportunities for five

85

to seven repetitions of each test item, which in turn would induce long-term potentiation. Spaced repetition relies on the principle that information does not have to be repeated every day in order to ensure long-term retention. While repeatedly studying the same information every day would indeed foster long-term retention, it would also be boring and inefficient. Spacing study is a more efficient way of studying. Figures 11 and

12 conceptualize the learning process through spaced repetition. The figures do not depict direct findings from a study, but they do help illustrate the learning curves described in this chapter. This modified graph of a forgetting curve shows that, from the time when information is first introduced, if no reminder is given, then the likelihood of remembering it drops dramatically within days. The likelihood of correctly remembering an item of information is expressed on the Y axis on a scale of 0 to 100%. Time elapsed since the original learning event is represented on the X axis. In the image, a horizontal bar extends from the 90% chance of correct recall (near-perfect memory). A negatively sloping curve represents the average forgetting curve and shows that forgetting increases rapidly over time. However, if a reminder is given before the curve drops below the horizontal bar, which represents a 90% likelihood of remembering, then long-term retention can be maintained.

Interestingly, the spacing effect shows us that the period of time needed between future reminders to maintain a memory stability of 90% will increase after each subsequent repetition.
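The widening gaps between reminders can be sketched numerically. The following is only an illustration, not a model taken from the studies reviewed here: it assumes an exponential forgetting curve, R(t) = exp(-t / S), and it assumes that memory stability S doubles after each successful reminder. The function names, the starting stability of ten days, and the doubling factor are all hypothetical choices made for the sake of the example.

```python
import math


def days_until_threshold(stability_days, threshold=0.90):
    """Days until recall probability falls to `threshold`.

    Assumes (for illustration only) an exponential forgetting curve
    R(t) = exp(-t / S), where S is the memory stability in days.
    """
    return -stability_days * math.log(threshold)


def review_intervals(initial_stability=10.0, growth=2.0, reviews=6, threshold=0.90):
    """Return the gap in days before each successive reminder.

    Assumes (hypothetically) that stability is multiplied by `growth`
    after every reminder, so each gap is longer than the one before.
    """
    intervals = []
    stability = initial_stability
    for _ in range(reviews):
        intervals.append(days_until_threshold(stability, threshold))
        stability *= growth
    return intervals


if __name__ == "__main__":
    for number, gap in enumerate(review_intervals(), start=1):
        print(f"reminder {number}: after {gap:.1f} more days")
```

Under these assumptions each gap is twice as long as the previous one, which mirrors the shape of the curves in Figures 11 and 12 without claiming to reproduce any measured data.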

Figure 11: “Hypothetical Forgetting Curve 2”

Source: (“Want to Remember Everything You’ll Ever Learn?”)

Figure 12: “Hypothetical Forgetting Curve 3”

Source: (“Spaced Repetition and the CFA Exam”)

The studies in this section illustrate that spaced testing is more effective for long-term retention than massed practice, and that it is more efficient than equally-spaced testing. The studies in this chapter reveal the benefits of frequently administered low-stakes tests; these tests would provide structure and support for repetition. If our goal is to maximize learning, then we should design curriculum based on the numerous studies that show repetition is the best way to optimize learning. If we agree that long-term retention of classroom content is a desirable thing, then it becomes clear that we should provide repetitions of class content at spaced intervals to best ensure long-term retention. Because

spacing is a more efficient means of studying than massed study, we can see a clear

benefit to designing tests that use the spacing effect. The studies reviewed in this section

show that teachers should repeatedly test on the same items, and gradually reduce the

frequency of those items on subsequent tests as they introduce new items. The studies

support the use of frequent tests in composition to ensure students' long-term retention of key course content.
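One way to picture this recommendation is as a quiz calendar. The sketch below assumes a sixteen-week semester with one quiz per week and a hypothetical pattern of widening gaps (1, 1, 2, 3, and 4 weeks) between reappearances of an item; the gap pattern, the function names, and the sample items are illustrative assumptions, not values drawn from the studies above.

```python
def quiz_weeks(intro_week, gaps=(1, 1, 2, 3, 4), last_week=16):
    """Weeks in which an item first tested in `intro_week` should reappear.

    The widening `gaps` are a hypothetical pattern chosen for illustration.
    """
    weeks = [intro_week]
    for gap in gaps:
        next_week = weeks[-1] + gap
        if next_week > last_week:
            break  # the semester is over before the next repetition
        weeks.append(next_week)
    return weeks


def build_schedule(items_by_week, gaps=(1, 1, 2, 3, 4), last_week=16):
    """Map each week of the semester to the list of items on that week's quiz."""
    schedule = {week: [] for week in range(1, last_week + 1)}
    for intro_week, items in items_by_week.items():
        for item in items:
            for week in quiz_weeks(intro_week, gaps, last_week):
                schedule[week].append(item)
    return schedule


if __name__ == "__main__":
    # Hypothetical course content introduced in weeks 1 and 2.
    plan = build_schedule({1: ["thesis statement"], 2: ["ethos", "pathos"]})
    for week, items in plan.items():
        print(week, items)
```

With these gaps, an item introduced in week 1 appears six times (weeks 1, 2, 3, 5, 8, and 12), which falls within the five to seven retrievals discussed above. Items introduced late in the semester receive fewer repetitions, so the most important content is best introduced early.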

CHAPTER 5: TEST-ENHANCED LEARNING IN COMPOSITION

Indirect Benefits of Testing

Chapter 5 offers a number of specific examples of how our current composition

practice can be enhanced through testing. The composition textbooks which we already

use contain testable content—in fact many rhetoric textbooks contain ready-made tests under the guise of “exercises.” These kinds of “exercises,” as we call them, are structured similarly to tests and can be administered in class to enhance learning. Devising tests and administering them effectively is a complicated art, but also a skill that can be developed.

This chapter presents some general guidelines for how to form tests and how to administer them. The testing effect on its own provides a strong argument for enhancing our current practice by administering frequent, low-stakes tests. However, in addition to the testing effect, there are indirect benefits of testing, which this section will review. The dialogic class that Finkel portrays (described in Chapter 2) assumes that all of

the students have come to class totally prepared by having done the reading, having

thought about the reading, and are interested and ready to engage in an animated

discussion about it. According to authors Jacobs and Chase, writing in their book

Developing and Using Tests Effectively, “If we do not ask questions on the content of

outside readings, then most students will not read the materials” (17). Without proper preparation on the part of the teacher, Finkel’s class is a beautiful fantasy. To turn this fantasy into a reality, it is the teacher's job to create the conditions that lead to optimal learning. In a collaborative model of learning, learners of all skill levels improve as a result of collaborating within the Zone of Proximal Development (ZPD), but only if those learners are cognitively capable of being in that ZPD; effective collaboration requires capable peers. When students neglect to do the reading or other coursework, it is impossible to maintain an effective ZPD for everyone in the class. Teachers can help ensure that students are capable peers by requiring them to express course content in their own words through frequent tests.

Frequent testing requires students to space their study efforts throughout the semester rather than concentrating them on cramming right before an exam, a method which research and common experience show is utterly ineffective in facilitating long-term retention. Roediger and Karpicke (2006) write, “To state an obvious point, if students know they will be tested regularly (say, once a week, or even every class period), they will study more and will space their studying throughout the semester rather than concentrating it just before exams (see Bangert-Drowns 1991; Leeming, 2002)”

(249). Frequent tests will prevent students from cramming for high-stakes exams. We all know that with dedicated cramming the night before a test many individuals are able to pass their exams. By cramming, students are able to effectively hold data in their short-term memory; however, they cannot perform as well on that same exam a week later or a

month later. Because of the possibility of cramming, tests should not be used as an

exclusive means of assessment; the results will become more reliable if tests are frequent

rather than infrequent. Frequent testing in classrooms encourages students to study

continuously throughout a course, rather than bunching massive study efforts before a

few isolated tests. This process simulates an STST model, the learning model which was

most effective in the studies reviewed in Chapter 4.

Frequent quizzing might also reduce test anxiety, a trouble that plagues many students. Test anxiety generally comes from how heavily exam grades can be weighted, but frequent testing is a low-stakes means of retrieval. Instructors could even start with an ungraded quiz that serves both as a model of future quizzes and as an assessment of current

student knowledge. Roediger and Karpicke (2006), Agarwal et al. (2008), and McDaniel

(2007) all report increased student confidence from taking frequent low-stakes tests. In

their 2006 study, Roediger and Karpicke found that students self-reported increased

levels of confidence as a result of frequent low-stakes quizzes in class. Similarly,

McDaniel and colleagues (2011) write that 64% of the subjects (out of 139 eighth-grade

science students) reported that tests reduced their anxiety about taking the unit exam and 89% reported that the tests increased learning (404). The researchers in that study also reported that they observed disappointment on the part of the students on days when the quizzes were not included in class.

Another benefit of testing is that it helps include shy students who avoid joining

discussion. Many teachers are uncomfortable about calling out students in class, and

because there are students who do not participate in class discussions as much as their

peers, those students have fewer chances to reproduce class content in their own words.

Providing tests can vary the format of presentation to accommodate those students who

do not participate in class discussion or group work as often. Rather than excluding those

who are too shy to pipe in, tests provide alternative retrieval opportunities. In addition, frequently administered tests provide a more holistic approach to

assessment, with assessment occurring over the course of the semester rather than

infrequently through high-stakes tests. Frequent tests give feedback to an instructor which

can help assess student capability and identify misunderstandings. This frequent

assessment provides teachers with the information they need to update the course

curriculum and maximize student learning. If many students are unclear on a particular topic, then test results would make that misunderstanding apparent and would allow instructors to modify their teaching accordingly. In the same way, tests can be used at the beginning of a semester to see what skills students bring to a class. A writing test administered at the beginning and the end of the course can be instructive for composition instructors by demonstrating students’ growth over the semester.

Transparency is another indirect benefit of testing. Tests show students exactly

what content the teacher considers important, which makes learning goals more explicit.

Making our expectations for the course clear for students is important because transparency helps direct student study efforts and generates expectancy. If students know what questions they will be expected to answer, then they will read with that goal in mind. Corroborating this, Roediger and Karpicke (2006) report that when students know they will be tested, they will come to class more prepared. Typically students are assigned large amounts of reading, and for many students this is problematic because they do not know what material is important to remember. Obviously they won’t remember the entire book. Generally students will highlight as they read, but without guidelines for what material is important, students often find themselves staring at pages covered in yellow, which undermines the entire purpose of highlighting. With specific reading questions, the teacher is essentially highlighting the reading for the student, and demonstrating the importance of that material by re-exposing it through tests. Reading questions help scaffold students’ learning by emphasizing what they should focus on.

Goal-oriented reading with a study guide and reading questions will help ensure a higher degree of retention of key concepts.

Many instructors utilize web-based instruction such as Moodle or Blackboard, or present content through their own websites. These mediums are well suited to integrate test-based learning because the technology allows for easier managing and conducting of tests. As a graduate student, I taught a section of English 104: “Accelerated Composition and Rhetoric” at Humboldt State University. Tests could be posted on a regular

schedule—for example, every Friday, as I did in my class. Moodle already has testing

software built in that can automatically grade many types of test items and provide

feedback immediately after testing. It also logs user information that can be made transparent to the students, such as class participation and grades. In addition, utilizing these web-based mediums for test administration does not require any valuable class time. Test-enhanced learning does not require any substantial change in our current education system, and it works very efficiently with our current web technology.

The capacity for web-based mediums to give instant feedback is truly a boon, because feedback from frequent tests will also help students guide their study efforts.

Feedback of course means knowing if you answered correctly or incorrectly. Feedback can be instantly provided for questions with a simple answer, but for short-answer or essay questions, the instructor will have to give feedback manually. If students test themselves periodically while they are studying, they may use the outcome of these tests as a guide for future study. McDaniel et al., in “Testing the Testing Effect in the Classroom,” corroborate the importance of feedback:

The results are compelling for feedback effects after missing a short-answer quiz item. Clearly, learning and retention were better when students were given feedback after missing a short-answer question than reading the fact (twice) without being quizzed. (505)

Frequent test results allow students to self-assess where they are in comparison with the expectations of the teacher and the course, and this feedback gives students the

information they need to adjust their studying accordingly. On the other hand, a lack of

feedback can result in continuation of errors; that is, students may answer a question incorrectly on a test but think they answered it correctly. The research in previous chapters has shown that retrieval helps build long-term retention, so if students respond and make an error but do not receive feedback, they may stamp that error into memory. In other words, because retrieval enhances learning, it is likely they will continue remembering that error. Agarwal et al., in “Examining the Testing Effect with Open- and Closed-book Tests,” write:

Prior research on the testing effect has shown that if students make errors of commission on an initial test and do not receive corrective feedback, they may retain those errors on later tests and run the risk of incorporating false information into their general knowledge (see Butler, Marsh, Goode, & Roediger, 2006; Roediger & Marsh, 2005). (862)

Giving feedback is as worthy of care, intelligence, and imagination as making up the test in the first place.

With high-stakes exams, the situation can become tragic and emotional as students plead for exceptions and makeups. However, low-stakes tests spread out the grading weight over time such that students can miss the occasional test without serious concern about passing the course. A further benefit of frequent low-stakes tests is that they allow professors to have a very simple makeup policy for tests: never.

As we have seen, there are many reasons to test. In addition to these indirect benefits, the research reviewed in this project has shown that active retrieval is the best

method to ensure long-term retention. Assumedly we attend school in order to learn—if

we are interested in learning material, isn’t it worth remembering it? If this is the case,

then, as teachers, we should try to facilitate long-term retention of course material. I

argue throughout this project that frequent, low-stakes tests are the best way to achieve

this.

Testing and Grades

In order for students to study for tests, they must believe that these tests are

important. Teachers can ensure that students take tests seriously by making tests a

required and graded component of the course. Grades help structure student effort, and

grading tests impresses on students that understanding course concepts is important. I know from experience (both as a student and as a teacher) that if an assignment doesn't have some graded value, it is easy to let it fall by the wayside and neglect to do it. Frequently

this is done with the intention of going back to it later only to find that "later" is so full of

critical assignments that there is no time to do the ones that are not required. Reading

assignments can end up being quickly skimmed rather than thoroughly considered.

Writing assignments that are not worth a graded value may be hastily written by students the night before, or directly prior to class. If we consider that students have busy lives outside of the classroom, then students neglecting assignments without a grade value

makes more sense. Students are indeed busy outside of the classroom; they are whole

people with complex lives—many students have to juggle a job, their personal or family

lives, and school all at once. Grades can be seen as a system that allows students to assign

value to assignments and prioritize. Assigning a grade value to tests sends a message to

students: “here are the really essential things to learn and remember from this course.” Despite the ubiquity of grades in the university, typically there is no grade value assigned to thoroughly reading and understanding assigned texts. Rather, readings are assigned, and they may be briefly discussed in class, but time constraints prevent thoroughly discussing all assigned material, which means students may often not be held accountable for completing assigned readings. Performance in many other aspects of the class (class discussions, essay writing, homework assignments, and so on) is contingent upon an understanding of the assigned texts. So we as teachers must ask, “How can we ensure that students have adequately read and internalized class material that will be the focus of discussions?” Professors John-Steiner and Meehan, writing in Vygotskian

Perspectives, have considered this issue as well. They argue that “Shallow internalizations leads to a facile combination of ideas. In contrast, working with, through, and beyond what one has internalized and appropriated is part of the dialectic of creative synthesis” (35). To ensure adequate internalization, we need to develop ways to ensure that students are keeping up with their reading assignments. Otherwise a student who has not engaged with the material may end up just hiding in the crowd during class

discussion, and group work will suffer as a result. One method of achieving consistent

completion of assigned reading is through frequently-administered quizzes. As teachers

we can encourage students to study the assigned readings by holding them accountable

during tests.

The Benefits of Tests According to Three HSU Professors

For the purposes of this project, I conducted interviews with many professors at

Humboldt State University. I received feedback from many professors, all of which has

been influential in this project. For purposes of brevity, I have narrowed down these

interviews to three professors. These were personal interviews and by no means should

these reports be considered authoritative or empirical. This is anecdotal information.

Professor Corey Lewis, an instructor of composition with more than thirteen years

of experience, thinks we’ve moved away from testing as part of a pendulum swing away

from top-down pedagogy to Expressivism and student-centered teaching. Current composition practices recognize that students need to write and work with writing to improve, but he argues that with the abandonment of tests we’ve lost what is clearly an effective instructional technique, stating, “I don’t know anyone who is on a regular systematic basis testing students on skills and content that directly relate to writing in composition” (Lewis Interview). He describes having an epiphany while lecturing in

class; while discussing some key terms for the class and writing them up on the board, he

looked around the class and noticed that out of 22 students maybe 3 of them were taking

notes. Without taking notes and reviewing them, there is just no way students could

remember a long list of terms like those he had written on the board. Professor Lewis

tells me he assumed that students were trying to learn the terms, but it became clear from

observing students that they were not putting forth the effort necessary to memorize. Today, it seems clear that many students are not putting forth the diligence necessary to learn classroom material. In part this lack of effort may be a result of many students needing to work while going to school. Today students are faced with tuition that costs a fortune. There are budget cuts, furloughs, and fewer resources available, which means that most students work in addition to going to school. Many students work full-time in addition to being full-time students. Though this may not be the case for every student, it is useful for instructors to adopt the mindset that students are not neglecting their studies because they do not care, but for various and complex personal reasons. This gives educators the choice of either complaining about the current situation, or creating the structure that students need to learn. Teachers can help students by providing a structure for them to succeed. Professor Lewis told me that it was our discussions about testing that brought the need for testing to mind, and he believes that other professors are probably in the same situation. They realize that students aren’t internalizing classroom content, but they are

not sure how best to facilitate that learning. Because most current composition and

English methods of instruction train us to use class discussions, workshops, and writing conferences rather than tests, many composition instructors have not been trained to effectively use tests in the same way they have been taught to use other pedagogical methods. Professor Janet Winston, another professor whom I interviewed for this project, told me that the biggest challenge she faces as a literature teacher at HSU is getting students to actually do the assigned reading. Though the focus of this project is teaching composition, not literature, both fields require that students put forth the effort to thoroughly read and understand assigned readings. Professor Winston’s success in a literature classroom can be used as an example for the composition classroom. According to Professor Winston, many of her students come from a lower-middle-class background; frequently they are the first in their family to go to college, and generally they have to hold jobs to support themselves on top of their academic duties. When students are stretched so thin, it is understandable that they would ignore assignments that they see as unimportant, namely assignments without a grade value. Professor Winston reasons that we’ve become so enamored with grades that if an assignment doesn’t have a grade value then students regard it as unimportant. From this perspective, she argues that the lack of a grade value for the assigned reading was the problem. The lack of a grade value assigned to reading comprehension can seem to suggest that reading assignments are not

important. By utilizing regular quizzes which include questions related to reading assignments, instructors can assign a specific grade value to a thorough understanding of the reading material, which in turn tells students that reading is indeed important.

To fix this problem, Professor Winston began giving reading quizzes once a week that covered topics in the assigned reading. She was reluctant to use reading quizzes originally because they seemed too narrow-minded. She didn’t want students to think about literature from a convergent thinking approach and assume that there was only one correct interpretation. Indeed, Professor Winston found that she had trouble at first formulating open-ended questions that didn’t lock students into a right or wrong response. However, over time she learned how to formulate better questions. Today her questions are generally open-ended and require a paragraph or so of response. Rather than focusing on plot details, they instead require a demonstration of understanding of the broader concepts in the reading and class content. Professor Winston tells me that based on the quiz results she can say with confidence that at least 90% of the class is keeping up with the reading—a participation rate that many instructors would find enviable. Winston reports that this participation rate has resulted in enhanced class discussion. I also interviewed Professor Robert Cliver, in the history department. Professor

Cliver tells me that history is all about reading and writing—it’s about narratives, not just memorizing facts. He reports that indeed the facts are important, but understanding their interrelations and discussing them is what makes a good historian. Rote memorization is

what machines do; creative synthesis is the mark of a good scholar. These are the same sort of qualities that we want to impress on students in a composition course as well.

Professor Cliver told me that when he first started, he didn’t give many quizzes. He quickly learned from student feedback however that students wanted more tests. He realized from student feedback and the results of giving quizzes, that tests can help incentivize doing the work and encourage students to come to class. According to

Professor Cliver, a single final exam doesn’t help—“it just tests if students are good at

taking tests” (Cliver Interview). He came to learn that the traditional comprehensive end-

of-term exam is useless because having few high-stakes exams encourages cramming and

putting off work. For these reasons, Professor Cliver now gives frequent tests at the end

of class, which serve as a review for material covered during lectures and in the readings.

Though tests could be viewed as a “police measure” to simply ensure that students have

done the reading, Professor Cliver argues that tests are not just busy work, because the

activity teaches students what course content should be emphasized. In addition to the quizzes, Professor Cliver also utilizes take home exams. He

reasons that they help students practice writing. The combination of these two activities is

what he finds most successful. Quizzes help students internalize fact-based course

content, and the take-home exams require students to use that knowledge for a

creative synthesis in the form of writing. He shared an example question that might

appear on a take-home exam: “Describe how the end of WWII affected east Asia.” In

order to answer this question, students must have a firm grounding of the historical facts,

such as the Communist party coming to power, the alliance of Japan and the US, and the

division of Korea. The facts are necessary, but simply listing these facts is not enough for

good writing. The combination of frequent in-class tests and take-home essay exams

gives Cliver's students multiple opportunities for retrieval, as well as for the creative synthesis of facts.

Building, Administering, and Grading Tests

We don’t want students to just memorize facts; we want them to demonstrate they understand the principles behind the material and apply that learning to new situations.

To achieve this goal with tests, we must have a systematic structure—something to tie the test type and content to course objectives. Bloom (1956) describes six cognitive skills in a hierarchy from simple to complex: (1) knowledge, (2) comprehension, (3) application,

(4) analysis, (5) synthesis, and (6) evaluation. Instructors should understand these six categories, and clearly understand which one of these categories any particular test item falls into. Developing and Using Tests Effectively provides a list of the kind of question

wording each of these types may use, which I have abridged and re-presented here. (1)

Knowledge questions involve the recall of learned material through activities such as

remembering facts, definitions, or principles. Common questions testing knowledge

involve wording such as: “define,” “list,” “state,” “identify,” “label,” “name,” “who,” “when,” “where,” and “what.” (2) Comprehension questions require a more in-depth understanding of learned material. Question wordings typically include: “explain,”

“predict,” “interpret,” “infer,” “summarize,” “convert,” “translate,” “give example,”

“account for,” or “paraphrase.” (3) Application is the ability to use (transfer) learned material in a new situation or context. Application questions ask students to use concepts to solve a problem. Typical wording or questions include: “apply,” “solve,” “show,”

“make use of,” “modify,” “demonstrate,” “compute.” (4) Analysis questions ask students to break down material into component parts so that the organizational structure is understood. Typical question wordings include: “differentiate,” “compare and contrast,”

“distinguish ___ from ____,” “how does ___ relate to ____,” or “why does ____ work?”

(5) Synthesis asks students to put parts together to form a new whole. Question wordings include: “design,” “construct,” “develop,” “formulate,” “imagine,” “create,” “change,” or

“write a poem or short story.” Finally, (6) evaluation questions judge the value of material for a given purpose using definite criteria. Typical question wordings include:

“appraise,” “evaluate,” “justify,” “judge,” “which would be better?” Considering their local situation and their class needs, instructors should develop a table of the types of questions they want to ask (e.g. 30% knowledge, 50% comprehension, 20% application).

This will help teachers plan and create tests, and ensure a balance of question types.

Having a mix of question types capitalizes on a balance of their various strengths.
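A table of question types like the one suggested above can be turned into concrete item counts for a given quiz. The sketch below is a hypothetical helper, not a procedure from Developing and Using Tests Effectively: it takes the percentage assigned to each category and a total number of items, and it gives any leftover items to the categories with the largest remainders (an assumed rounding rule) so that the counts always add up to the total.

```python
def blueprint_counts(total_items, percentages):
    """Convert a table of percentages into whole-number item counts.

    `percentages` maps a category name to an integer percentage; the
    values must sum to 100. Leftover items go to the categories with
    the largest remainders (an assumed rounding rule).
    """
    if sum(percentages.values()) != 100:
        raise ValueError("percentages must sum to 100")

    # Work in integer hundredths of an item to avoid rounding surprises.
    scaled = {name: total_items * pct for name, pct in percentages.items()}
    counts = {name: value // 100 for name, value in scaled.items()}

    leftover = total_items - sum(counts.values())
    by_remainder = sorted(scaled, key=lambda name: scaled[name] % 100, reverse=True)
    for name in by_remainder[:leftover]:
        counts[name] += 1
    return counts


if __name__ == "__main__":
    table = {"knowledge": 30, "comprehension": 50, "application": 20}
    print(blueprint_counts(20, table))
```

For a twenty-item quiz, the example table yields six knowledge items, ten comprehension items, and four application items.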

When generating tests, we want tests to match the students’ expectancy. Since we

are asking students to spend time studying for these tests, it is important to help students

understand what material is important, and what they can expect from a test. We can

improve student expectancy by clearly designating test dates and the approximate amount of time the test will take. Expectancy will further be increased if the instructor ensures that all students take the same test with the same questions at the same time. We should also indicate the approximate worth of test questions and the amount of time that students should spend on particular tests. According to Developing and Using Tests Effectively, as a rule of thumb, allow about one minute per item with multiple choice and half a minute for each true/false item. Short answer questions requiring a sentence or two will take about two minutes to answer, and we should allow ten to fifteen minutes for a short essay and thirty for an essay requiring two to three pages (72).
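These rules of thumb lend themselves to a quick calculation when planning how long a quiz will take. The sketch below simply encodes the per-item times quoted above; the function name and the category labels are hypothetical, and the result is an estimate rather than a guarantee.

```python
# Minutes per item as (low, high) estimates, taken from the rule of thumb above.
MINUTES_PER_ITEM = {
    "multiple_choice": (1.0, 1.0),
    "true_false": (0.5, 0.5),
    "short_answer": (2.0, 2.0),
    "short_essay": (10.0, 15.0),
    "long_essay": (30.0, 30.0),  # an essay of two to three pages
}


def estimate_minutes(item_counts):
    """Return (low, high) estimates of the minutes needed for a test.

    `item_counts` maps an item type from MINUTES_PER_ITEM to how many
    items of that type appear on the test.
    """
    low = sum(MINUTES_PER_ITEM[kind][0] * count for kind, count in item_counts.items())
    high = sum(MINUTES_PER_ITEM[kind][1] * count for kind, count in item_counts.items())
    return low, high


if __name__ == "__main__":
    quiz = {"multiple_choice": 10, "true_false": 4, "short_answer": 3, "short_essay": 1}
    low, high = estimate_minutes(quiz)
    print(f"Allow roughly {low:.0f} to {high:.0f} minutes.")
```

A quiz with ten multiple-choice items, four true/false items, three short-answer questions, and one short essay would need roughly 28 to 33 minutes by this estimate.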

There are also some important things to understand regarding grading tests. The location of the student’s paper in the stack can have an influence on the score assigned to it by the reader. According to Bracht and Hopkins (1968), the first papers read tend to get higher scores than later ones. The reader tends to judge a paper harshly if it is preceded by a well-written paper; if the previous paper is poorly written, the essay is judged generously. Several studies show that the quality of handwriting, grammar, and spelling (James 1927; Sheppard, 1929; Chase, 1968; Marshall and Powers, 1969) can all have an impact on the scores given to an essay. We can improve reliability of scoring by

concealing students’ names until after a score has been assigned. This keeps instructors’ achievement expectations for their students from affecting their judgment of essays. Other techniques include reading only one item across all tests before going to the next item, then reshuffling the stack of papers before going to the next. Reshuffling makes it unlikely that any paper will repeatedly suffer from following a good paper or repeatedly reap the advantage of following a poor one.
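For instructors who grade on a computer, the item-by-item procedure can be sketched as a reading order generated in advance. This is a hypothetical illustration: it assumes each paper has already been given an anonymous identifier in place of the student's name, and the function name and identifiers are invented for the example.

```python
import random


def grading_order(paper_ids, item_numbers, seed=None):
    """Return a list of (item, paper_id) pairs in the order they should be read.

    Every paper's response to one item is read before moving on to the
    next item, and the stack is reshuffled before each new item.
    `paper_ids` are assumed to be anonymous codes, not student names.
    """
    rng = random.Random(seed)
    order = []
    for item in item_numbers:
        stack = list(paper_ids)  # copy so the caller's list is left untouched
        rng.shuffle(stack)       # fresh random order for this item
        order.extend((item, paper_id) for paper_id in stack)
    return order


if __name__ == "__main__":
    for item, paper in grading_order(["A17", "B42", "C08"], [1, 2]):
        print(f"item {item}: paper {paper}")
```

Because the order is drawn at random for each item, no paper is systematically read first or systematically read after the same neighbor.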

According to Developing and Using Tests Effectively, instructors often give inadvertent clues on tests that help students guess the correct answer. Because these clues make questions easier, they also undermine the students’ learning process (according to the theory of desirable difficulty). Specific determiners (e.g., “all” or “always”) depict a situation as absolute, which can lead a student to guess that the statement is probably false. Qualifying terms such as “sometimes,” “usually,” or “typically” are uncertain enough to suggest that the statement is more likely to be true. A question worded like “The answer is a ______” has an embedded grammatical clue. In this case only answers that begin with a consonant would be grammatically correct, so we should modify the original to state “The answer is a(n) ______.” By the same token, we should make all blanks the same length so that they do not provide a clue to the length of the answer. We should also avoid using “all of the above” or “none of the above,” because these are too easy as distractors. Mentzer (1982) examined thirty-five files of multiple-choice test items for evidence of biases in the correct answers.

The most frequently occurring bias in that set was the “all of the above” response, which was correct more than 25% of the time. If you do use such options, make sure they are not always the correct answer. Furthermore, the correct answer should be placed randomly rather than in a favored position. For example, according to the same study, option ‘C’ is overutilized as the correct response and option ‘A’ is underutilized. Place the correct answer in each of the alternative positions approximately an equal number of times, but in a random order. Finally, avoid vague, indefinite terms denoting degree or amount, as in a question like “T/F: A long time ago trees covered a very large part of present-day Wyoming.” “A long time ago” could mean anything from 1850 to 10 million years ago. Indefinite terms will make scoring less reliable and will probably confuse the test taker. Instead, use definite terms that allow for only one correct response.
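The advice to place the correct answer in each position about equally often, but in random order, can be sketched as follows. This is a minimal illustration, not a procedure taken from Mentzer or from Jacobs and Chase; the function name `balanced_answer_key` and the four-option format are assumptions made for the example.

```python
import random
from collections import Counter

def balanced_answer_key(num_questions, options="ABCD", seed=None):
    """Return one correct-answer position per question.

    Hypothetical helper: each position is used about equally often
    (counts differ by at most one), and the order is random.
    """
    rng = random.Random(seed)
    letters = list(options)
    full_cycles, remainder = divmod(num_questions, len(letters))
    # Whole cycles keep the counts equal; leftover questions are
    # assigned to distinct, randomly chosen positions.
    key = letters * full_cycles + rng.sample(letters, remainder)
    rng.shuffle(key)
    return key

key = balanced_answer_key(20, seed=1982)
print("Answer key:", "".join(key))
print("Uses of each position:", Counter(key))
```

An instructor would then write each item so that its correct answer falls in the position the key assigns, rather than choosing positions by habit.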

Recognition and Free-Response Questions

It is commonly accepted that critical thinking is a main aim of writing instruction. Critical thinking is both a method of thought and a complex set of varied skills. Part of critical thinking is the ability to consider multiple positions and reason towards the most likely conclusion. Tests, if used properly, can encourage the development of critical thinking. Some forms of testing, such as essay or short-answer formats, are better at developing critical thinking skills than others.

Short-answer questions are particularly effective for composition instruction and for developing critical thinking. For example, questions such as “What do you think was the most effective form of evidence used in this article?” or “How did the author deploy pathos to advance her argument?” require not only a sophisticated understanding of the material but also critical evaluation of the author’s argument and personal articulation using the disciplinary language of composition.

There are two main types of test questions: recognition and free-response. Essay questions and short-answer questions are free-response. Free-response questions have several advantages. They are best for assessing complex learning outcomes, they are relatively easy to construct, and in most cases they do not permit students to earn points by guessing or bluffing. Their limitations are that they are difficult and much more time-consuming to score than recognition questions, that scoring is more subjective, and that a test consisting primarily of free-response questions limits the sampling of content because of time constraints. Recognition tests, such as matching, cloze deletion, T/F, or multiple choice, also have benefits and limitations. Recognition-type items (multiple choice, T/F, matching) require students to select the correct answer from among several options. Because each item takes less time to answer, a recognition test can include more questions than a free-response test, which broadens the coverage of content. Recognition tests are difficult to construct but easy to score. Because recognition items narrow down the range of possible answers, they are susceptible to guessing. However, this also makes scoring more objective and reliable.

Free-response questions require students to organize and express answers in their own words. Limiting the breadth of an essay question allows the answer to be relatively brief and specifically tied to a single objective. A broad question like “What were the conditions that led to the Civil War?” is less effective than a narrower question like “Compare and contrast the role of agriculture in the economies of the North and South at the outbreak of the Civil War.” A narrower question will produce a narrower response, which will in turn improve the reliability of scoring. Rather than assessing factual content, which is much better suited to other test formats, the essay test should be used for assessing outcomes that require higher-level cognitive functions. Prompts appropriate for essay questions include the following: “Present arguments for and against ______,” “Illustrate how a principle explains facts,” “Illustrate cause and effect,” “Describe an application of a rule or principle,” “Evaluate the adequacy, relevance, or implication of this data,” “Form new inferences from data,” “Organize the parts of a situation, event, or mechanism and show how they interrelate into a whole,” or “Sort out the relevant parts as distinct entities from a total situation, event, or mechanism.”

The “stem” is the heart of a recognition test question. The stem should present the problem with precision and clarity. Wordy problems need to be trimmed of unnecessary information. State the stem positively whenever possible; if a negative is unavoidable, call the students’ attention to it. After writing the stem, write one correct or clearly best answer and three or four plausible distractors. Include as much of the item material as possible in the stem, so that words or phrases that could be stated once in the stem are not repeated in each distractor. Instruct students to choose either the correct answer or the best answer. Some questions will have multiple possibilities but one best answer that experts would agree on.

Writing distractors is probably the most difficult and most important part of building multiple-choice items. Distractors should be designed around common errors that students make or misconceptions they may have. A useful strategy in designing distractors is to first phrase the item as a completion or short-answer question. Think of the incorrect responses that students would be likely to give to that question, and let these be the distractors in the multiple-choice item. The distractors must be incorrect, but they should be plausible enough to attract students who do not know the material very well. Avoid writing absurd distractors. While they may be humorous or lighthearted, they increase the likelihood that students will guess the correct response, because they narrow the range of plausible answers.
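One way to put this strategy into practice is to give the item first as a short-answer question and then tally the wrong answers students actually produce. The sketch below assumes the responses have already been typed into a list; the function name `suggest_distractors` and the sample responses are hypothetical and serve only as illustration.

```python
from collections import Counter

def suggest_distractors(responses, correct_answer, how_many=3):
    """Return the most common incorrect responses to a short-answer item.

    Hypothetical helper: answers are compared without regard to
    capitalization or extra spaces, and blank responses are ignored.
    """
    def normalize(text):
        return " ".join(text.lower().split())

    correct = normalize(correct_answer)
    wrong = Counter(
        normalize(response)
        for response in responses
        if normalize(response) and normalize(response) != correct
    )
    return [answer for answer, _ in wrong.most_common(how_many)]

# Invented responses to "Which appeal relies on the writer's credibility?"
responses = ["ethos", "pathos", "Pathos", "logos", "pathos",
             "kairos", "logos", "Ethos", ""]
print(suggest_distractors(responses, "ethos"))
```

The instructor still has to judge whether each suggested distractor is clearly incorrect and plausible; the tally only shows which errors are common.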

In cloze deletion, such as “There are _____ members of the U.S. House of Representatives and _____ members of the Senate,” students get minimal cues and must construct the answer. Cloze deletion items should be answered with a single word or phrase, and statements should be worded so that they have only one right answer. For example, “The Battle of Lexington was fought in ______” can be answered in several ways (with a place or with a date), and it should be reworded to “The Battle of Lexington was fought in the year _____.” In multiple-choice tests, by contrast, students need only recognize the correct response among the responses given; the demand on students is therefore greater for cloze deletion items.
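Cloze items are simple enough to generate from a list of statements. The following sketch is illustrative only: the function name `make_cloze` is hypothetical, and the fixed-width blank is an assumption chosen so that blank length gives no clue to the answer.

```python
BLANK = "_____"  # one fixed width for every blank

def make_cloze(statement, answers):
    """Replace each answer phrase in the statement with a blank.

    Hypothetical helper: returns the cloze item and its answer key.
    Each answer is blanked at its first occurrence only.
    """
    item = statement
    for answer in answers:
        if answer not in item:
            raise ValueError(f"{answer!r} does not appear in the statement")
        item = item.replace(answer, BLANK, 1)
    return item, list(answers)

item, key = make_cloze(
    "The Battle of Lexington was fought in the year 1775.", ["1775"]
)
print(item)
print("Answer key:", key)
```

The wording of the statement still does the real work: the code only inserts the blank, so the instructor must make sure the surrounding sentence permits a single right answer.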

Also bear in mind the saying “a picture is worth a thousand words.” Graphical occlusion can work very well for charts, mind maps, or images with captions. It works like cloze deletion, but instead of a missing phrase it uses a missing image component. Mind maps, charts, and other diagrams can be effective learning tools, and graphical occlusion allows students to be tested on the information in these images.

Concluding Discussion

There is much disciplinary content that can and should be tested in composition. For example, we can test students on the various stages of the writing process and the different strategies that can be used in each one. Similarly, there are many discipline-specific terms that students need to know, such as “diction,” “syntax,” “thesis-driven,” “transitions,” “juxtaposition,” and so on. Since we teach several different genres in composition, we can also test the conventions of each genre, such as the use of character development and dialogue in personal essays, and logical arguments and textual evidence in persuasive research writing. And, of course, students can be tested on their knowledge of grammar and punctuation rules and their ability to fix editorial errors.

I have argued in this project that tests can increase the knowledge that students have at their disposal. Tests, alongside our current practices in composition (student-centered and dialogic instruction, multiple drafts, and active writing time), should improve creativity and help students gradually improve their writing. Unfortunately, tests are widely reviled for their excessive role in assessment, which can result in a blanket dismissal of all forms of testing. However, abundant research shows that tests can be very effective learning tools. Studies in cognitive science and neurophysiology both show that retrieval practice is a necessary condition for long-term retention, and tests can provide retrieval opportunities for students. It is my hope that this project will encourage instructors of all kinds, but especially composition instructors, to integrate frequent tests into their own classes.

WORKS CITED

Abercrombie, M. L. J. The Anatomy of Judgment; an Investigation into the Processes of

Perception and Reasoning. New York: Basic, 1960. Print.

Ambrose, Susan A. How Learning Works: Seven Research-based Principles for Smart

Teaching. San Francisco, CA: Jossey-Bass, 2010. Print.

Aristotle. Aristotle's Psychology: A Treatise on the Principle of Life. S.l.: Hardpress,

2013. Print.

Bacon, Francis. The New Organon. Ed. Lisa Jardine and Michael Silverthorne.

Cambridge: Cambridge UP, 2000. Print.

Bakhtin, Mikhail Mikhaĭlovich. The Dialogic Imagination: Four Essays. N.p.: U of

Texas, 1981. Print.

Ball, Arnetha F., and Sarah Warshauer Freedman. Bakhtinian Perspectives on Language, Literacy, and Learning. Cambridge, UK: Cambridge UP, 2004. Print.

Bruffee, Kenneth A. "Collaborative Learning and the ‘Conversation of Mankind’."

College English 46.7 (1984): 635-52. Web.

Byrne, John H. "Synaptic Transmission in the Central Nervous System." Neuroscience

Online: An Electronic Textbook for the Neurosciences. The University of Texas

Medical School at Houston, 12 Mar. 2014. Web.

Carpenter, Shana K., and Edward L. DeLosh. "Application of the Testing and Spacing

Effects to Name Learning." Applied Cognitive Psychology 19.5 (2005): 619-36.

Web.

Carter, Michael, C. Miller, and A. Penrose. "Effective Composition Instruction: What

Does the Research Show?" Communication in Science, Technology and

Management 3rd ser. (1998): n. pag. Web.

Cepeda, Nicholas. "Distributed Practice in Verbal Recall Tasks: A Review and Quantitative Synthesis." Psychological Bulletin 132.3 (2006): 354. Web.

Chase, Clinton I. "The Impact of Some Obvious Variables on Essay Test Scores."

Journal of Educational Measurement 5.4 (1968): 315-318.

Cliver, Robert. "Discussions on Testing." Personal interview. 10 May. 2012.

Connors, Robert J., and Andrea A. Lunsford. "Frequency of Formal Errors in Current College Writing." College Composition and Communication 39.4 (1988): 395-409. Web.

Cull, William L. "Untangling the Benefits of Multiple Study Opportunities and Repeated

Testing for Cued Recall." Applied Cognitive Psychology 14.3 (2000): 215-35.

Web.

Dempster, Frank N. "The Situation with Respect to the Spacing of Repetitions and

Memory." Journal of Verbal Learning and Verbal Behavior 9.5 (1970): 596-606.

Web.

Dempster, Frank N. "The Spacing Effect: A Case Study in the Failure to Apply the

Results of Psychological Research." American Psychologist 43.8 (1988): 627.

Web.

Dempster, Frank N. "Spacing Effects and Their Implications for Theory and Practice."

Educational Psychology Review 1.4 (1989): 309-30. Web.

Elbow, Peter. Writing Without Teachers. New York: Oxford UP, 1973. Print.

Santos, Fernanda. "Teacher Survey Shows Morale Is at a Low Point." The New York Times, 7 Mar. 2012. Web.

Finkel, Donald L. Teaching with Your Mouth Shut. Portsmouth, NH: Boynton/Cook,

2000. Print.

Fisher, Ronald P., and Fergus I. Craik. "Interaction between Encoding and Retrieval Operations in Cued Recall." Journal of Experimental Psychology: Human Learning and Memory 3.6 (1977): 701.

Fleming, Gerald J., and Meredith Pike-Baky. Rain, Steam, and Speed: Building Fluency

in Adolescent Writers. San Francisco: Jossey-Bass, 2005. Print.

Freire, Paulo. Pedagogy of the Oppressed. New York: Herder & Herder, 1971. Print.

Glover, John A. "The 'Testing' Phenomenon: Not Gone but Nearly Forgotten." Journal of

Educational Psychology 81.3 (1989): 392. Web.

Graff, Gerald, Cathy Birkenstein, and Russel Durst. They Say / I Say: The Moves That Matter in Academic Writing: With Readings. 2nd ed. New York: W.W. Norton, 2012. Print.

Halliday, Michael. "Towards a Language Based Theory of Learning." Linguistics and

Education 5.2 (1993): 93-116. Web.

Hintzman, Douglas L. "Judgments of Frequency and Recognition Memory in a Multiple-Trace Memory Model." Psychological Review 95.4 (1988): 528. Web.

Jacobs, L. C., and C. I. Chase. Developing and Using Tests Effectively: A Guide for

Faculty. San Francisco, CA: Jossey-Bass, 1992. Print.

Jacoby, Larry L. "On Interpreting the Effects of Repetition: Solving a Problem Versus Remembering a Solution." Journal of Verbal Learning and Verbal Behavior 17.6 (1978): 649-667.

James, H. W. "The Effect of Handwriting upon Grading." The English Journal 16.3

(1927): 180-185.

Kang, Sean H. K., Kathleen B. McDermott, and Henry L. Roediger III. "Test Format and Corrective Feedback Modify the Effect of Testing on Long-Term Retention." European Journal of Cognitive Psychology 19.4-5 (2007): 528-558.

Kastenbaum, Steve. "The High Stakes of Standardized Tests." Schools of Thought. CNN,

26 Mar. 2012. Web.

Keyes, Ralph. The Writer's Book of Hope: Getting from Frustration to Publication. N.p.:

Macmillan, 2003. Print.

Klein, Stephen B. Learning: Principles and Applications. N.p.: Sage Publications, 2011.

Print.

Knight, J. K., and W. B. Wood. "Teaching More by Lecturing Less." Cell Biology

Education 4.4 (2005): 298-310. Web.

Kolers, Paul A., and Henry L. Roediger, III. "Procedures of Mind." Journal of Verbal

Learning and Verbal Behavior 23.4 (1984): 425-49. Web.

Landauer, Thomas K., and Lynn Eldridge. "Effect of Tests without Feedback and

Presentation-Test Interval in Paired-Associate Learning." Journal of Experimental

Psychology 75.3 (1967): 290. Web.

Lee, Carol D., and Peter Smagorinsky. Vygotskian Perspectives on Literacy Research:

Constructing Meaning through Collaborative Inquiry. Cambridge: Cambridge

UP, 2000. Print.

Lewis, Corey. "Discussions on Testing." Personal interview. 4 Apr. 2012.

Marshall, Jon C., and Jerry M. Powers. "Writing Neatness, Composition Errors, and

Essay Grades." Journal of Educational Measurement 6.2 (1969): 97-101.

Marsh, Elizabeth J. "The Memorial Consequences of Multiple-Choice Testing."

Psychonomic Bulletin & Review 14.2 (2007): 194-99. Web.

McDaniel, Mark A., Henry L. Roediger, and Kathleen B. McDermott. "Generalizing

Test-Enhanced Learning from the Laboratory to the Classroom." Psychonomic

Bulletin & Review 14.2 (2007): 200-06. Web.

McDaniel, Mark A. "Testing the Testing Effect in the Classroom." European Journal of

Cognitive Psychology 19.4 (2007): 494-513. Web.

Melton, Arthur W. "The Situation with Respect to the Spacing of Repetitions and Memory." Journal of Verbal Learning and Verbal Behavior 9.5 (1970): 596-606. Web.

Mentzer, Thomas L. "Response Biases in Multiple-Choice Test Item Files." Educational and Psychological Measurement 42.2 (1982): 437-448.

Morris, C. Donald, John D. Bransford, and Jeffery J. Franks. "Levels of Processing Versus Transfer Appropriate Processing." Journal of Verbal Learning and Verbal Behavior 16.5 (1977): 519-533.

Moscovitch, Morris, and Fergus Craik. "Depth of Processing, Retrieval Cues, and Uniqueness of Encoding as Factors in Recall." Journal of Verbal Learning and Verbal Behavior 15.4 (1976): 447-458.

Pellegrino, James W., Naomi Chudowsky, and Robert Glaser, eds. Knowing What Students Know: The Science and Design of Educational Assessment. N.p.: National Academies, 2001. Print.

Perry, Andre. "Education Reform Starts With Community Reform." Drandreperry.com.

N.p., 25 Feb. 2012. Web.

Phillips, Cecilia. "The Basics: Ion Channels Underlie Neuron Communication." Whirling

Whips: News and Stories about Neurotoxins. N.p., 12 Mar. 2014. Web.

Robinson, Ken. "Changing Education Paradigms." Youtube.com. RSA Animate, The

Royal Society of Arts, London, 2010. Web.

Roediger, Henry L., David A. Gallo, and Lisa Geraci. "Processing Approaches to Cognition: The Impetus from the Levels-of-Processing Framework." Memory 10.5-6 (2002): 319-332.

Roediger, Henry L., and Jeffrey D. Karpicke. "The Power of Testing Memory: Basic

Research and Implications for Educational Practice." Perspectives on

Psychological Science 1.3 (2006): 181-210. Web.

Roediger, Henry L., and Jeffrey D. Karpicke. "Test-Enhanced Learning: Taking Memory

Tests Improves Long-Term Retention." Psychological Science 17.3 (2006): 249-

55. Web.

Roediger, Henry L. "Implicit Memory: Retention Without Remembering." American Psychologist 45.9 (1990): 1043-56. Web.

Runquist, Willard N. "Some Effects of Remembering on Forgetting." Memory & Cognition 11.6 (1983): 641-650.

Ruch, Theodore C. "Factors Influencing the Relative Economy of Massed and

Distributed Practice in Learning." Psychological Review 35.1 (1928): 19. Web.

Sapolsky, Robert M. Biology and Human Behavior: The Neurological Origins of

Individuality. Chantilly, VA: Teaching, 2005. Print.

Shepherd, Everett M. "The Effect of the Quality of Penmanship on Grades." The Journal

of Educational Research (1929): 102-105.

Tate, Gary, Amy Rupiper, and Kurt Schick. A Guide to Composition Pedagogies. New

York: Oxford UP, 2001. Print.

Vygotsky, Lev S. Mind in Society: The Development of Higher Psychological Processes. Trans. Michael Cole and Vera John-Steiner. Cambridge, MA: Harvard UP, 1978. Print.

Cull, William L., John J. Shaughnessy, and Eugene B. Zechmeister. "Expanding Understanding of the Expanding-Pattern-of-Retrieval Mnemonic: Toward Confidence in Applicability." Journal of Experimental Psychology: Applied 2.4 (1996): 365. Web.

Winston, Janet. "Discussions on Testing." Personal interview. Feb. 2012.