The Study of Creativity in Theory and Practice

Kenny L. Hicks and Randall W. Engle

Georgia Institute of Technology

Abstract

Despite decades of scholarship devoted to its importance, the cognitive drivers of creative behaviors and products remain poorly understood. Although previous research has proposed a relationship between the convergent processes of creativity and higher order cognition, studies investigating the relationship between divergent thinking and fluid intelligence have revealed little to no relationship between the two. In this article, we review the noteworthy scholars and debates in the field of creativity and the various methodological approaches used to define the construct. We propose that previous failures to find a relationship between intelligence and creativity are due, in part, to researchers’ emphasis on the differences between the two constructs instead of focusing on their commonality. In this study, we view the creativity construct through the lens of problem-solving across spatial and verbal domains. Using a simple scoring procedure that rates verbal and spatial creative responses, we show some of the most robust relationships between measures of creativity and intelligence to date. Further, our results demonstrate that creativity, verbal fluency, and fluid intelligence share many of the same cognitive processes.

Keywords: Creativity, Divergent Thinking, Convergent Thinking, Fluid Intelligence, Working Memory Capacity, Torrance Test of Creativity

An Introduction to Creative Faculties

For well over a century, researchers have tried to understand the nature of creativity through individual differences in both the quality and quantity of creative ideas. H. L. Hargreaves was one of the first to conduct empirical research on the issue in his large-scale correlational study of creativity in 1927. The work was ahead of its time, and it received little attention. This was likely due to Hargreaves' finding that creativity consisted of multiple factors, which contradicted the g theory put forward by his mentor, Charles Spearman (1927).

Hargreaves was also the first researcher to combine both the experimental and correlational methods to study creativity across a wide range of intelligence and memory tasks. What is unique about this early work on creativity is that it is the first attempt to distinguish the quantity of ideas a subject can generate (fluency) and the quality of the ideas they could generate

(originality). Although Hargreaves speculated that both intelligence and memory were important factors in the creative process, he could not commit to a clear theoretical account for two reasons. First, many of the tests he used demonstrated poor reliability. The second issue is that he could not distinguish between the number of ideas a subject could generate and the quality of those ideas. This result has been replicated many times and has led many researchers to conclude that verbal fluency and creativity are in fact the same ability (Simonton, 1985). In other words, subjects who generate more responses are more likely to generate higher quality responses.

Unfortunately, the field of creativity hasn’t advanced much since Hargreaves’ work, and the cognitive mechanisms driving creative thought remain poorly understood.

Researchers investigating the nature of intelligence have become victims of their own success to some degree. The rise of better methodologies, paradigms, and theoretical accounts of the construct has led to widespread disagreements about its underlying nature. Although research interest in the field of creativity rivals that of intelligence, researchers have made little progress toward understanding or measuring the construct. Unlike other areas of psychological research and practice, the study of creativity has fallen behind in both theoretical and applied advancements. This criticism is echoed in nearly every review of creativity research since the

1960s, across several branches of psychology including educational psychology (Jackson &

Messick, 1965; Cronbach, 1968; Hocevar, 1979, 1981; Plucker, Qian, & Wang, 2011), organizational psychology (Mumford & Gustafson, 1988), cognitive psychology (Horn, 1976;

Eysenck, 1994), and more recently in neuroscience (Dietrich, 2006).

Human Abilities

In the concluding remarks of Human Cognitive Abilities, John Carroll wrote that no factor has "scientific status unless it is restated to refer to the abilities that compose it" (1974). In other words, the most informative constructs are those that are linked to simpler cognitive processes. Understanding how subjects use and allocate cognitive resources during activities that require novelty will be an important part of any training or development program in creative problem-solving. The processes that give rise to creative thought and skill reveal what learning goals and potential transfer, if any, we can expect from these efforts (Cronbach & Snow, 1977;

Snow & Lohman, 1984).

Spearman’s discovery of g relied on observations of positive manifold, which is defined as the common variance among tests of mental ability. The emergent quality of g became a powerful predictor of individual differences in selection, but its underlying nature remained a theoretical black-box. To uncover the processes that govern it and other mental abilities of potential value such as creativity, early theorists in the field of intelligence developed the underlying mathematics and instruments known today as psychometrics (Pearson, 1908;

Spearman, 1904; Thurstone, 1931). Spearman's unitary theory might have stayed a monolith in the world of intelligence if Thurstone had not proposed his own perspective on the multidimensional nature of intelligence. With this insight, he took Spearman's perspective and transformed it into a new method referred to today as factor analysis (Thurstone, 1938a).

Researchers could then ask new research questions and test new theories about mental abilities, both related to and distinct from intelligence. We base our current understanding of human abilities on these early advances in classical test theory.

In response to these issues, we begin our paper with a historical review of creativity, highlighting the methodological and practical challenges that have come to define its measurement. We examine the key concepts, noteworthy scholars, and crucial debates that are most relevant to the study of creativity. Creativity research is represented by a wide range of disciplinary perspectives, theoretical orientations, and methodological practices. The validity and real-world relevance of this work depends on our ability to create and discover common ground.

Creativity as Problem-Solving

Numerous research papers have been devoted to the problem of determining whether creative thinking is any different from the more traditional concept of intelligence. When evaluating complex constructs such as creativity, cognitive and educational psychologists have noted the similarities between popular task paradigms used in creativity and problem-solving tasks. Several researchers have described creativity tests as an advanced type of problem-solving

(Newell, Shaw, and Simon, 1958; Vincent, Decker & Mumford, 2002). This link is not surprising. Tests of creativity and problem-solving both require idea and hypothesis generation, use of heuristics, evaluative thinking and likely share many common cognitive processes.

Before we discuss creativity in relation to problem-solving or any particular cognitive abilities such as working memory capacity and attention, it's important to first describe the structure and

defining characteristics of tasks used to study creativity. Tests of creativity can be categorized as ill-defined or well-defined. A divergent thinking task is an archetypal example of an ill-defined problem (see Figure 1A). Tasks like this are unstructured, usually guided only by a short prompt or instruction (Getzels & Csikszentmihalyi, 1976). Subjects are given a problem and asked to generate multiple novel strategies to solve the problem. There is no single correct or even optimal strategy for solving the problem.

Figure 1. A) Divergent thinking task (ill-defined) (Torrance, 1990). B) Remote Associates Test item (well-defined) (Mednick & Mednick, 1967).

Some researchers measure creativity with well-defined tasks (Figure 1B). A popular example is The Remote Associates Test, a convergent thinking task (Mednick & Mednick,

1967). These tasks share many defining characteristics with traditional measures of reasoning.

Both have a clearly defined goal and a single, correct solution. During a remote associates task,

subjects see three word cues (associates). They have to think of a word that ties the three words together. Unlike tasks that feature ill-defined problems, these tasks require a single, correct solution.

Like the intelligence tests developed by the American psychologist Lewis Terman and the quick-scoring mental ability tests that followed from his student Arthur Otis, most of the cognitive tests used in psychological research feature well-defined problems. Well-defined problems are objective, governed by a strict set of rules that dictate how they should be solved. Ease of scoring and compiling normative data from large-scale datasets made these tasks optimal for selection and development in the military and beyond after WWI.

As an example, refer to a matrix reasoning problem from the Raven's Advanced Progressive Matrices (RAPM), a test that measures fluid intelligence (Figure 2). This test item requires test-takers to review a series of figures so they can determine a pattern and identify which figure should come next in the sequence.

Figure 2. Matrix reasoning item from the Raven's Advanced Progressive Matrices (RAPM) (Raven, 1990). Subjects study the pattern in the series of shapes and deduce the next shape in the series.

Another construct of relevance to the study of creativity is verbal fluency. The verbal fluency task paradigm is influenced by Thurstone's Word Fluency Test, a component of his

Primary Mental Abilities Test (Thurstone, 1938; Thurstone & Thurstone, 1949).

Similar to divergent thinking, verbal fluency tasks require the subject to generate a list of ideas or responses to a given prompt. The key difference is that while there can be many correct solutions, the problem space itself is considered to be more structured and predictable because the parameters for solving it are clear (for instance, the straightforward instruction to list ‘animal names’ or ‘words beginning with the letter c’). Interestingly, tasks that present subjects with ill- defined problem spaces are thought to be more complex and cognitively demanding than well-

defined tasks that can be solved in a more prescriptive manner (Simon, 1973; 1978;

Newell, Shaw, & Simon, 1962). Tasks of verbal fluency are also incredibly multifaceted. Writing down animal names or occupations seems simple, but it’s highly demanding in terms of attention control. A subject’s search for appropriate responses involves strategic search of memory. It’s also important that the subject inhibit or forget previous responses or response categories when they are no longer fruitful (Shipstead et al, 2016). In addition, success on the task depends on a subject’s ability to produce novel strategies that help them move away from pre-potent responses and onto new ideas and exemplars. In line with the optimal foraging perspective, subjects that perform the best are better at knowing when it’s time to change strategies.

It isn't clear, however, that creative thought engages fundamentally different cognitive processes than other more well-defined problem-solving activities. For example, even though many researchers suggest that divergent and convergent production are part of the creative process (Carroll, 1993; Runco, 1994; Eysenck, 2003; Cropley, 2006), there isn't enough empirical support to date to endorse this view one way or the other. Further, it's also possible that open-ended tasks that require subjects to generate ideas, whether they be original or based on cues, rely on the same cognitive mechanisms. Creativity researchers tend to disagree over which paradigm – divergent thinking, convergent thinking, or even fluency – is most fitting for the measurement of creativity. Their preference for one paradigm over another makes it difficult to build on previous work. One of our primary research aims is to explore the nature of creativity. We accomplish this in one way by studying its relationship to convergent thinking.

Further, we also examine the relationship between creativity and verbal fluency. Specifically, we assess whether tests of creativity can be equated with tests of verbal fluency, exploring several of the primary measurement challenges in previous research on this topic.

In the next section, we describe the next turn in creativity research. Most scholars and historians of creativity are aware that J. P. Guilford's Structure of Intellect (SOI) paved the way for research on creativity (1956). Notably, his 1950 American Psychological Association address on creativity galvanized interest in the subject. Nearly all research on creativity references this keynote address, but few take the time to describe its specific context.

The “Structure” of Creativity

In the 1950s and 60s, Guilford was motivated by interests in U.S. educational reform and mounting pressure from the government to improve personnel selection. In his 1950 address, he argued that the general intelligence measures of the day, including the Stanford-Binet, Otis, Wechsler Adult Intelligence Scale (WAIS), and Raven's Progressive Matrices, failed to tap what he viewed as our "most important national resource": creative abilities. Guilford's answer was a research program focused on finding unique abilities that were unrelated to intelligence.

During this time, Guilford and his colleagues worked at the US Air Force Personnel and

Training Research Center where they focused on adapting dozens of classic tests, and developing new ones, to measure creativity. The most enduring of these are the divergent thinking tasks

Unusual Uses (1960) and Alternate Uses where subjects think of as many novel ways as possible to use common household objects (e.g., a brick or a knife). Another popular divergent thinking measure still in use today is the Consequences test where subjects are asked to describe what problems might result from a hypothetical situation – What if people no longer needed sleep?

(Christensen, Guilford, & Wilson, 1959).

Guilford based his Structure of Intellect (SOI) model on many of his large-scale factor analytic studies of creativity. Of the 100+ factors included in the SOI model, Guilford identified sixteen factors related to creative production. The most enduring of these factors are named in

Figure 3 (for a complete list, please see Guilford, 1967, p. 138). Further details on this model are beyond the scope of this paper (see Carroll, 1968b, 1972). However, it is important to know that the intelligence community ultimately rejected Guilford's SOI due to flaws in the factor analyses Guilford used to develop it (Cronbach & Snow, 1977; Eysenck, 1967; Vernon, 1961). Horn and

Knapp (1973;1974) demonstrated that the method used as evidence for the Structure of Intellect had no validity. In fact, it fit randomly generated data as well as data collected from actual subjects. Subsequent re-analyses of Guilford’s own data failed to support the Structure of

Intellect model (Harris & Liba, 1965). Carroll writes that Guilford's SOI model is an "aberration in the history of intelligence models" (1993, p. 60). Carroll goes on to say "that so much attention has been paid to it is disturbing, to the extent that textbooks and other treatments of it have given the impression that the model is valid and widely accepted, when clearly it is not" (p.

60).

This is unfortunate because the spirit of the Structure of Intellect has lived on to be more than a cautionary tale about construct measurement and model fit. Guilford’s work in this area would inspire educational psychologist E. Paul Torrance to base his Tests of Creativity on task dimensions and scoring criteria highlighted in Guilford’s Structure of Intellect (SOI) model

(1965;1974). To date, the Torrance tests have been administered hundreds of millions of times and translated into 50 languages in public and private sectors. Despite extensive criticism over the years from the intelligence research community, creativity research remains grounded in the divergent and convergent thinking abilities outlined in Guilford’s Structure of Intellect model

(1956; 1967).

Figure 3. Widely used task dimensions and scoring criteria for divergent thinking tasks, based on Guilford's divergent production dimension in the Structure of Intellect.

A recurrent problem in creativity research is the way tasks are scored. When researchers score subjects on the number of responses they can generate on a task to a given prompt, it’s not surprising that scores of fluency, flexibility, and originality would be highly correlated (Figure 3 includes common task dimensions and scoring criteria). For this reason, many researchers have argued that these scoring procedures are inherently confounded because they all depend on fluency in some way (the number of responses) (May & Metcalf, 1965; Hocevar, 1978; 1979,

Runco, 1985). While many researchers acknowledge the creativity-fluency confound, few have addressed the issue in a theoretically meaningful way.

Fluency-based scoring rules cannot be easily applied to tasks that require creative responses in the spatial or figural domains, and only a few studies have used non-verbal forms of divergent thinking tasks. In fact, these tasks are routinely excluded from studies that use the

Torrance Test of Creativity because they do not conform to traditional scoring rules and present an added level of complexity for raters (Kim, 2008). It’s important to note that out of the 400

creativity tasks included in her analysis, only 30 were non-verbal despite equal availability to researchers and similar instructions for subjects.

Cognition and Creative Thinking

Despite the obvious commonality between creativity and problem-solving activities

(Newell & Simon, 1972; Newell, 1980; Sweller, Polson & Jeffries, 1985), the creativity literature hasn’t demonstrated a clear link between creativity and memory or attention processes. What we do understand is that creativity likely depends on cognitive faculties that aren’t “intrinsically creative” (Perkins, 1983, Carroll, 1993). Over the years, our lab has been devoted to investigating how working memory capacity and attention control facilitate higher order abilities such as fluid intelligence (Engle, 2002; Kane, et al., 2004). Research on working memory capacity has advanced our theoretical understanding of fluid intelligence and has proven to be one of its strongest predictors (Conway et al., 2005). Further, researchers have identified several lower order processes important to the working memory system (e.g., controlled attention, resistance to interference, and mind-wandering). Based on this work we’ve argued that the predictive validity of working memory capacity is driven by attention control (Engle, 2001,

2002; Engle, Kane, & Tuholski, 1999; Engle, Tuholski, et al., 1999; Kane & Engle, 2004). Specifically, controlled processing accounts for the most variance in fluid intelligence (see Kane et al., 2004;

Shipstead, Harrison, & Engle, 2016). In the current study, we use this theoretical base to explore the relationship between creativity and fluid intelligence. Further, we test if working memory capacity and attention control account for this relationship.

Verbal fluency tasks are used to test a wide range of executive functions by researchers and clinicians. Verbal fluency reflects inhibition (Kane & Engle, 2000), monitoring of memory

(Rosen & Engle, 1997), and retrieval ability (Martin et al., 1994; Troyer, Moscovitch &

Winocur, 1997). It's safe to say that the results, accumulated over decades, provide evidence for more than a single mental ability. Research has shown that verbal fluency involves a network of cognitive abilities including working memory capacity, processing speed, and, uniquely, fluid intelligence (Unsworth et al., 2011; Shipstead et al., 2014). This line of research suggests that there should be a relationship between creativity as measured by fluency and working memory capacity at the latent construct level.

Interestingly, differences between individuals with high and low working memory capacity are visible within a minute of retrieval on verbal fluency tasks (Rosen & Engle, 1997) and are not merely influenced by the number of animal names that high working memory span subjects know (Unsworth, Spillers, & Brewer, 2011). This relationship between working memory capacity and verbal fluency, along with the attentional demands of performing tasks of verbal fluency (Rosen & Engle, 1999; Unsworth et al., 2011), further indicate that working memory capacity and attention control are involved in creative performance.

Convergent thinking tests share many characteristics with traditional measures of reasoning. For instance, tests of fluid intelligence also present subjects with well-defined problems which can only be solved with a single solution. The Number Series test presents subjects with a string of numbers such as, “2, 4, 6, 8, ?.” The goal of each item is to determine the underlying rule which predicts the next number in the sequence. In this case, the rule is to simply add 2 to the previous number, making the correct solution “10.” The primary difference is that fluid intelligence items are multiple choice. Kane and colleagues (2004) found that the remote associates task had a substantial loading on a factor of general reasoning (gF). Working memory capacity was highly predictive of gF (around 0.6). A factor of executive attention also predicted gF in this model at 0.52. Paulewicz, Chuderski, and Necka (2007) found that fluid

intelligence was predictive of a general insight ability comprised of several typical insight indicators (e.g., riddles, object manipulation, geometrical and math problems) and 24 remote associates items (0.87).

Ultimately, overall creative performance should depend on how well the subject can understand and attend to the goals of the task. The creative process itself reflects how the subject navigates the problem space to reach a novel solution (e.g., generating and testing hypotheses, searching for alternative solutions with minimal re-retrieval of discounted hypotheses). Attention control governs our abilities to resist interference, resolve competition, and to maintain goal relevant information. Several researchers have proposed that performance on divergent thinking tasks in particular should be related to attention control (Martindale, 2003; Dorfman &

Gassimova, 2016). In particular, Martindale (1989, 2002) observed that the instruction to “be creative” on ideation tasks slowed down information processing. Notably, individuals with high working memory capacity had faster reaction times on creativity tests. Investigations of sleep deprivation and working memory capacity in expert pilots show that fatigue impairs working memory capacity (Lopez, et al., 2012). Similarly, sleep deprivation also negatively impacts divergent thinking performance in contrast to rule-based complex tasks such as tests of convergent thinking abilities (Horne, 1988; Harrison & Horne, 2000).

Prior Psychometric Work on Creativity

Researchers investigating the processes that facilitate performance on measures of divergent thinking have found evidence for the role of broad retrieval ability (Silvia et al., 2013) and fluid intelligence (Silvia, 2008; Nusbaum & Silvia, 2011a; Benedek et al., 2012). Although work by Silvia and colleagues is interesting, the relationship between creativity and working memory capacity remains unclear. Although a recent publication by Lee et al., (2014) attempted

to shed light on the role of working memory capacity, executive functions, and creativity, the study suffers from several methodological shortcomings. Lee et al. (2014) modeled relationships among intelligence, working memory capacity, and divergent and convergent measures of creativity using structural equation modeling. However, a closer review complicates meaningful interpretation. First, creativity scores on the Torrance Test of Creativity were given by a single rater. The authors claim that the reliability of this test is near 1.0. This is often cited as ample evidence of the merits of a particular test, but even exceptionally high reliabilities can fail to translate to validity (Bollen, 1990). Although the lack of any reliability estimates is problematic, a larger problem is found when conducting a reanalysis of their data. We performed a confirmatory factor analysis (CFA) including all factors reported in the paper. A factor of convergent thinking correlated with other constructs above 1.0, indicative of model misspecification and/or non-normal data. Fit statistics suggested that the specified CFA was a poor fit to the actual underlying data (CFI = .79, NNFI = .74, RMSEA = .08, SRMR = .07) (Lee et al., 2014).

Research Contributions

The study of creativity is based on the shared idea that it is (or should be) socially valuable and useful. Its value to our productivity is undeniable, but researchers have spent far more time investigating how problem solving, intelligence, and creativity differ than investigating their similarities. However, we argue that all effortful activities require cognitive control, especially tasks that ask subjects to solve problems in new and interesting ways. To this end, we investigate whether a general 'problem-solving' or 'reasoning' factor emerges from the constructs of fluid intelligence, creativity, and verbal fluency.

Research on working memory capacity has advanced our theoretical understanding of the underlying cognitive processes that drive individual differences in fluid intelligence (Engle, 2002; Kane, et al., 2004). We use this research to help us understand the component processes of divergent thinking, convergent thinking, and verbal fluency, three classic perspectives considered in creativity research.

Method

Data were collected from students at the Georgia Institute of Technology, community members in Atlanta, Georgia, and participants at Indiana University Purdue-Columbus (IUPC) in

Columbus, Indiana. Subjects were between 18 and 30 years of age. University subjects received partial credit for a course or were paid for participation. Community subjects received monetary compensation. Participants were run in groups of 1-5. All tasks were administered on a computer. However, some tasks required responses to be written down on a sheet of paper provided by the experimenter (Torrance test of creativity and verbal fluency). The current study includes data from 445 subjects.

Procedure

We developed a subjective scoring procedure that placed a high emphasis on scoring original responses. Three raters were told to evaluate only those responses that evoked a subjective impression of creativity. This way each rater only considered creative responses when making their judgments. Raters then assigned scores of 1-3 on four specific attributes that best classified the reason the response was creative (humor, insight, irony, knowledge; see Appendix section for scoring procedure). We rated creativity with a common rule set for both verbal and spatial tests of creativity. This allowed for domain general measurement where the common variance between these constructs could represent a factor of creativity that could not be obtained

through fluency-based scoring. Our newly developed procedure allows raters to quickly identify and describe creative figural responses, making the overall creativity score more robust by streamlining the scoring process across all three tests of divergent thinking found in the

Abbreviated Torrance Test for Adults (ATTA). We researched best practices in qualitative scoring of creative responses and adopted many of the recommendations laid out in the consensual assessment technique (Amabile & Hogan, 1982; Hennessey & Amabile, 1988).
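To make the scoring rule concrete, the sketch below illustrates one way the attribute ratings could be aggregated into a per-subject score for a single rater. The aggregation rule shown (summing the four 1-3 attribute ratings per flagged response, then totaling across a rater's flagged responses) is an assumption for illustration only; the exact procedure is defined in the Appendix.

```python
# A minimal sketch of the rater-based scoring described above (not the authors' code).
# Assumption: a rater's score for a subject is the sum across the responses they flagged.

ATTRIBUTES = ("humor", "insight", "irony", "knowledge")

def response_score(ratings):
    """Sum a single rater's 1-3 attribute ratings for one flagged response."""
    return sum(ratings.get(attr, 0) for attr in ATTRIBUTES)

def rater_score(flagged_responses):
    """Total score one rater assigns a subject across all flagged responses."""
    return sum(response_score(r) for r in flagged_responses)

# Example: one rater flagged two of a subject's responses as creative.
flagged = [
    {"humor": 2, "insight": 1},          # unrated attributes default to 0
    {"insight": 3, "knowledge": 2},
]
print(rater_score(flagged))  # -> 8
```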

Task Descriptions

Divergent thinking - Torrance Test of Creativity (Goff & Torrance, 2002)

Because a secondary goal of this study was to determine whether a single factor would converge across spatial and verbal domains of divergent thinking, we selected one of the most popular tests used to assess creativity (Kim et al., 2005) – the Abbreviated Torrance Test for Adults

(ATTA) (Goff & Torrance, 2002). This test is a shortened version of the Torrance Test of

Creative Thinking (TTCT) that takes 15 minutes to administer and consists of one verbal and two figural divergent thinking tasks. Descriptions and images of each task are included below.

These tasks are nearly identical to earlier paradigms used to assess divergent thinking and reasoning (Guilford, 1957; Guilford, Frick, Christensen, & Merrifield, 1957).

The verbal task asks the subject to generate as many responses as possible to the question: "imagine what problems would arise if you could walk on air or fly" (Figure 4).

Figure 4. Activity 1 “Consequences” from the Abbreviated Torrance Test for Adults

The figural tasks were “Picture Completion” and “Picture Construction,” and required the subject to use abstract shapes to create an “unusual” and “interesting” picture. In the “Picture

Completion" task subjects received the written instruction to: "Use the incomplete figures below to make some pictures. Try to make your pictures unusual. Your pictures should communicate as interesting and as complete a story as possible." (Figure 5). For "Picture Construction," subjects are instructed to: "See how many objects or pictures you can make from the triangles below, just as you did with the incomplete figures. Remember to create titles for pictures." (Figure 6).

Figure 5. Activity 2 "Picture Completion" from the Abbreviated Torrance Test for Adults.

Figure 6. Activity 3 "Picture Construction" from the Abbreviated Torrance Test for Adults.

Convergent thinking – Remote Associates Test (RAT)

The RAT (Mednick, 1962; Mednick & Mednick, 1967) is a measure of convergent thinking that requires participants to find a solution associated with a presentation of three word cues (Figure 7). This test is based on Mednick's associative theory of creativity, explained as "the forming of associative elements into new combinations which either meet specific requirements or are in some way useful" (see Figure 7, adapted from Mednick, 1962). In his theory, Mednick maintained that creative individuals have flatter associative hierarchies. He proposed that they were better at making remote or distant connections. As an example, a creative thinker would retrieve more remote associations to a prompt of "table" than a less creative thinker would. The creative individual is more likely to retrieve a more distant association (e.g., "leg") while a less creative individual would have a steep associative hierarchy where the cue triggers responses with higher associative strength (e.g., "chair" or "cloth").

Figure 7. Adapted from “The associative basis of the creative process,” by Mednick, S. A., 1962, Psychological Review, 3, 220–232. Copyright by the American Psychological Association.

The Remote Associates Test has demonstrated its psychometric utility above other measures of CT. For instance, other common paradigms in use that reflect convergent ability (e.g., the nine-dot problem) have demonstrated poor reliability (Chuderski, 2014).

Verbal fluency

These tasks are based on Thurstone's Word Fluency Test, a component of his Primary

Mental Abilities Test (Thurstone, 1938; Thurstone & Thurstone, 1949). In this experiment, three verbal fluency tasks were used. All three followed the same general procedure.

Two of the tasks were semantic fluency tasks in which subjects were asked to list as many names in a given category as they could – jobs or animal names. We also administered a letter fluency task, often referred to as phonemic fluency. On this task, the subject lists as many words that start with the letter "c" as possible. Test-takers were given a sheet of paper. The computer provided them with a category. The test-takers then spent two minutes writing as many category exemplars as possible. The end of the task was signaled by a beep played via headphones. The dependent variable was the number of unique, cue-relevant exemplars produced.
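As a concrete illustration of this dependent variable, the snippet below counts unique, cue-relevant exemplars for a hypothetical list of responses. The normalization and relevance check are assumptions for illustration; the actual responses were handwritten and scored by hand.

```python
# Sketch of the verbal fluency dependent variable: unique, cue-relevant exemplars.
def fluency_score(responses, is_relevant):
    """Count unique responses that satisfy the cue (e.g., words beginning with 'c')."""
    seen = set()
    for raw in responses:
        word = raw.strip().lower()
        if word and is_relevant(word) and word not in seen:
            seen.add(word)
    return len(seen)

# Example for the letter fluency task:
responses = ["cat", "Cat", "car", "dog", "candle"]
print(fluency_score(responses, lambda w: w.startswith("c")))  # -> 3
```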

Working memory capacity

The Automated Operation Span (Unsworth et al., 2005) requires the subject to first complete a practice procedure in which they answer a series of simple math operations (1x2 + 1

= ?); after the math practice, subjects’ maximum time allotted to solve the math problems on the real trials is calculated by their mean reaction time plus 2.5 standard deviations. Subjects also perform a practice procedure where they are presented with two letters and are required to recall them in the order they were presented. After the practice phase, subjects are presented with the real trials that combine the math and letter procedures of the experiment. Subjects are presented

with 15 trials, each consisting of 3-7 randomized letters interleaved with simple math operations. After each list is complete, subjects are required to recall the letters in the order presented.
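The practice-based response deadline can be expressed directly. The sketch below assumes the population standard deviation, since the task description does not specify sample versus population SD.

```python
# Compute the math-operation time limit: mean practice RT plus 2.5 standard deviations.
from statistics import mean, pstdev

def math_time_limit(practice_rts_ms):
    """Maximum allowed response time (ms) on the real trials for one subject."""
    return mean(practice_rts_ms) + 2.5 * pstdev(practice_rts_ms)

print(round(math_time_limit([900, 1100, 1000, 1200, 800])))  # -> 1354
```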

The Symmetry Span (Unsworth, Redick, Heitz, Broadway, & Engle, 2009) task is a spatial version of the complex span which requires the subject to judge whether a picture is symmetrical while remembering 2-5 specific locations highlighted on a 4x4 grid. We also administered the

Reading Span task. Subjects read several sentences that are followed by a letter that they are asked to remember. After several iterations of this, subjects are asked to recall the letters seen during the trial in correct serial order. The subject is evaluated on the total number of letters recalled.

In the rotation-span task, the subject sees a letter that is rotated to one of eight angles on the computer screen (e.g., the letter "f" turned clockwise on its side). Subjects then see the statement "This letter is facing in the normal direction" and must judge whether the letter is in the normal direction or mirror-reversed. Following the rotation judgement, subjects see either a long or short arrow pointing in one of eight directions. After several of these letter and arrow presentations, subjects see a recall screen with 8 large arrows and 8 small arrows. They are asked to click on the arrows in the order in which they had seen them. There were three to ten arrows presented during each trial.

Figure 8. Examples of (a) the operation span, where letters are interleaved with simple math problems, and (b) the symmetry span, where subjects make symmetry judgments and remember the correct position of each highlighted square (Harrison et al., 2013).

Fluid intelligence

Three fluid intelligence tasks were administered, including the Raven's Advanced Progressive Matrices (RAPM) (Raven, 1990), the Number Series (Thurstone, 1938), and the

Letter Sets task (Ekstrom, French, Harman, & Dermen, 1976). In the RAPM, a matrix reasoning task, subjects are shown a 3x3 grid of images with the bottom right image missing. Subjects were asked to choose the image that logically belongs in the bottom right of the grid from a list of available options. In the Number Series task subjects are shown a series of numbers that follow a logical rule. The subject’s task is to determine the next number in the sequence and to choose it from a set of available options. The Letter Sets task presents the subject with groups of letters

(e.g., AAAA, BBBB, CCCC, and ZXYB). Participants must select the group of letters that does not follow the same rule that governs the other sets of letters.

Visual Arrays

Three versions of the Visual Arrays task paradigm (Luck & Vogel, 1997) were administered. Each trial begins with the brief presentation of an arrangement of simple shapes such as colored squares. On each trial, subjects are asked to indicate whether or not a specific aspect of an object has changed relative to its initial presentation (e.g., has the box's color changed?).

Visual arrays - color change (Luck & Vogel, 1997). Subjects are given a brief presentation of an array of 4, 6, or 8 colored boxes of different colors, including white, black, yellow, green, blue, and purple. Subjects complete 28 trials at each array size; half of these are change trials. After a short delay, the shapes reappear with one item encircled. Subjects then indicate if the color of the highlighted box has changed.
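For illustration, the sketch below generates a trial list with the structure just described (28 trials at each of three set sizes, half of them change trials). The color-assignment and shuffling details are assumptions, not the original task script.

```python
# Hypothetical trial-list generator for the color-change visual arrays task.
import random

COLORS = ["white", "black", "yellow", "green", "blue", "purple"]

def make_trials(set_sizes=(4, 6, 8), per_size=28, seed=1):
    rng = random.Random(seed)
    trials = []
    for n in set_sizes:
        for i in range(per_size):
            trials.append({
                "set_size": n,
                "colors": [rng.choice(COLORS) for _ in range(n)],
                "change": i < per_size // 2,   # half change, half no-change trials
            })
    rng.shuffle(trials)
    return trials

print(len(make_trials()))  # -> 84 trials in total
```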

Visual arrays - orientation change (Luck & Vogel, 1997). Subjects go through 40 trials where arrays of 5 or 7 colored bars, either red or blue, are shown at horizontal, vertical, or 45° angles. Similar to the color change task, subjects indicate with a key press if the bars have changed positions (e.g., horizontal bars change to vertical).

Visual arrays - selective orientation (Vogel, McCollough, & Machizawa, 2005). In this task, subjects are told to focus on either the red or blue bars and to expect an orientation change. 10 or 14 bars were shown. After a delay, only the to-be-remembered bars are shown on the screen. A white dot was overlaid on one bar to indicate that the subject needed to make a judgment on that particular bar (i.e., had the bar changed orientation?). We ran 40 trials at each set size, and half of these were change trials.


Figure 9. The Visual Arrays tasks (adapted from Shipstead et al., 2014). In (a) subjects judge whether an encircled box has changed color since its initial presentation. (b) The subject indicates whether the position of any of the boxes has changed since the initial presentation (i.e., changed orientation from vertical to horizontal). (c) The subject judges whether the box with a white dot has changed position.

Attention Control

Antisaccade - In 1978, Hallett and Lightstone discovered that if a cued area is highlighted briefly during a saccade, subjects' eyes will complete the saccade and then divert to the target area. The antisaccade task paradigm is used to measure a subject's ability to resist a prepotent response and is thought to tap attention more generally.

We used two versions of this task (adapted from Roberts et al., 1994). In the traditional

Antisaccade task, each trial started with a "***" fixation-point on a computer screen that lasted for approximately 200-1800 milliseconds. This was followed by a blank screen for 50 milliseconds, that was immediately followed by a "=" symbol that flashed twice over a period of

300 milliseconds on either the right or left hand side of the screen. After the flash, a letter was

presented on the opposite side of the screen for 100 milliseconds and masked by the number "8".

Test-takers were given 10,000 milliseconds to recall whether the masked letter had been "R",

"B", or "P". The dependent variable was accuracy across 60 trials.

In the Antisaccade-Beep task, subjects are asked to report a letter that is shown briefly on one side of a computer screen. While the letter is being presented, an attention-grabbing flash appears in their peripheral vision (on the opposite side of the screen). The task starts with a "+" symbol on a computer screen that is shown for 1,000 or 2,000 milliseconds. Subjects wear headsets so that they can hear a warning "beep" that signals a new trial is about to begin. The beep was included to facilitate bottom-up focus on the task. The "beep" is followed by a "*" symbol that flashes on the left or right hand side of the screen for 300 milliseconds. At the same time, a letter "O" or "Q" is flashed on the opposite side of the screen for 100 milliseconds.

Subjects have 5,000 milliseconds to report the letter they saw. The dependent variable was accuracy over the course of 48 trials.

The Stroop Task.

This task required test-takers to report the hue in which a color-word was presented (e.g., hue: red; word: "BLUE"). Blue, green, and red were used. Participants responded by pressing one of three keys with colored stickers. The task included 162 trials. On 54 trials the word and hue were congruent. On 54 trials the word and hue were incongruent. To increase the role of endogenous attention control, an additional 54 congruent filler trials were included (see Kane & Engle, 2003). The dependent variable was created by combining response time (RT) and accuracy for congruent and incongruent trials using the binning method of Hughes et al. (2014).

The details of this method can be found in the section titled "Dependent variable for response time tasks".

The Arrow Flanker

This task (Eriksen & Schultz, 1979) required test-takers to report the direction in which a central arrow was pointing. Flanking characters included congruent arrows (e.g., → → → →

→), incongruent arrows (e.g., ← ← → ← ←), or a neutral arrow (e.g., ─ ─ → ─ ─). In total, 72 congruent, 72 incongruent, and 72 neutral trials were run.

A second flanker task, based on Lavie's (1995) low-perceptual-load condition, was administered, but it was removed from analyses due to low accuracy on neutral trials (< 70%). The binning procedure was used to score tasks that included reaction time (i.e., Stroop and Arrow Flanker).

1 Recent studies have demonstrated an increase in validity and reliability using this method instead of traditional difference scores (which are notoriously unreliable). The binning procedure is sensitive to a subject's speed-accuracy trade-off, for instance, a subject's tendency to sacrifice speed to be more accurate (Hughes et al., 2014; Draheim, Hicks, & Engle, 2015).
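To make the idea of binning concrete, the sketch below shows one generic way RT and accuracy can be combined into a single bin-based score. It is only an illustration; the bin counts, error penalty, and baseline corrections in the actual Hughes et al. (2014) procedure differ and should be taken from that paper.

```python
# Generic rank-order binning of RTs with an error penalty (illustrative only).
import numpy as np

def bin_score(rts_ms, correct, n_bins=10, error_bin=20):
    """Assign each correct trial a decile bin (1 = fastest) and each error the
    penalty bin; the subject's score is the mean bin value (lower = better)."""
    rts_ms = np.asarray(rts_ms, dtype=float)
    correct = np.asarray(correct, dtype=bool)
    edges = np.percentile(rts_ms[correct], np.linspace(0, 100, n_bins + 1))
    bins = np.full(rts_ms.shape, float(error_bin))
    bins[correct] = np.clip(np.searchsorted(edges, rts_ms[correct]), 1, n_bins)
    return bins.mean()

print(bin_score([450, 520, 610, 700, 900], [True, True, True, False, True]))
```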

Results

The current article offers new insights into our discussion of creativity and its measurement. This meant overcoming difficult measurement problems that prior work has been unable to deal with. Our assessment of divergent thinking considers the quality and originality of responses with no emphasis on the quantity of ideas. Stronger construct measurement will help us understand the nature of creativity and the component processes that govern creative production/generation.

Further, the current dataset allowed us to model separate latent variables for divergent thinking, convergent thinking, and verbal fluency. As Thurstone discovered (1938), almost all complex and cognitively demanding tasks engage numerous higher-order cognitive systems.

Carroll’s extensive meta-analysis of the factor analytic work conducted over 60 years replicated

this finding countless times (1993). So, how should researchers assess the reliability of multidimensional constructs? Emerging work by Reise (2012) and colleagues (Reise,

Bonifay, & Haviland, 2013) advocates the use of bifactor and hierarchical models (similar to the techniques Carroll relied on in his 1993 book on Human Abilities).

Analyses

In this analysis, we use multiple raters. Each rater scores three divergent thinking tasks using a criteria-based rating procedure. When using multi-trait, multi-method approaches, it's important that measurement is comparable across all three raters (measurement invariance).

Variation in the ways each rater measures the construct (DT) could make our results difficult to interpret (Meredith, 1993). We ensure method invariance in several ways. First, using CFA, we can study if different methods (raters) evaluate and rank subjects in the same way. Our CFA shows that each DT indicator loads onto a common factor, which provides some evidence of convergent validity among our raters.

The vast majority of research on divergent thinking has relied on a single paradigm, with no more than two indicators (Silvia 2008, Silvia et al., 2009, Nusbaum & Silvia, 2011; Silvia &

Beaty, 2012; Silvia, Beaty & Nusbaum, 2013; Lee & Therriault 2013). Several studies have then used these two indicators to derive multiple factors from each task. For example, many studies conducted by Silvia and colleagues gave subjects a 2-minute task asking them to write down unusual uses for a brick. Subjects were then given 2 minutes to write down as many unusual uses for a knife as they could. Raters were then instructed to count the total number of responses each subject generated as their fluency ability, and were then asked to rate the responses for their overall level of creativity (Silvia, 2008).

Unfortunately, the more scores you derive from each indicator, the lower the reliability estimates become for each of those scores (Anastasi, 1967; Embretson & Reise, 2000). Isolating a single, unitary factor is difficult when measuring broad cognitive abilities and traits. For this reason, most cognitive abilities are considered to be multidimensional because they contain items that represent different domains and/or items that load on more than one factor (Widhiarso &

Ravand, 2014).

We used EQS 6.3 for all analyses. Descriptive statistics for each task are included in

Table 1. Our correlation matrix is presented in Table 2. In order to maximize the power of our study, we handled missing data in EQS 6.2 by applying the Expectation-Maximization algorithm (Bentler

& Wu, 2005). Less than 5% of the current dataset contained missing values, which were due to computer and/or experimenter errors.

Table 1. Descriptive Statistics for Each Task.

Variable     Min       Max      M        SD       Sk      Kurtosis
Ospan        3.00      75.00    53.74    15.45    -0.80   -0.08
Rspan        0.00      42.00    24.73    9.73     -0.41   -0.61
Sspan        3.00      42.00    26.35    9.16     -0.40   -0.57
LNS          0.00      23.00    10.84    4.14     0.00    -0.03
Raven        0.00      18.00    8.64     3.91     -0.02   -0.91
LS           1.00      29.00    15.26    5.47     0.02    -0.71
NS           0.00      15.00    8.52     3.58     -0.19   -0.88
VA1          -1.00     5.76     3.35     1.48     -0.81   0.02
VA2          -3.04     5.77     2.64     1.60     -0.81   0.89
VA4          -1.68     5.00     1.59     1.34     0.13    -0.57
Beepsacc     0.29      1.00     0.79     0.17     -0.71   -0.52
Antisacc     0.20      0.98     0.55     0.65     0.16    -0.77
Stroop       -152.63   479.20   135.98   101.33   0.68    1.05
AF           -49.97    260.85   99.82    47.12    0.75    1.37
VF1          6.00      41.00    21.79    7.28     0.02    -0.82
VF2          4.00      36.00    20.15    5.96     0.17    -0.18
VF3          5.00      34.00    19.24    5.91     0.07    -0.36
RATodd       0.00      5.00     2.56     1.33     -0.09   -0.61
RATeven      -0.13     4.00     0.71     0.90     1.24    1.09
DT Rater 1   0.00      70.00    10.12    11.48    1.89    4.60
DT Rater 2   0.00      46.00    13.16    10.05    0.74    0.01
DT Rater 3   0.00      52.00    13.17    9.58     0.80    0.68

Note. Ospan = Automated Operation Span; Sspan = Symmetry Span; Rspan = Rotation Span; LNS = Letter Number Sequencing; Raven = Raven's Advanced Progressive Matrices; LS = Letter Sets; NS = Number Series; VA1 = Visual Arrays, color judgement; VA2 = Visual Arrays, orientation judgement; VA4 = Visual Arrays, selective orientation judgement; Beepsacc = Beep Saccade; Antisacc = Antisaccade; AF = Arrow Flanker; VF1 = Verbal Fluency, animal names; VF2 = Verbal Fluency, letter "c"; VF3 = Verbal Fluency, jobs; RAT = Remote Associates Test (odd and even items); DT = Abbreviated Torrance Test for Adults (ATTA), values reported for all three raters.

Since the primary statistical procedure used in this study is structural equation modeling

(SEM), it is important to review the best methods for evaluating the models presented in this study. There are five primary fit indices that readers should be able to interpret. To determine how well the model reproduces the observed correlations, researchers use the standardized root mean square residual (SRMR); the root mean square error of approximation (RMSEA) indexes the remaining misfit per degree of freedom. Hu and Bentler (1999) recommend cutoffs of roughly ≤ 0.08 for SRMR and ≤ 0.06 for RMSEA. Ideally, these values should not exceed 0.05 when the sample size exceeds 250 subjects.

Another frequently cited index for reporting model fit in SEM is the chi-square, a goodness-of-fit test. However, it's important to note that when the sample size is large, the chi-square will likely be significant. To correct for this sensitivity to sample size, another statistic that is typically reported is the chi-square divided by its degrees of freedom (χ²/df). Kline (2011) argues that values < 3 are favorable, but values closer to 2 are preferred. Most researchers recommend values of 0.95 and above on the comparative fit index (CFI) and non-normed fit index (NNFI) (see Hu and Bentler, 1999). These indices compare the model to a null model that assumes no correlations between observed variables. Researchers include the NNFI to address model complexity; more parsimonious models are favored. Last, we used the Akaike information criterion (AIC) to compare models, where the model with the smallest AIC value represents the best statistical fit.
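As a worked example of these indices, the snippet below recomputes χ²/df and the chi-square-based RMSEA for the one- and two-factor models reported later (Tables 3 and 4), using the study's N of 445. The RMSEA formula shown is the standard point estimate; treat it as an illustration rather than the exact software output.

```python
# Recompute chi-square/df and RMSEA from reported model chi-squares.
from math import sqrt

def chi_sq_per_df(chi_sq, df):
    return chi_sq / df

def rmsea(chi_sq, df, n):
    """RMSEA point estimate: sqrt(max(0, chi2 - df) / (df * (n - 1)))."""
    return sqrt(max(0.0, chi_sq - df) / (df * (n - 1)))

N = 445
models = {"one-factor": (148.95, 9), "two-factor": (22.80, 8)}
for name, (chi_sq, df) in models.items():
    print(name, round(chi_sq_per_df(chi_sq, df), 2), round(rmsea(chi_sq, df, N), 3))
# one-factor -> 16.55 and 0.187; two-factor -> 2.85 and 0.065
```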

Structural Model

To understand the nature of creativity, we use a structural model to examine relationships between factors related to creativity, intelligence, working memory capacity, and attention. All indicators were submitted to a confirmatory factor analysis (CFA) specifying seven latent factors that were all assumed to be correlated (Table 7). All fit statistics indicate that the specified model was a good fit to the observed data (see Table 8).

Table 7.

Factor correlations for CFA

            WMC    gF     VA     Attention   VF     CT     DT
WMC         -
gF          0.86   -
VA          0.78   0.87   -
Attention   0.78   0.83   0.80   -
VF          0.57   0.79   0.62   0.60        -
CT          0.43   0.56   0.47   0.49        0.58   -
DT          0.58   0.72   0.56   0.55        0.71   0.45   -

Note. WMC = Working Memory Capacity (a factor comprised of the complex span tasks: Operation Span, Rotation Span, and Symmetry Span); gF = Fluid Intelligence (3 tasks: Raven's Advanced Progressive Matrices, Letter Sets, Number Series); VA = Visual Arrays (3 tasks: color, orientation, and selective orientation judgements); Attention = a factor consisting of Beep Saccade, Antisaccade, and Stroop; VF = Verbal Fluency (two categorical fluency tasks, one phonemic (letter) fluency task); CT = Convergent Thinking, Remote Associates Test (odd and even); DT = Divergent Thinking, Abbreviated Torrance Test for Adults (ATTA) across three raters.

Table 8. Fit statistics for confirmatory factor analysis.

χ²       df    χ²/df   RMSEA   SRMR   NNFI   CFI
288.36   168   1.72    0.04    0.03   0.96   0.97

Table 2. Zero-order correlations among all tasks.

One- and Two-factor Models

What is the relationship between divergent thinking and fluid intelligence?

One of the primary questions in this study was to determine the magnitude of the relationship between divergent thinking and fluid intelligence. Some researchers have argued that they are isomorphic; others argue that they should not be correlated at all. Three indicators of fluid intelligence and three raters of divergent thinking were loaded onto a single latent construct to determine if fluid intelligence and divergent thinking are unitary or separable. If divergent thinking and fluid intelligence represented the same ability, we would expect this model to fit better than the two-factor model where fluid intelligence is loaded on one construct and divergent thinking on another (Figure 10). In the CFA in Figure 10, we observed additional variance in fluid intelligence that was unaccounted for by the one-factor model. Loading both the divergent thinking and fluid intelligence indicators onto a single factor resulted in poor fit. The fit statistics for models A and B are reported in Table 3 and Table 4. The two-factor model has superior fit when compared to the single-factor model.
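The models in this comparison were fit in EQS 6.3. Purely for illustration, the sketch below writes the same one- and two-factor specifications in lavaan-style syntax using the Python package semopy; the package choice and the column names (raven, letter_sets, ..., dt_rater3) are assumptions, not part of the original analysis.

```python
# Illustrative one- vs. two-factor CFA specifications (not the authors' EQS code).
import pandas as pd
from semopy import Model, calc_stats

ONE_FACTOR = """
CPS =~ raven + letter_sets + number_series + dt_rater1 + dt_rater2 + dt_rater3
"""

TWO_FACTOR = """
gF =~ raven + letter_sets + number_series
DT =~ dt_rater1 + dt_rater2 + dt_rater3
"""

def fit_model(desc: str, data: pd.DataFrame):
    model = Model(desc)
    model.fit(data)
    return model, calc_stats(model)   # estimates plus chi-square, RMSEA, CFI, AIC

# data = pd.read_csv("task_scores.csv")   # hypothetical wide file, one row per subject
# model, stats = fit_model(TWO_FACTOR, data)
```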

Figure 10. A comparison of confirmatory factor analysis models. A) Left panel – a one-factor model where fluid intelligence tasks (Raven's Advanced Progressive Matrices, Letter Sets, Number Series) and divergent thinking indicators (Raters 1, 2, and 3) are loaded onto one factor labeled CPS (Creative problem-solving). B) Right panel – a two-factor model where fluid intelligence and divergent thinking load on separate factors. A latent-variable correlation of .73 was observed between the fluid intelligence and divergent thinking factors.

The one-factor model had poor fit χ2(9) = 148.95; NNFI = 0.80; CFI = 0.88; SRMR =

0.07; RMSEA = 0.19; AIC = x. A two-factor model had better fit to the data χ2(8) = 22.80;

NNFI = 0.97; CFI = 0.99; SRMR = 0.03; RMSEA = 0.06; AIC = x.

Table 3. Fit statistics for Model 10A.

χ²       df   χ²/df   RMSEA   SRMR    NNFI   CFI
148.95   9    16.55   0.187   0.073   0.80   0.88

Table 4. Fit statistics for Model 10B.

χ²      df   χ²/df   RMSEA   SRMR   NNFI   CFI
22.80   8    2.85    0.06    0.03   0.97   0.99

What is the relationship between divergent thinking and verbal fluency?

It is important to understand if divergent thinking is separable from verbal fluency. We tested for discriminant validity in the confirmatory factor analysis in Figure 11. Similar to our previous factor models using fluid intelligence, loading both the divergent thinking and verbal fluency indicators onto a single factor resulted in poor fit.

The one-factor model had poor fit χ2(9) = 146.99; NNFI = 0.99; CFI = 0.99; SRMR =

0.02; RMSEA = 0.19; AIC = x (see Table 5). These results suggest that the model is parsimonious, but the high RMSEA suggests there is variance unaccounted for. The two-factor model had better fit χ2(9) = 13.41; NNFI = 0.99; CFI = 0.99; SRMR = 0.02; RMSEA = 0.04;

AIC = x (see Table 6). The two constructs are highly correlated (.7), but not unitary.

Table 5. Fit statistics for Model 11A.

χ²       df   χ²/df   RMSEA   SRMR   NNFI   CFI
146.99   9    2.85    0.19    0.02   0.99   0.99

Table 6. Fit statistics for Model 11B.

χ²      df   χ²/df   RMSEA   SRMR   NNFI   CFI
13.41   9    1.68    0.04    0.02   0.99   0.99

Figure 11. A comparison of confirmatory factor analysis models. A) Left panel – a one-factor model where three verbal fluency tasks (Animal Names, Listing Words that Begin with Letter "c", and Occupations) and divergent thinking indicators (Raters 1, 2, and 3) are loaded onto one factor labeled 'Fluency.' B) Right panel – a two-factor model where verbal fluency and divergent thinking load on separate factors. A latent-variable correlation of .70 was observed between the verbal fluency and divergent thinking factors.

What component processes underlie creative responses?

Measurement Models: Bifactor Approach

Using a bifactor modeling approach, we cross-loaded "executive attention" or attention control onto factors of fluid intelligence, working memory capacity, and the visual arrays.

Executive attention is directly related to most measures of cognitive ability and exists independently of group (specific) factors. A general factor of executive attention shares unique variance with divergent thinking (.44), convergent thinking (.35), and verbal fluency (.69).

This approach accounts for unique variance above and beyond the variance accounted for by the general factor, in this case executive attention (Figure 13). The fit statistics indicated a good fit to the data and suggested no misspecifications (Table 9).
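For readers unfamiliar with the bifactor layout, the sketch below shows the general shape of such a specification in the same lavaan-style syntax used in the earlier sketch (again assuming semopy, with placeholder indicator names): every indicator loads on the general executive attention factor, group factors absorb domain-specific variance, and the general and group factors are constrained to be orthogonal. This is a simplified shape, not the exact model in Figure 13.

```python
# Simplified bifactor specification (illustrative only; indicator names are placeholders).
BIFACTOR = """
EA  =~ antisaccade + beep_saccade + stroop + ospan + symspan + rotspan + raven + letter_sets + number_series
WMC =~ ospan + symspan + rotspan
gF  =~ raven + letter_sets + number_series
EA ~~ 0*WMC
EA ~~ 0*gF
WMC ~~ 0*gF
"""
# Fit as in the earlier sketch: Model(BIFACTOR).fit(data)
```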

Figure 13. Structural Model 4. VA = Visual Arrays (3 tasks: color, orientation, and selective orientation judgements); Executive Attention = a factor consisting of Beep Saccade, Antisaccade, and Stroop; gF = Fluid Intelligence (a factor comprised of Raven's Advanced Progressive Matrices, Letter Sets, and Number Series); DT = Divergent Thinking (Abbreviated Torrance Test for Adults [ATTA] across three raters); CT = Convergent Thinking (Remote Associates Test, odd and even halves).
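To make the bifactor structure concrete, the lavaan-style specification below sketches the general idea: a general executive attention factor loads on every indicator, the group (specific) factors retain their own indicators, and the general and group factors are constrained to be orthogonal. The indicator names are shorthand placeholders for the tasks listed in the figure captions, the working memory task names in particular are our own invention, and this is a schematic of the approach rather than the exact model reported in Figure 13; fitting it would require an SEM package (e.g., lavaan or semopy), and whether a given package accepts the 0* fixed-covariance syntax exactly as written should be checked against its documentation.

# Lavaan-style sketch of a bifactor measurement structure (hypothetical variable names).
bifactor_spec = """
# General factor: executive attention loads on every indicator.
EA =~ antisaccade + beep_saccade + stroop +
      ospan + symspan + rotspan +
      va_color + va_orientation + va_selective +
      rapm + letter_sets + number_series

# Group (specific) factors retain their own indicators.
WMC =~ ospan + symspan + rotspan
VA  =~ va_color + va_orientation + va_selective
gF  =~ rapm + letter_sets + number_series

# General and group factors are orthogonal, as a bifactor model requires.
# (Depending on the variant, the group factors may also be set orthogonal to one another.)
EA ~~ 0*WMC
EA ~~ 0*VA
EA ~~ 0*gF
"""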

Table 9. Fit statistics for Model 14.
χ²      df      χ²/df   RMSEA   SRMR   NNFI   CFI    AIC
238.81  118.00  2.02    0.05    0.03   0.96   0.97   3506.83

Many researchers have concluded that there is little more to creative responses than verbal fluency. However, we and many other researchers believe that this is an oversimplified view. It neglects the wealth of problem-solving research demonstrating that subjectively determined creativity or originality scores have yielded some of the most valid and real-world-applicable findings to date (Hocevar, 1981; Feldhusen, 1995). Whereas objective scoring methods restrict the definition of creativity to responses that are statistically unique, regardless of their other qualities, subjective scoring allows a response that is repeated across subjects in the sample to still be rated as creative. Likewise, Hocevar (1979; 1981) found that divergent thinking tasks are more valid and reliable when they are rated subjectively. Anastasi (1967) recommends reviewing the limitations of existing forms and considering qualitative scoring to identify the best ways to categorize performance. Using subjective, criteria-based scoring, we reduced the confound between fluency and originality because each response was rated separately and valued on its own merit, apart from the other responses in the sample.

Therefore, our last analysis cross-loaded the verbal fluency factor onto the convergent and divergent thinking factors to determine whether fluency could fully account for the relationships of Executive Attention and Fluid Intelligence with CT and DT.

Figure 14. Structural Model 4. VA = Visual Arrays (3 tasks: color, orientation, and selective orientation judgements); Attention = a factor consisting of Beep Saccade, Antisaccade, and Stroop; VF = Verbal Fluency (two categorical fluency tasks and one phonemic [letter] fluency task); gF = Fluid Intelligence (a factor comprised of Raven's Advanced Progressive Matrices, Letter Sets, and Number Series); DT = Divergent Thinking (Abbreviated Torrance Test for Adults [ATTA] across three raters); CT = Convergent Thinking (Remote Associates Test, odd and even halves).

Table 10. Fit statistics for Model 14.
χ²      df      χ²/df   RMSEA   SRMR   NNFI   CFI    AIC
297.03  164.00  1.81    0.04    0.03   0.96   0.97   4285.73

The results of the SEM indicated good model fit. Further, Executive Attention was uniquely related to CT, DT, and verbal fluency. Fluid Intelligence failed to predict CT or DT above and beyond Executive Attention when verbal fluency was allowed to cross-load onto both factors of creativity.
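The logic of this final model can likewise be sketched in lavaan-style syntax. The snippet below shows one way to express the cross-loading idea, with verbal fluency allowed to predict both creativity factors alongside executive attention and fluid intelligence; the factor and indicator names are placeholders, and this is a schematic rather than the exact specification fit in Figure 14.

# Lavaan-style sketch of the structural logic behind Figure 14 (hypothetical variable names).
structural_spec = """
# Measurement part (abbreviated; see the Figure 14 caption for the full indicator lists).
EA =~ antisaccade + beep_saccade + stroop
gF =~ rapm + letter_sets + number_series
VF =~ animal_names + occupations + letter_fluency
DT =~ rater1 + rater2 + rater3
CT =~ rat_odd + rat_even

# Structural part: verbal fluency competes with executive attention and fluid
# intelligence in predicting the two creativity factors.
DT ~ EA + gF + VF
CT ~ EA + gF + VF
"""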

Discussion

Bifactor models can control for multidimensionality, making them particularly well suited to the current study. A recent article by Beaujean (2015) reviews J. B. Carroll's contributions to theory and method in the field of intelligence. The author concludes that Carroll would have preferred a bifactor modeling approach to the hierarchical model presented in his large-scale study of cognitive abilities had computational power permitted it in 1993 (Beaujean, 2015).

Carroll was careful to note that his hierarchical, three-stratum model was not a simple tree structure; he believed that intelligence is influenced by many independent factors and that psychometric g can be observed across all measures of cognitive abilities. In accord with this theory, we do not view multicollinearity in the present study as artifactual. A key characteristic of fluid intelligence and of other multidimensional constructs such as creativity and verbal fluency is their correlation with other cognitive abilities. Problem-solving tasks tap a range of cognitive processes, which can make discriminant validity a complex issue (Kane, Hambrick, & Conway, 2005). However, our results suggest that while divergent thinking and verbal fluency are closely related, they are not unitary. Executive attention has a systematic association with other cognitive constructs: effective allocation and control of attention is important for both creative and more conventional problem-solving. In the current study, a general factor of executive attention predicted unique variance in CT, DT, and verbal fluency.

Our work provides an alternative to conventional, fluency- or summary-based scoring of divergent thinking tasks that can be extended to other measures of ideation requiring novel solutions to problems. This work is relevant to professionals throughout the military and intelligence domains, particularly those in leadership roles or others who engage in high-stakes decision-making. Our research examines the cognitive processes that drive idea generation on tasks of divergent thinking, convergent thinking, and verbal fluency. By developing a novel scoring method that focuses on response quality, this research contributes new evidence that divergent thinking is more than the ability to be 'fluent,' a distinct capability that offers the military an additional basis, beyond intelligence, for screening and selecting personnel.

Related to this, we are interested in future work that explores the particular contexts or conditions that govern creative problem solving: for instance, experiments that establish thresholds to inform instruction or training in creative problem solving and production (e.g., high penalties for a false alarm vs. high penalties for a miss), or that induce particular biases (e.g., conservative or liberal response biases) to determine optimal conditions for decision-making. Some experimental research has examined positive/negative affect and performance on creative problem-solving tasks (CIT), but to our knowledge no research has examined inducing these particular mindsets and their relation to decision-making.

A notable example is a 1998 study of US Army officers in which researchers tested the predictive validity of divergent thinking tasks against several indicators of career performance as a means to better understand organizational leadership. Specifically, participants completed a series of modified Consequences tasks representing creative problem solving. Raters used a consensual rating technique to evaluate participant responses on several attributes, including quality (coherence, logic) and originality (novelty). The researchers found that divergent thinking abilities made a significant and unique contribution to subjects' ability to find solutions to novel, open-ended, and ill-defined problems, as well as to other metrics of real-world leadership performance in the military (Mumford et al., 1998) and in the sciences (Mumford et al., 1998; Mumford et al., 2010).

Studying creativity is challenging, resource-intensive work. Further, the tests used to measure creativity and the methods used to evaluate them are far less reliable than most intelligence tests. An obvious concern for all studies of creativity is that its measurement remains unsettled in the creativity research community. Validation begins with construct definition, and we have not reached a consensus on definitions of creativity or on the best tasks to measure it.

Psychometric research does suggest that open-ended, ill-defined problem types are most representative of real-world creative problem-solving. However, open-ended tasks are not ideal in fMRI experiments, where reliability requires many short trials (Kröger, 2015). Without clear or compelling mechanistic accounts of creativity, it is difficult to select appropriate control tasks, which makes neuroscientific studies of "creative cognition" particularly difficult to interpret (Dietrich, 2007). Abraham (2013) raises several critical limitations of the neuroscientific study of creativity, noting that it can be nearly impossible to determine with any precision the moment a subject thinks of a creative idea, and that discriminating activation related to strategy use from activation during the actual moment of creation is understandably difficult.

References

Amabile, T. M. (1982). Social psychology of creativity: A consensual assessment technique. Journal of Personality and Social Psychology, 43(5), 997-1013. doi:10.1037/0022-3514.43.5.997

Hennessey, B. A., & Amabile, T. M. (1988). The conditions of creativity. The nature of creativity, 22, 11-38.

Anastasi, A. (1967). Psychology, psychologists, and psychological testing. American Psychologist, 22, 297-306.

Anastasi, A., & Schaefer, C. E. (1969). Biographical correlates of artistic and literary creativity in adolescent girls. Journal of Applied Psychology, 53(4), 267.

Benedek, M., Franz, F., Heene, M., & Neubauer, A. C. (2012). Differential effects of cognitive inhibition and intelligence on creativity. Personality and Individual Differences, 53(4), 480-485.

Christensen, P. R., Guilford, J. P., & Wilson, R. C. (1957). Relations of creative responses to working time and instructions. Journal of Experimental Psychology, 53(2), 82.

Christensen, P. R., Guilford, J. P., Merrifield, P. R., & Wilson, R. C. (1960). Alternate uses. Beverly Hills: Sheridan Psychological Services.

Carroll, J. B. (1974). Psychometric tests as cognitive tasks: A new "Structure of Intellect" (No. ETS-RB-74-16). Educational Testing Service, Princeton, NJ.

Chuderski, A. (2014). How well can storage capacity, executive control, and fluid reasoning explain insight problem solving? Intelligence, 46, 258-270.

Cronbach, L. J., & Meehl, P. E. (1955). Construct validity in psychological tests. Psychological Bulletin, 52(4), 281-302. doi:10.1037/h0040957

Cronbach, L. (1968). Intelligence? Creativity? A parsimonious reinterpretation of the Wallach-Kogan data. American Educational Research Journal, 5(4), 491-511. Retrieved from http://www.jstor.org/stable/1161995

Conway, A. R., Kane, M. J., Bunting, M. F., Hambrick, D. Z., Wilhelm, O., & Engle, R. W. (2005). Working memory span tasks: A methodological review and user's guide. Psychonomic Bulletin & Review, 12(5), 769-786.

Embretson, S. E., & Reise, S. P. (2000). Item Response Theory. Mahwah, N.J.: Psychology Press.

Engle, R. W., Kane, M. J., & Tuholski, S. W. (1999). Individual differences in working memory capacity and what they tell us about controlled attention, general fluid intelligence, and functions of the prefrontal cortex.

Engle, R. W. (2002). Working memory capacity as executive attention. Current directions in psychological science, 11(1), 19-23.

Kane & Engle, 2004

Kane, M. J., Bleckley, K. M., Conway, A. R. A., & Engle, R. W. (2001). A controlled-attention view of working-memory capacity. Journal of Experimental Psychology: General, 130, 169–183.

Kane, M. J., & Engle, R. W. (2000). Working-memory capacity, proactive interference, and divided attention: Limits on long-term memory retrieval. Journal of Experimental Psychology: Learning, Memory, & Cognition, 26, 336–358

Kane, M. J., Hambrick, D. Z., Tuholski, S. W., Wilhelm, O., Payne, T. W., & Engle, R. W. (2004). The generality of working memory capacity: a latent-variable approach to verbal and visuospatial memory span and reasoning. Journal of Experimental Psychology: General, 133(2), 189.

Eysenck, H. J. (1994). The measurement of creativity. The MIT Press.

Frick, J. W., Guilford, J. P., Christensen, P. R., & Merrifield, P. R. (1959). A factor-analytic study of flexibility in thinking. Educational and Psychological Measurement, 19(4), 469-496.

Goff, K. (2002). The Abbreviated Torrance Test for Adults (ATTA). Bensenville, IL: Scholastic Testing Service.

Guilford, J. P. (1956). The structure of intellect. Psychological Bulletin, 53(4), 267.

Guilford, J. P. (1967). The nature of human intelligence.

Guilford, J. P. (1958). Can creativity be developed? Art Education, 3-18.

Guilford, J. P. (1950). Creativity. American Psychologist, 5, 444-454.

Guilford, J. P., Christensen, P. R., Merrifield, P. R., & Wilson, R. C. (1978). Alternate uses: Manual of instructions and interpretations. Orange, CA: Sheridan Psychological Services.

Guilford, J. P. (1967). Creativity: Yesterday, today and tomorrow. The Journal of Creative Behavior, 1(1), 3-14.

Guilford, J. P., & Pandey, R. E. (1974). Abilities for divergent production of symbolic and semantic systems. Journal of General Psychology, 91, 209.

Guilford, J.P. (1965)

Hargreaves, H. L. (1927). The "faculty" of imagination. British Journal of Psychology.

Hocevar, D. (1979). Ideational fluency as a confounding factor in the measurement of originality. Journal of Educational Psychology, 71(2), 191.

Harris, C. W., & Liba, M. R. (1965). Component, image, and factor analysis of tests of intellect and of motor performance. Cooperative Research Project No. S-192-64. University of Wisconsin.

Harrison, Y., & Horne, J. A. (2000). The impact of sleep deprivation on decision making: A review. Journal of Experimental Psychology: Applied, 6(3), 236.

Hass, R. W. (2017). Memory & Cognition, 45, 233. doi:10.3758/s13421-016-0659-y

Hocevar, D. (1981). Measurement of creativity: Review and critique. Journal of Personality Assessment, 45(5), 450-464.

Horn, J. L. (1976). Human abilities: A review of research and theory in the early 1970s. Annual Review of Psychology, 27(1), 437-485.

Horne, J. A. (1988). Sleep loss and “divergent” thinking ability. Sleep, 11(6), 528-536.

Jackson, P. W., & Messick, S. (1965). The person, the product, and the response: Conceptual problems in the assessment of creativity. Journal of Personality, 33(3), 309-329.

Kline, R. B. (2005). Principles and practice of structural equation modeling. New York: Guilford Press

Lee, C. S., & Therriault, D. J. (2013). The cognitive underpinnings of creative thought: A latent variable analysis exploring the roles of intelligence and working memory in three creative thinking processes. Intelligence, 41(5), 306-320.

Ledoux, K., Vannorsdall, T. D., Pickett, E. J., Bosley, L. V., Gordon, B., & Schretlen, D. J. (2014). Capturing additional information about the organization of entries in the lexicon from verbal fluency productions. Journal of Clinical and Experimental Neuropsychology, 36(2), 205-220.

Mayr, U. (2002). On the dissociation between clustering and switching in verbal fluency: Comment on Troyer, Moscovitch, Winocur, Alexander and Stuss. Neuropsychologia, 40(5), 562-566.

Martin, A., Wiggs, C. L., Lalonde, E., & Mack, C. (1994). Word retrieval to letter and semantic cues: A double dissociation in normal subjects using interference tasks. Neuropsychologia, 32, 1487-1494.

Martindale, C. (1989). Personality, situation, and creativity. In Handbook of creativity (pp. 211- 232). Springer US.

Mednick, M. T., & Andrews, F. M. (1967). Creative thinking and level of intelligence. The Journal of Creative Behavior, 1(4), 428-431.

Mednick, S. A., & Mednick, M. T. (1967). Examiner's manual, Remote Associates Test: College and adult forms 1 and 2. Houghton Mifflin.

Mednick, S. A. (1962). The associative basis of the creative process. Psychological Review, 69(3), 220-232.

Mednick, M. T., Mednick, S. A., & Mednick, E. V. (1964). Incubation of creative performance and specific associative priming. Journal of Abnormal and Social Psychology, 69, 84-88.

McCabe, D. P., Roediger III, H. L., McDaniel, M. A., Balota, D. A., & Hambrick, D. Z. (2010). The relationship between working memory capacity and executive functioning: Evidence for a common executive attention construct. Neuropsychology, 24(2), 222.

McNemar, Q. (1964). Lost: our intelligence? Why?. American Psychologist, 19(12), 871.

Mumford, M. D., & Gustafson, S. B. (1988). Creativity syndrome: Integration, application, and innovation. Psychological Bulletin, 103(1), 27.

Mumford, M. D., Marks, M. A., Connelly, M. S., Zaccaro, S. J., & Johnson, J. F. (1998). Domain-based scoring in divergent-thinking tests: Validation evidence in an occupational sample. Creativity Research Journal, 11(2), 151-163.

Miyake, A., Friedman, N. P., Emerson, M. J., Witzki, A. H., Howerter, A., & Wager, T. D. (2000). The unity and diversity of executive functions and their contributions to complex "frontal lobe" tasks: A latent variable analysis. Cognitive Psychology, 41(1), 49-100.

Nusbaum, E. C., & Silvia, P. J. (2011). Are intelligence and creativity really so different?: Fluid intelligence, executive processes, and strategy use in divergent thinking. Intelligence, 39(1), 36-45.

Osborn, A. F. (1953). Applied imagination.

Parnes, S. J. (1967). Creative behavior guidebook. Scribner.

48 Pakhomov, S. V., & Hemmy, L. S. (2014). A computational linguistic measure of clustering behavior on semantic verbal fluency task predicts risk of future dementia in the Nun Study. Cortex, 55, 97-106.

Paulewicz, B., Chuderski, A., & Nęcka, E. (2007). Insight problem solving, fluid intelligence, and executive control: A structural equation modeling approach. In Proceedings of the 2nd European Cognitive Science Conference (pp. 586-591).

Plucker, J. A., Qian, M., & Wang, S. (2011). Is originality in the eye of the beholder? Comparison of scoring techniques in the assessment of divergent thinking. Journal of Creative Behavior, 45(1), 1-22.

Rosen, V. M., & Engle, R. W. (1997). The role of working memory capacity in retrieval. Journal of Experimental Psychology: General, 126(3), 211.

Runco, M. A., & Albert, R. S. (1990). Theories of creativity. Sage Publications, Inc.

Runco, M. A. (1986). The discriminant validity of gifted children's divergent thinking test scores. Gifted Child Quarterly, 30(2), 78-82.

Silvia, P. J. (2007).

Silvia, P. J. (2008). Another look at creativity and intelligence: Exploring higher-order models and probable confounds. Personality and Individual Differences, 44(4), 1012-1021.

Silvia, P. J., Martin, C., & Nusbaum, E. C. (2009). A snapshot of creativity: Evaluating a quick and simple method for assessing divergent thinking. Thinking Skills and Creativity, 4(2), 79-85.

Silvia, P. J. (2011).

Silvia, P. J. (2013).

Simonton, D. K. (1997). Creative productivity: A predictive and explanatory model of career trajectories and landmarks. Psychological Review, 104, 66 – 89.

Taft, R., & Rossiter, J. R. (1966). The remote associates test: divergent or convergent thinking?. Psychological Reports, 19(3f), 1313-1314.

Thurstone, L. L. (1938). Primary mental abilities. Chicago: University of Chicago Press.

Torrance, E. P. (1974). Torrance Tests of Creative Thinking: Directions guide and scoring manual. Massachusetts: Personnel Press.

Troyer, A. K., Moscovitch, M., & Winocur, G. (1997). Clustering and switching as two components of verbal fluency: Evidence from younger and older healthy adults. Neuropsychology, 11(1), 138.

Torrance, E. P. (1966). The Torrance Tests of Creative Thinking: Norms-technical manual. Lexington, MA: Personnel Press.

Torrance, E. P. (1968). A longitudinal examination of the fourth grade slump in creativity. Gifted Child Quarterly, 12, 195–199.

Torrance, E. P. (1969). Prediction of adult creative achievement among high school seniors. Gifted Child Quarterly, 16, 75-88.

Torrance, E. P. (1972). Predictive validity of the Torrance Tests of Creative Thinking. Journal of Creative Behaviour, 6, 236-252.

Torrance, E. P. (1974). The Torrance Tests of Creative Thinking: Norms-technical manual. Princeton, NJ: Personnel Press.

Torrance, E. P. (2008). The Torrance Tests of Creative Thinking: Norms-technical manual. Bensenville, IL: Scholastic Testing Services.

Treffinger, D. J. (1995). Creative problem solving: Overview and educational implications. Educational Psychology Review, 7(3), 301–312.

Tulving, E., & Colotla, V. A. (1970). Free recall of trilingual lists. Cognitive Psychology, 1(1), 86-98.

Unsworth, N., & Engle, R. W. (2005). Working memory capacity and fluid abilities: Examining the correlation between operation span and Raven. Intelligence, 33, 67–81.

Unsworth, N., Redick, T. S., Heitz, R. P., Broadway, J. M., & Engle, R. W. (2009). Complex working memory span tasks and higher-order cognition: A latent-variable analysis of the relationship between processing and storage. Memory, 17(6), 635-654.

Unsworth, N., & Engle, R. W. (2007). The nature of individual differences in working memory capacity: Active maintenance in primary memory and controlled search from secondary memory. Psychological Review, 114(1), 104.

Unsworth, N., Spillers, G. J., & Brewer, G. A. (2010). Variation in verbal fluency: A latent variable analysis of clustering, switching, and overall performance. The Quarterly Journal of Experimental Psychology, 64(3), 447-466.

Unsworth, N., Spillers, G. J., & Brewer, G. A. (2010). The contributions of primary and secondary memory to working memory capacity: An individual differences analysis of immediate free recall. Journal of Experimental Psychology: Learning, Memory, and Cognition, 36(1), 240.

White, H. A., & Shah, P. (2006). Uninhibited imaginations: Creativity in adults with attention-deficit/hyperactivity disorder. Personality and Individual Differences, 40(6), 1121-1131.

Wilson, R. C., Guilford, J. P., Christensen, P. R., & Lewis, D. J. (1954). A factor-analytic study of creative-thinking abilities. Psychometrika, 19(4), 297-311.

Wilson, R. C., Guilford, J. P., & Christensen, P. R. (1953). The measurement of individual differences in originality. Psychological Bulletin, 50(5), 362.

Appendix. Rating Procedure

We developed a quality-based scoring procedure that places a high emphasis on original responses. Each rater was told to evaluate only those responses that evoked a subjective impression of creativity, so that each rater considers only responses they judge to be creative. A common rule set was used for both verbal and spatial tests of creativity, allowing a domain-general measurement that cannot be obtained through fluency-based scoring.

This procedure allows raters to identify and describe creative responses. When a rater judges a given response to be creative, they describe what made the response creative by rating one or more of the following criteria on a scale of 1-3.

1. Humor – the response is light-hearted and funny.

2. Ironic – the response is the opposite of what is expected, usually dry humor.

3. Smart – the response incorporates crystallized knowledge.

4. Insightful – the response is viewed from a perspective different from the norm.

Activity 1: When raters identify a creative response on the verbal Consequences task, they document a unique word from the response so that the researchers can determine whether each rater scored the same response.

Activity 2: In the Picture Completion task, raters indicate whether a creative response is made using only the left-hand image, only the right-hand image, or a combination of both images.

Activity 3: On the Picture Construction task, subjects were asked to create one or more images using nine triangles and had the option to title their responses. All four qualities listed above are rated on a scale from 1-3, and the title is given a 1 or a 0 depending on whether it added significantly to the response's overall cleverness or originality.
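A minimal sketch of how these ratings could be recorded and aggregated is shown below in Python. The data structures and function names are our own illustrations rather than the software actually used for scoring; the point is that each response receives its own quality score, so a subject's mean quality is computed independently of how many responses they produced.

from dataclasses import dataclass, field
from statistics import mean

@dataclass
class ResponseRating:
    """One rater's judgment of a single response (endorsed criteria rated 1-3)."""
    scores: dict = field(default_factory=dict)   # e.g., {"humor": 2, "insightful": 3}
    title_bonus: int = 0                         # Activity 3 only: 1 if the title adds cleverness

    def quality(self) -> float:
        # A response with no endorsed criteria contributes 0 (it was not judged creative).
        return float(sum(self.scores.values()) + self.title_bonus) if self.scores else 0.0

def subject_summary(ratings):
    """Summarize one subject's protocol, keeping fluency (count) and quality separate."""
    fluency = len(ratings)
    quality = mean(r.quality() for r in ratings) if ratings else 0.0
    return {"fluency": fluency, "mean_quality": quality}

# Example: three responses, two of which were judged creative.
protocol = [
    ResponseRating({"humor": 2}),
    ResponseRating(),  # not judged creative
    ResponseRating({"insightful": 3, "smart": 1}),
]
print(subject_summary(protocol))  # {'fluency': 3, 'mean_quality': 2.0}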
