A VALIDATION OF A GAME-BASED ASSESSMENT FOR THE MEASUREMENT OF VOCATIONAL INTEREST

A Thesis submitted to the faculty of San Francisco State University In partial fulfillment of the requirements for the Degree

Master of Science

In

Psychology: Industrial/Organizational Psychology

by

Hope Elizabeth Wear

San Francisco, California

May 2018

Copyright by
Hope Elizabeth Wear
2018

CERTIFICATION OF APPROVAL

I certify that I have read A Validation of a Game-Based Assessment for the Measurement of Vocational Interest by Hope Elizabeth Wear, and that in my opinion this work meets the criteria for approving a thesis submitted in partial fulfillment of the requirement for the degree Master of Science in Psychology: Industrial/Organizational Psychology at San Francisco State University.

Chris Wright, Ph.D.
Professor

A VALIDATION OF A GAME-BASED ASSESSMENT FOR THE MEASUREMENT OF VOCATIONAL INTEREST

Hope Elizabeth Wear
San Francisco, California
2018

Game-based assessments (GBAs) are a new type of technology-based assessment tool which allows traditional selection concepts to be measured from behaviors (e.g., completing levels by following game rules). GBAs use game elements to create an immersive environment which changes how assessments are traditionally administered but retains the psychometric properties within the game to assess a variety of knowledge, skills, and abilities. In this study we examined the validity of a GBA for use as a measure of the RIASEC vocational interests from Holland (1985). Participants played the GBA as well as completed traditional measures of RIASEC interests. We compared participants' scores for congruence across the different measures using a multitrait-multimethod matrix (MTMM). The results from this study will be used to provide validation evidence for the use of GBAs as a method to measure vocational interests and to extend the RIASEC literature to the area of technological assessments.

I certify that the Abstract is a correct representation of the content of this thesis.

Date

ACKNOWLEDGEMENTS

First, I would like to thank my advisor, Dr. Diana Sanchez, for all her help and guidance throughout this project. I am grateful Dr. Sanchez came and joined us at San Francisco State and took on a thesis project in her first year as a faculty member. In addition, I would like to thank Dr. Chris Wright for all his helpful insights and lessons about selection and assessments throughout the program. I would also like to thank the team at Arctic Shores, Lara, Liam, and Kirsty, who made this project possible, and for all their help throughout the process. I would also like to thank my family for all their support over the years which allowed me to get to this point. Finally, I would like to thank my cohort members who have been supportive and helpful throughout this program. It was truly great spending the last two years with you all.

TABLE OF CONTENTS

List of Tables
List of Figures
Introduction
Game-Based Assessments
Psychometric Measures
Cognitive Ability Tests
Personality Tests
Applicant Reactions
Method
Results
Discussion
References

LIST OF TABLES

Table

1. Elements of GBAs
2. Demographic Characteristics
3. Correlations of Demographic and Work Outcomes
4. MTMM Matrix for Vocational Interest Measures
5. Congruence of RIASEC Codes
6. Congruence of Individual and Occupational Codes
7. Congruence Correlations with Work Outcomes

A Validation of a Game-Based Assessment for the Measurement of Vocational Interest

Technology is changing the world of selection, hiring, and assessment (Arthur, 2017; Chapman & Webster, 2003). The way HR practitioners view technological tools has shifted: in a 2016 survey, 75% of HR practitioners indicated they would consider using game-based assessments for screening and selection (Cut-E, 2017). Technology has also changed how assessments are conducted as online and mobile-based assessments increase in usage (Arthur, 2017; Lowman, 2016; Morelli, Mahan & Illingworth, 2014). Game-based methods (i.e., using games for training or evaluation) are one example of how technology is changing the nature of work. The gamification and game-based industry was predicted to be worth $5.5 billion by 2018 (Dale, 2014). Game-based assessment (GBA) in particular is being used more by companies to evaluate a number of different aspects of the workplace, such as individual characteristics and job skills (Landers, 2014). GBAs are defined as evaluations that use game elements to create an immersive environment and allow the individual to interact with and demonstrate certain knowledge or skills (Landers, 2015). Using a game to evaluate an individual retains the psychometric properties (i.e., the ability of the assessment to measure qualities such as personality characteristics) within the game to assess a variety of knowledge, skills, and abilities (Montefiori, 2016). The increased usage of GBAs in the workforce increases the need for evidence that these new methods are valid and appropriate for such uses. Because these assessments are new, validation evidence is needed or there could be legal repercussions for companies that use these assessments in high-stakes contexts such as selection. Currently the amount of research on the validity of GBAs in the workforce is limited and more research is needed (Arthur, 2017; Lowman, 2016; McCarthy et al., 2017). It is important to distinguish the different methods for which games and game elements can be used.

Table 1

Elements of GBAs

Element      Definition
Rules        Define how the game responds to the actions (or inaction) of the player
Objectives   Define what the player is meant to be striving for; can be more structured or vague
States       The environment which arises from the connections between rules and objects

Note. Elements from Mislevy et al. (2015).

This is important because these methods function in different ways, and conclusions from research on one method cannot be assumed to generalize to the others. For example, GBAs are distinct from gamification (Cut-E, 2017; Landers, 2015). A commonly cited definition of gamification is "the use of game design elements in nongame contexts" (Deterding, Sicart, Nacke, O'Hara, & Dixon, 2011, p. 1). While there are similarities between game-based assessment and gamification, game-based assessments are administered within the context of a game which has typically been designed to measure knowledge, skills, and abilities. (For a review of gamification see Hamari, Koivisto, & Sarsa, 2014.) GBAs share components (Table 1) with traditional games such as rules, objectives, objects, and states. These shared elements are combined to create GBAs which are used by businesses to make decisions about the selection, development, and hiring of potential employees. How these elements are developed to measure important characteristics needs to be validated to determine whether the intended characteristics are in fact being measured.

While GBAs have increased in popularity in the workplace, research into the validity and reliability of these measures has not led to conclusive evidence (Chamorro-Premuzic, Winsborough, Sherman, & Hogan, 2016; Kim & Shute, 2015; Lowman, 2016). Without clear evidence of validity, or lack thereof, the research has fallen behind the rate of adoption in organizations. Due to this lack of conclusive evidence, it is important that more research be conducted to understand how GBAs can be used in the workplace. This study aims to add to the existing research literature by evaluating the use of a GBA in a specific context (i.e., the measurement of vocational interests). This research study examined the validity of a GBA which was developed for screening employees and job applicants across a variety of characteristics including personality and cognitive ability. The current study examined whether this GBA can also be used to assess the vocational interests of applicants. By evaluating the ability of the GBA to measure this concept, the current study collected validation evidence for the use of GBAs as a measure of vocational interests.

This type of validation evidence is important because there are currently many companies which develop GBAs, including Arctic Shores, Revelian, Pymetrics, and Knack (Barrett, 2018; Lowman, 2016). These companies have been conducting research and gathering evidence for the useful applications of their GBAs (Barrett, 2018; Collmus, Armstrong, & Landers, 2016; Cut-E, 2017; Montefiori, 2016). As the market for these products increases, it is important that evidence supports their use in the workplace. Games are not always an appropriate solution, which is why it needs to be determined when these tools are successful and when they are not. This information is important for the future of scientific inquiry into these assessments as well as necessary for those organizations who currently use GBAs. Organizations may invest a large amount of money into these assessments, which makes it crucial for there to be evidence of the best applications of GBAs. In addition, assessments are meant to measure very specific concepts and have best practices for use. However, organizations do not always follow the cautious standards for use which are dictated by researchers. Therefore, it is important to have scientific research which supports when the use and applications of these new assessments are appropriate, to guide their use in the workplace. In order to review the existing literature on GBAs in the workplace, it is important to first discuss the research which has been conducted with GBAs outside of the workplace.

GBA Outside of Business

In modern research into GBAs, one of the early GBAs to be released was America's Army, a military training game which was developed by and for the United States Army (Lowman, 2016). America's Army provided potential recruits with a look at what was involved in combat scenarios and let them practice skills without physically being at a training center (Nieborg, 2004). The ability for prospective recruits to complete the GBA with only a computer allowed for a wider reach and fewer resources needed to have applicants complete such an assessment (Nieborg, 2004). The GBA functioned as an assessment and training mechanism as well as a recruiting tool for the Army, as it brought increased attention to advertisements (Lowman, 2016). The GBA was reported as a success not only for its ability to increase awareness and improve applicant perceptions but also for its long-term cost reduction (Susca, 2012). The benefits of this GBA mirror the benefits of current GBAs which are used in a variety of organizations. Although the needs of the Army were very specifically tailored to this particular GBA, the body of research benefited from the information about its effectiveness.

Reliability and validity evidence is an important part of research, but this information is not always made available for games which are developed and used in the workplace. However, other industries have published more research on GBAs which can be used to gain insights into this new technology. One such industry is the educational sector. Games have been used extensively in education, which has resulted in a larger body of research into GBAs in education than in the workforce (Landers, 2014; DiCerbo, 2014). To inform the current research study, research developments from the educational sector are discussed to further understand how GBAs can be applied to workplace research. GBAs have been researched in education to measure knowledge, skills, and abilities (KSAs) such as (Shute et al., 2016), persistence (DiCerbo, 2014; Ventura & Shute, 2013), and specific knowledge of a topic (Kim, Almond, & Shute, 2015). GBAs have been the topic of research in education due to the belief that games are inherently motivating (Landers, 2014; Ryan, Rigby & Przybylski, 2006). It is believed that the motivating aspects of games increase the attention of students when learning concepts and completing tasks that are otherwise not perceived as enjoyable (Ryan, Rigby & Przybylski, 2006; Shute et al., 2016). This research into games in education can help inform I/O psychology research into possible applications for GBAs in the workplace. One such aspect from education research which can make an impact on GBAs in the workplace is the importance of GBA design.

Evidence-Centered Design (ECD) is a framework for the development of game-based assessments which is commonly found in the education literature (Shute et al., 2016). By following the framework of ECD, game-based assessment designers increase the validity of assessments by keeping focus on what is meant to be measured at each step of the design process (Shute et al., 2016). ECD has been most commonly included in research literature on the use of game-based assessments in the education sector for assessing the knowledge, skills, and abilities of students (DiCerbo, 2014; Kim & Shute, 2015). ECD is important because it focuses on creating valid assessments from the beginning, which informs game design. ECD guides GBA designers through a process which answers the questions of what, where, and how we are measuring, and how much we need to measure (Kim, Almond & Shute, 2015). GBAs such as Physics Playground, which assessed students' nonverbal understanding of physics, have been developed using ECD for use with students (Kim & Shute, 2015). Pretest and posttest measurements suggest that the ECD framework allowed for a valid design and provided support for the increased motivation which stemmed from in-game rewards (Kim & Shute, 2015). A greater focus on ECD or similar frameworks in the design of game-based assessments for workplace purposes may play a role in establishing the validity and reliability of these tools outside of the educational sector (Shute et al., 2016). Since the focus of the current study is on validation of GBAs, it is important to note that specific strategies and uses of GBAs have already been validated in the education sector. Research into ECD in education has shown that thoughtful design and planning are integral to the creation of valid GBAs. Although research into applications in the workforce is limited, it is important to discuss how GBAs are used in organizations. Understanding how GBAs are used in the workplace provides insight into why validation of these measures is integral.

GBA within Businesses

GBAs are used for many different purposes within businesses. GBAs are created for the development and growth of employees (Chamorro-Premuzic, 2016; Lowman, 2016). Employee development GBAs are used to identify where employees are lacking in KSAs and provide employers with information on what areas need to be focused on for the development of the workforce (Barrett, 2018; Lowman, 2016). One example of a GBA which is used for employee development is from the developer Revelian. Revelian's Theme Park Hero has employees manage a theme park while the game measures participants' problem-solving ability, attention, spatial aptitude, and numerical processing ability. Following game-play, the employer is given information about how the participant scored and which aspects they did well at. Another example of GBAs in the workplace is for the training of new hires. GBAs used for training allow employees to practice important skills in the game environment. An example of this type of GBA, as previously mentioned, was America's Army, which was used in the training of newly recruited soldiers before entering combat training (Lowman, 2016). The GBA involved recruits playing simulations of missions similar to those they may experience in the field. A final application of GBAs is their use in the selection and hiring of new employees. These GBAs may measure psychometric variables, such as personality and cognitive ability, which have a long history of use in selection procedures (Chamorro-Premuzic, 2016; Lowman, 2016). These GBAs are meant to determine the characteristics of applicants, which can be used to estimate which candidates are the best fit for a job position. One example of GBAs for selection is from the company Pymetrics, which markets talent identification game-based assessments to employers (Chamorro-Premuzic, 2016). Pymetrics assessments are tailored to the specific position and measure cognitive abilities and personality characteristics. Another company, Arctic Shores, develops GBAs for recruitment and selection. Their current offering includes three GBAs (Cosmic Cadet, Yellow Hook Reef, and Skyrise City) which measure psychometric variables ranging from personality and cognitive ability to resilience. These GBAs are accessed exclusively on mobile devices and are downloaded through the GooglePlay or Apple App Store. These are just a subset of the wide range of GBAs available for use in the workplace. With testimonial case studies on the merits of their products, such as a 100% increase in hire yield and "record diversity" levels, GBA developers provide optimistic views on the benefits to be gained by using their assessments. Despite the promising reports coming from these applications of GBAs, it is imperative that peer-reviewed research further explore the applications of GBAs in controlled studies.

The provided examples demonstrate the increasing application of GBAs and other game-based methods in the workforce. However, as mentioned, there is a critical need for more scientific evidence on the validity of using game-based methods for high-stakes procedures. To date, little evidence has been gathered on the valid application of GBAs for high-stakes decisions such as selection and promotion decisions. This study seeks to explore this question and to address this gap in the research. Such research is necessary because organizations need to use practices with foundations in scientific evidence which are valid and legally defensible. This is critical for selection because employment decisions in the United States are subject to the Uniform Guidelines created by the Office of Personnel Management (OPM). The Uniform Guidelines outline the validity evidence which is admissible for showing that a selection procedure is appropriate and does not discriminate (OPM, 1979). Following these guidelines is imperative for the ability to defend selection procedures in the event of litigation. The Uniform Guidelines from OPM are reflected in a set of principles published by the Society for Industrial/Organizational Psychology (SIOP), which are used for the validation of selection procedures. These principles state that three types of validation evidence are acceptable for selection tools: "evidence of validity based on relationships with measures of other variables, evidence based on content, and evidence based on the internal structure of the selection procedure" (Society for Industrial and Organizational Psychology, 2003, p. 13). These guidelines emphasize the need for validation of selection measures and highlight the increased potential for litigation; not following the guidelines as described creates a high potential for litigation and loss. The high risk of litigation for organizations makes validation of the many game-based assessments on the market crucial. While there is some research evidence for the validity of GBAs in education and for specific GBAs in the workplace, there is not currently a robust literature of validation evidence for GBAs. The current study aims to be a part of the greater body of research which is needed to minimize the risk of litigation in the high-stakes avenues of selection and hiring.

GBA and Psychometric Measures

While the general characteristics measured using GBAs were discussed briefly above, some of the more common variables currently measured using GBAs are discussed below in more detail. Many of the attributes currently measured through GBAs have long been measured in selection contexts. However, this new method of measuring these characteristics has strengths and weaknesses when compared to the older measures.

Cognitive Ability Tests. GBAs have been designed to be comparable to many of the paper-and-pencil tests for psychometric constructs including cognitive ability, persistence, and personality measures (DiCerbo, Shute, & Kim, 2016; Kim & Shute, 2015; Montefiori, 2016). Cognitive ability tests (also referred to as general mental ability or intelligence tests) are a staple of personnel selection measures and have been found to have predictive validity with job performance at a .5 correlation (Hunter & Hunter, 1984; Schmidt & Hunter, 1998). Research indicates these measures are consistent and valid; however, they also come with consistent drawbacks. Despite their strong predictive validity, cognitive ability tests have long been known to be subject to inequality among racial and ethnic minority test takers (Hunter & Hunter, 1984; Meyer, 2014). This can lead to a phenomenon known as adverse impact, which is defined in the Uniform Guidelines as "a selection rate for any race, sex, or ethnic group which is less than four-fifths (or eighty percent) of the rate for the group with the highest rate" (OPM, 1979, p. 9). Cognitive ability tests can lead to adverse impact by producing disproportionately higher scores for majority members than minorities (Hunter & Hunter, 1984). Despite their high predictive validity with job performance, cognitive ability tests produce high levels of adverse impact and lead to under-representation of minorities. This disproportionate treatment is not only an ethical dilemma but is also illegal; organizations with adverse impact due to cognitive ability measures are likely to face legal ramifications and financial loss. However, GBAs may provide some alleviation to this issue with cognitive ability.
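The four-fifths rule quoted above reduces to a simple arithmetic check. The sketch below illustrates it with invented selection rates (the group labels and numbers are hypothetical, not data from this study):

```python
def adverse_impact_flags(selection_rates):
    """Flag groups whose selection rate falls below four-fifths (80%)
    of the rate for the group with the highest selection rate."""
    highest = max(selection_rates.values())
    return {group: rate < 0.8 * highest for group, rate in selection_rates.items()}

# Hypothetical rates: group A selected at 60%, group B at 30%.
# 0.30 < 0.8 * 0.60 = 0.48, so group B is flagged for adverse impact.
print(adverse_impact_flags({"A": 0.60, "B": 0.30}))
```

Under this rule, a group is flagged whenever its rate falls below 80% of the highest group's rate, regardless of why the disparity arose.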

GBAs provide a unique benefit that may help mitigate the issues raised by using cognitive ability assessments. GBAs of cognitive ability have the potential to minimize the adverse impact of traditional cognitive ability measures due to the way they determine applicant characteristics (Chamorro-Premuzic, 2016; Montefiori, 2016). GBAs infer cognitive ability from behavior rather than from test questions, which are susceptible to inequality. GBAs can also measure facets of intelligence such as reaction time and spatial ability (DiCerbo, 2014; Jaffal & Wloka, 2015). These measures of intelligence are less susceptible to adverse impact because they do not rely as heavily on reading level and vocabulary. The behaviorally based measurement of cognitive ability through GBAs may be able to provide estimates of a factor which is highly correlated with performance and do so in a way which also minimizes adverse impact. This possibility highlights a benefit of the new technology and creates further reasoning for an increased focus on researching GBAs in the workplace. In addition, GBAs are able to measure personality traits in an indirect manner, which may mitigate some of the drawbacks of such assessments as well.

Personality Tests. Personality tests are a very common part of selection batteries (Chamorro-Premuzic, 2016). The Big Five personality inventory (i.e., openness to experience, extraversion, agreeableness, conscientiousness, and emotional stability) is a personality test which is often used in recruitment and selection (Barrick & Mount, 1995). The Big Five personality test is often used in selection contexts because research has found conscientiousness to be a significant predictor of job performance (Barrick & Mount, 1995; Ventura & Shute, 2013). However, like cognitive ability measures, personality measures are subject to a drawback in traditional measurement. Due to the self-report nature of the traditional measures, personality inventories are susceptible to participants faking "good" to appeal to the employer (Montefiori, 2016; Lowman, 2016). A number of meta-analyses have concluded that self-report surveys are particularly susceptible to candidate faking (Viswesvaran & Ones, 1999). This faking reduces the reliability and validity of selection measures and therefore limits their ability to accurately predict performance. GBAs may help mitigate this issue by measuring personality through behavioral actions during the game, which minimizes the ability of applicants to respond to questions in a socially desirable manner (Ryan & Ployhart, 2014). In a recent study, participants were instructed to try to create the illusion of an assigned personality when completing a GBA; participants had difficulty mimicking the qualities of the assigned personality through gameplay behaviors (Montefiori, 2016). This research suggests that GBAs may play a role in reducing faking in selection assessments. In addition to reducing faking, the personality measures contained inside GBAs do not involve asking participants direct questions, which may make applicants feel more at ease with the personality inventories.

Applicant Reactions

Applicant reactions to selection measures can affect the predictability and reliability of the tests. Although they are among the most common selection tools, cognitive ability tests have been categorized as unfavorable from the perspective of applicants (Oostrom & De Soete, 2016). One reason for unfavorable applicant reactions is the length of traditional assessments. Long surveys can lead to applicants experiencing survey fatigue, where applicants become unhappy and fatigued by the survey itself. Survey fatigue can reduce the reliability and predictability of the selection measures. In a study comparing game-based assessments to traditional self-report measures, participants indicated that they perceived the GBA as better than self-report measures. In addition, participants considered companies who used the GBA more innovative than those which used traditional self-report measures (Montefiori, 2016).

Current Study

Validation is a key part of the process of developing a reliable and accurate selection tool (Schmidt & Hunter, 1998; Ventura & Shute, 2013). Practitioners need research to show whether these new tools can provide the information claimed. Such proof is vital and will open GBAs to more widespread adoption. In order for businesses and applicants to gain the benefits of GBAs over traditional measures, there must first be evidence that these benefits are valid and testable. This becomes more important as the responsibilities and power of the positions being selected for increase. The current study focused on the validation of a GBA called Skyrise City™. This GBA was evaluated for use as a measure of vocational interests based on Holland's (1959, 1985) theory of vocational interests. This study evaluated the validity of this new technological measure of the RIASEC interests by comparing the GBA results to those of established measures. In addition, the degree to which positive outcomes associated with job congruence could be predicted using this new measure was also evaluated.

Vocational Interests and Congruence

Applicants may be screened for how well they fit the job they are applying to. This fit, also referred to as congruence, is an indication of how closely an individual is suited to a potential position based on factors such as personality, vocational interests, and skills. Vocational interests are the types of work an individual prefers. Vocational interests can be measured using a set of codes which were developed by Holland (1959). Holland's codes are comprised of six different personality types: Realistic, Investigative, Artistic, Social, Enterprising, and Conventional (commonly referred to as RIASEC for short) (Holland, 1959, 1985; Nauta, 2010). Each individual is assigned a 2- or 3-letter code which corresponds to their strongest vocational preferences. The RIASEC types have been found to generalize across a variety of different groups including high school students, college students, and working adults (Holland, 1985; Nauta, 2010). When this code is compared to the code of the participant's occupation, the RIASEC scores can be used as a measure of vocational congruence (Hutchinson, 2014). Vocational congruence has been linked to several work outcomes such as job satisfaction (Holland, 1985; Marcus & Wagner, 2015; Nauta, 2010), performance (Nauta, 2010; Spokane, Meir, & Catalano), and turnover intention (Holland, 1985; Tracey, 2000).
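Assigning a Holland code from a set of RIASEC scale scores amounts to taking the highest-scoring types in order, as described above. A minimal sketch (the scale scores shown are invented for illustration, not data from this study):

```python
def holland_code(scores, letters=3):
    """Return the Holland code: the top-scoring RIASEC types, highest first."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return "".join(ranked[:letters])

# Hypothetical RIASEC scale scores for one respondent.
scores = {"R": 10, "I": 25, "A": 18, "S": 7, "E": 22, "C": 5}
print(holland_code(scores))  # highest three types: I (25), E (22), A (18) -> "IEA"
```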

Validation. As discussed previously, the goal of the current study was to evaluate the effectiveness of a game-based assessment in indicating an individual's Holland code without directly asking them a series of questions as other RIASEC measures do (Armstrong, Allison & Rounds, 2008; Holland, 1985). In order to test the validity of this new method, three methods of measuring Holland's codes were compared using a multitrait-multimethod matrix (MTMM) (Campbell & Fiske, 1959). The three methods were the GBA and the Interest Item Pool Form A and Form B (Armstrong et al., 2008). The MTMM allows for the evaluation of construct validity through convergent and divergent validity (Campbell & Fiske, 1959). The MTMM matrix was chosen to evaluate validity in lieu of other analyses (e.g., CFA, factor analysis) because gameplay behaviors were used as measurement instead of the traditional items needed to conduct those other methods. Although the MTMM matrix is used less often in recent literature, the method provides a robust evaluation of convergent and divergent validity (Campbell & Fiske, 1959). Recent research has found evidence for the continued reliability of using MTMM for analyzing validity for assessments which are used in human resource management (Hamdani, Valcea & Buckley, 2016).
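The logic of the MTMM evaluation can be sketched with simulated scores (an illustrative simulation, not the thesis data): correlations between the same trait measured by two different methods (the validity diagonal) should exceed correlations between different traits measured by different methods.

```python
import numpy as np

rng = np.random.default_rng(0)
n, traits = 200, 6  # 200 simulated respondents, six RIASEC traits

# Two methods measuring the same latent traits with independent noise.
latent = rng.normal(size=(n, traits))
method_gba = latent + rng.normal(scale=0.5, size=(n, traits))
method_survey = latent + rng.normal(scale=0.5, size=(n, traits))

# Heteromethod block of the MTMM matrix: GBA traits x survey traits.
mtmm = np.corrcoef(method_gba, method_survey, rowvar=False)[:traits, traits:]

convergent = np.diag(mtmm)                        # same trait, different method
discriminant = mtmm[~np.eye(traits, dtype=bool)]  # different trait, different method

# Convergent validities should clearly exceed heterotrait correlations.
print(convergent.mean() > np.abs(discriminant).mean())
```

In these simulated data the convergent correlations cluster well above the heterotrait correlations, which is the Campbell and Fiske (1959) pattern of convergent and divergent evidence described above.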

Hypothesis 1: On average, the RIASEC code derived from the game-based assessment will be congruent with the codes from the paper-and-pencil assessments.

Congruence. The level of congruence between the vocational interests of an individual and an occupation adds a level of depth beyond the individual RIASEC code. Holland (1985) asserted that those with different combinations of RIASEC interest codes, or types, require different environments which match their skills, personality, and competencies. One way of seeking out congruent environments is through career choices. Due to this, it is believed that individuals will seek out careers in positions which match their interests when possible (Holland, 1959, 1985). While not every available position will be completely congruent with an individual's vocational interests, there is a degree of congruence between the job characteristics and the individual's vocational interests. This level of congruence can be calculated using a measurement known as the C Index (Brown & Gore, 1994; Eggerth & Andrew, 2006). The C Index is calculated using the three highest Holland codes for an individual and an occupation. By weighting the codes based on their order, the C Index creates a congruence value which ranges from 0 (no congruence) to 18 (full congruence) (Eggerth & Andrew, 2006). In the present study, the C Index was used to assess the level of congruence between participants and their current occupation. For students, the level of congruence between participants and their preferred job was analyzed to account for the possible distance between vocational interest and job availability for the university students in our sample.
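The C Index computation described above can be sketched as follows. The position weights (3, 2, 1) and the hexagonal match scores (3 for identical letters down to 0 for opposite types) follow the standard description of Brown and Gore's (1994) index; treat this as an illustrative sketch rather than the scoring script used in this study.

```python
HEXAGON = "RIASEC"  # Holland's types in hexagon order

def hexagon_distance(a, b):
    """Shortest distance (0-3) between two types around the RIASEC hexagon."""
    d = abs(HEXAGON.index(a) - HEXAGON.index(b))
    return min(d, len(HEXAGON) - d)

def c_index(person_code, occupation_code):
    """Brown & Gore (1994) C Index for two three-letter Holland codes.
    Ranges from 0 (no congruence) to 18 (full congruence)."""
    total = 0
    for position, (p, o) in enumerate(zip(person_code, occupation_code)):
        weight = 3 - position               # first letter weighted most heavily
        match = 3 - hexagon_distance(p, o)  # 3 same, 2 adjacent, 1 alternate, 0 opposite
        total += weight * match
    return total

print(c_index("RIA", "RIA"))  # identical codes -> 18
print(c_index("RIA", "SEC"))  # opposite types at every position -> 0
```

Because matches at the first letter carry the largest weight, two codes sharing only their primary type still earn substantial congruence.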

Hypothesis 2: On average, the RIASEC code derived from the game-based

assessment will be congruent with the code of the participants’ occupation.

Hypothesis 2a: If the participant is not in their preferred role, the RIASEC code derived from the game-based assessment will be congruent with the code of the participant's preferred occupation.

Work-Related Outcomes. As discussed previously, the level of congruence can have an impact on work outcomes such as job satisfaction, turnover intention, and job performance (Holland, 1985; Nauta, 2010; Spokane, Meir, & Catalano, 2000). However, the existing research which found relationships between congruence and outcomes was conducted using traditional measures of the Holland codes. In order to compare the validity of the GBA-derived RIASEC codes and traditional measures, it is also important to see whether the congruence level from the new measure predicts these work outcomes. In the current study, the relationship between congruence (as measured through the C index) and participants' job satisfaction, job fit, and turnover intention was investigated.

Hypothesis 3: On average, higher levels of congruence will be related to positive work outcomes.

Hypothesis 3a: On average, higher levels of congruence will be positively related to job satisfaction.

Hypothesis 3b: On average, higher levels of congruence will be positively related to job fit.

Hypothesis 3c: On average, higher levels of congruence will be negatively related to turnover intention.

Method

Materials and Apparatus

As part of the study, participants played the game-based assessment Skyrise City™, which required a mobile device. In order to take part in the study, participants downloaded the game from the Apple App Store or the Google Play store onto their smartphone or tablet. Android devices required Android version 4.1 or later; Apple devices required iOS 9.0 or later. All other survey measures were completed online through survey software.

Procedures

A within-subjects study was conducted with participants from a university sample and an online sample. Participants first completed an online consent form, then downloaded the mobile application or were directed to a mobile device on which they could play the game in the research lab. Participants were given an anonymized player key required to start the program. Participants then played the game-based assessment, which took approximately 50 minutes to complete. Once the assessment was finished, participants completed the online survey, which included demographic questions and the measures listed below.

Measures

Demographics. Participants were asked to indicate their gender and ethnicity. Participants were asked whether they were students and whether they were currently employed. Students were asked their year of study (e.g., freshman) and their major at the time of the study. Students were also asked why they chose their major.

Vocational Interests

Game-Based Assessment. A game-based assessment created by Arctic Shores was used to measure vocational interests. The game was developed to measure many facets of personality by recording paradata from participants as they play a game on a mobile smartphone or tablet. Vocational interests were gathered from the game-based assessment Skyrise City™. Through gameplay, the GBA gathered personality information about participants and produced an estimate of the participants' vocational interests based on Holland's (1985) model. The GBA consisted of 8 different levels which involved completing a variety of tasks based on established research paradigms from many subdisciplines of psychology. Participants progressed through the levels and gained rewards for completing them. The GBA took approximately 40 minutes to an hour for participants to complete.

Interest Item Pool. The Interest Item Pool (IIP; Armstrong et al., 2008) is a shortened version of the O*NET Interest Profiler (Lewis & Rivkin, 1999). The IIP was used to measure Holland's (1985) RIASEC interests. The IIP consists of 40 work activities; participants indicate how they would feel about performing each activity on a 7-point Likert-type scale ranging from 1 (Dislike a great deal) to 7 (Like a great deal). The IIP results in a mean score for each of the Holland code areas (Realistic, Investigative, Artistic, Social, Enterprising, and Conventional). The three largest of these means are combined in descending order to determine the RIASEC code for an individual (e.g., RIA). The alphas for each of the Holland codes ranged from .85 to .93.
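The scoring step described above (taking the three highest scale means in descending order) can be sketched in a few lines of code; the scale means below are hypothetical, and the riasec_code helper is an illustrative name of our own, not part of the IIP materials.

```python
def riasec_code(scale_means):
    """Return the three-letter Holland code: the three RIASEC scales
    with the highest mean ratings, in descending order."""
    top_three = sorted(scale_means, key=scale_means.get, reverse=True)[:3]
    return "".join(top_three)

# Hypothetical mean ratings (1-7) for one respondent on the six scales.
means = {"R": 3.1, "I": 5.8, "A": 5.2, "S": 4.0, "E": 2.9, "C": 3.3}
print(riasec_code(means))  # "IAS"
```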

Congruence

C index. The RIASEC codes derived from the different measures were compared using Brown and Gore's (1994) C index. The C index has a possible range of 0 (no congruence) to 18 (perfect congruence) (Brown & Gore, 1994). The C index evaluates the similarity of RIASEC profile codes based on the hexagon structure outlined in Holland's (1985) theory of vocational interests. The C index formula is C = 3(x₁) + 2(x₂) + 1(x₃), in which each xᵢ is equal to 3, 2, 1, or 0 based on the distance between the codes at that position in the hexagon. A 3 is assigned if the codes are the same, 2 for adjacent types, 1 for alternate types, and 0 for opposite types. For example, an ISA individual with an SAE job environment would have the formula C = 3(1) + 2(2) + 1(1) for a score of 8. Holland recommended the C index because it considers the order of codes and is consistent with the theory (Holland, 1985).
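The formula above can be sketched as follows; this is our own illustrative implementation, assuming the standard R-I-A-S-E-C ordering of Holland's hexagon.

```python
HEXAGON = "RIASEC"  # ordering of types around Holland's hexagon

def pair_score(a, b):
    """Score for one code position: 3 = same type, 2 = adjacent,
    1 = alternate, 0 = opposite on the hexagon."""
    d = abs(HEXAGON.index(a) - HEXAGON.index(b))
    return 3 - min(d, 6 - d)  # distance wraps around the hexagon

def c_index(person, job):
    """Brown and Gore's (1994) C index for two three-letter codes;
    ranges from 0 (no congruence) to 18 (identical codes)."""
    return sum(w * pair_score(p, j)
               for w, p, j in zip((3, 2, 1), person, job))

print(c_index("ISA", "SAE"))  # 8, matching the worked example above
print(c_index("RIA", "RIA"))  # 18, full congruence
```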

Outcomes

Job Satisfaction. Job satisfaction was measured with a single item: "All in all, I am satisfied with my job," answered on a 5-point Likert-type scale ranging from 1 (Strongly Disagree) to 5 (Strongly Agree). In accordance with Wanous (1997), a one-item job satisfaction measure was deemed sufficient for measuring this concept.

Job Fit. Job fit was measured using a Person-Job fit scale from Brkich, Jeffs, and Carless (2002). The items were measured on a 5-point scale ranging from 1 (Strongly Disagree) to 5 (Strongly Agree). An example item is "I find my current job motivating." The alpha reliability for this scale was .92.

Job Calling. The Calling and Vocation Questionnaire (CVQ) was used to measure the amount of calling participants felt toward their current career, or toward their career as a whole if they were not employed in a job they felt was part of their career (such as university students working part-time). The scale from Dik, Eldridge, Steger, and Duffy (2012) consists of 24 questions responded to on a 4-point scale ranging from 1 (Not at all true of me) to 4 (Absolutely true of me). An example item is "My career is an important part of my life's meaning." The alpha for this scale was .94.

Turnover Intention. Turnover intention was measured using a 3-item withdrawal intention scale from Cohen (1998), which was developed around the Mobley, Horner, and Hollingsworth (1977) model of employee turnover. The items measured whether participants were thinking of quitting, intended to search, and intended to quit, and were measured on a 5-point scale ranging from 1 (Strongly Disagree) to 5 (Strongly Agree). An example item from this scale is "I think a lot about leaving the organization." The alpha for this scale was .92.

Job Type. Participants were asked to indicate their current job title. If their job title was not within their field or career objectives, they were asked to also indicate their preferred job title. Participants were given categories and subsequent job titles with known RIASEC codes, so that by selecting a job title, participants indicated which type of vocation they worked in according to the Holland codes.

Video Game Experience (VGE). The Video Game Experience scale (Sanchez, Langer, & Herring, 2018) was used to measure how much experience participants had with video games. This was used to evaluate whether level of experience with video games had an effect on the results of the game-based assessment. The scale is measured on a 5-point Likert-type scale ranging from 1 (Strongly Disagree) to 5 (Strongly Agree). An example item from this scale is "I spend many hours each week playing video games." The alpha reliability for this scale was .95.

Results

Participants

Participants were recruited from a pool of students at a Western public university in the United States. After permission was obtained from the Institutional Review Board, 36 (22.2% male, 77.8% female) students participated in the study. The mean age of these student participants was 26 years (SD = 8 years). Of these students, 3.6% reported being enrolled as Freshmen, 6.4% as Sophomores, 14% as Juniors, 14.3% as Seniors, and 20.7% as Graduate Students; 27.8% self-identified as Asian, 27.8% as Hispanic or Latino, and 33.3% as White/Caucasian. Due to the slow recruitment of student participants, a second sample was recruited from the online research sourcing tool Prolific. Through this recruitment platform, 104 (51.9% male, 45.2% female) individuals participated in the study. The mean age of these online participants was 30 years (SD = 10 years), with a range from 18 to 61 years. Of these participants, 2.9% self-identified as African American, 8.7% as Hispanic or Latino, 12.5% as Asian, and 69.2% as White/Caucasian. A summary of participant distribution for both samples is provided in Table 2.

Table 2

Demographic Characteristics

Variable                              Total (n = 140)   University (n = 36)   Prolific (n = 104)
Gender
  Female                              75 (53.6%)        28 (77.8%)            47 (45.2%)
  Male                                62 (44.3%)        8 (22.2%)             54 (51.9%)
  Other                               2 (1.4%)          0 (0.0%)              2 (1.9%)
  Unknown                             1 (0.7%)          0 (0.0%)              1 (1.0%)
Employment Status
  Employed Full-Time                  55 (39.3%)        1 (2.8%)              54 (51.9%)
  Employed Part-Time                  24 (17.1%)        3 (8.3%)              21 (20.2%)
  University Student, Not Employed    30 (21.4%)        18 (50.0%)            12 (11.5%)
  University Student, Employed        31 (22.1%)        14 (38.9%)            17 (16.3%)
Ethnicity
  Asian                               23 (16.4%)        10 (27.8%)            13 (12.5%)
  Black, African American             3 (2.1%)          0 (0.0%)              3 (2.9%)
  Hispanic or Latino/a                19 (13.6%)        10 (27.8%)            9 (8.7%)
  Native Hawaiian/Pacific Islander    1 (0.7%)          1 (2.8%)              0 (0.0%)
  Two or More Races                   8 (5.7%)          3 (8.3%)              5 (4.8%)
  White or Caucasian                  84 (60.0%)        12 (33.3%)            72 (69.2%)
  Other/Non-Disclosed                 2 (1.4%)          0 (0.0%)              2 (1.9%)
Age, M (SD)                           28.4 (9.7)        25.5 (8.1)            30.0 (10.0)

Study Correlations

Before testing the hypotheses, we reviewed the correlations between the continuous demographic variables (i.e., age and video game experience) and scores on the outcome variables examined in the study. Due to the large number of variables in this study (e.g., six variables for each of the vocational interest measures), correlations between variables measured for specific hypotheses are presented in tables later in the results. For the first demographic measure, age was significantly related to turnover intention (r = -.20, p = .02), job fit (r = .25, p = .003), and job satisfaction (r = .24, p = .004), demonstrating that in our sample, older participants tended to report less intention to leave their current job, better fit with their current job, and higher satisfaction with their current job. This makes sense given that most individuals tend to move toward their ideal career as they grow older. Being closer to their ideal career would reasonably be associated with experiencing positive reactions to their job (i.e., fit and satisfaction) and feeling less inclined to leave that position.

For the second demographic measure, video game experience had a significant relationship with one variable, job calling (r = -.18, p = .03). Thus, a participant's video game experience was related to the sense of calling felt for their current position, where people with more video game experience reported lower levels of job calling. One possible explanation for this finding is that video games are often viewed as a leisure activity. Those who are at a stage in their life where they do not perceive themselves to be in their career of choice (i.e., lower sense of calling toward their current career) may also have more time for leisure activities (e.g., students, part-time workers). Although the same could be said for those nearing the end of their career, such as individuals facing retirement, there was not a large representation of late-career individuals in the study samples.

Table 3

Correlations of Demographics and Work Outcomes (n = 140)

Variables                   1       2      3       4      5     6
1. Age                      -
2. Video Game Experience    -.12    -
3. Turnover Intention       -.20*   .11    -
4. Job Fit                  .25**   -.10   -.78**  -
5. Job Calling              -.03    -.18*  .02     .18*   -
6. Job Satisfaction         .24**   -.07   -.67**  .88**  .21*  -
7. GBA RIASEC Code          .06     -.16   -.01    -.03   -.07  .02

* Correlation is significant at the 0.05 level (2-tailed). ** Correlation is significant at the 0.01 level (2-tailed). GBA = Game-based assessment; RIASEC = vocational interest code (realistic, investigative, artistic, social, enterprising, conventional).

There were also several significant relationships between the study outcome variables, with job fit being significantly related to turnover intentions (r = -.78, p < .001), job calling (r = .18, p = .03), and job satisfaction (r = .88, p < .001), and job satisfaction further being significantly related to turnover intentions (r = -.67, p < .001) and job calling (r = .21, p = .01). While these correlations were significant, an important note is that the relationships were in different directions and of different strengths, indicating that these outcomes are likely not measuring the same construct. These findings are provided in Table 3.

Hypotheses Testing

Hypothesis 1 stated that the game-based assessment would produce RIASEC codes similar to those produced by the self-report measures. Findings demonstrated that the game-based format of the vocational interest measure did not converge with either self-report version. To evaluate the validity of the GBA as a measure of vocational interest, convergent and discriminant validity were analyzed using an MTMM matrix. The MTMM matrix, shown in Table 4, displays the correlations for the six Holland codes (R, I, A, S, E, C) across the three measures of vocational interests. As seen in Table 4, none of the scores for the codes converged with self-report 1 (IIP-A) or self-report 2 (IIP-B), respectively: Realistic (r = .08; r = .06), Investigative (r = .08; r = .15), Artistic (r = .01; r = -.01), Social (r = .10; r = -.06), Enterprising (r = .01; r = -.04), and Conventional (r = .13; r = -.04). Discriminant validity (monotrait correlations greater than heterotrait correlations) was also not found for the GBA. In contrast to our expectation, the two significant correlations found between the GBA method and the traditional self-report 1 method were between different traits, R-C (r = .19, p = .03) and S-A (r = .23, p = .005). Thus, no evidence was found that the GBA form of vocational interests functions similarly to the self-report measures of vocational interest.
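The convergent and discriminant checks described above can be sketched as follows; the mtmm_convergent function and the simulated scores are our own illustration (assuming NumPy and score matrices with one column per RIASEC trait), not the analysis code used in the study.

```python
import numpy as np

def mtmm_convergent(method_a, method_b, traits="RIASEC"):
    """For each trait, return the convergent (monotrait-heteromethod)
    correlation and the largest heterotrait-heteromethod correlation
    involving that trait; convergent validity requires the former to
    exceed the latter."""
    out = {}
    for i, trait in enumerate(traits):
        convergent = np.corrcoef(method_a[:, i], method_b[:, i])[0, 1]
        cross = max(np.corrcoef(method_a[:, i], method_b[:, j])[0, 1]
                    for j in range(len(traits)) if j != i)
        out[trait] = (convergent, cross)
    return out

# Hypothetical example: two methods measuring the same six traits.
rng = np.random.default_rng(0)
gba_scores = rng.normal(size=(140, 6))
sr_scores = gba_scores + rng.normal(scale=2.0, size=(140, 6))  # noisy repeat
for trait, (conv, cross) in mtmm_convergent(gba_scores, sr_scores).items():
    print(f"{trait}: convergent r = {conv:.2f}, max cross-trait r = {cross:.2f}")
```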

Table 4

Multitrait-Multimethod Matrix for Vocational Interest Measures (n = 140)

           GBA                                 Self-Report 1                       Self-Report 2
        R    I    A    S    E    C    |   R    I    A    S    E    C    |   R    I    A    S    E    C
GBA R    -
    I  .11    -
    A -.10  .12    -
    S  .24  .16  .43    -
    E  .00  .01  .00  .17    -
    C  .09  .10 -.26 -.15  .11    -
SR1 R  .08  .04 -.07 -.04  .11  .03
    I -.09  .08  .02  .06 -.03 -.04  |  .14 (.88)
    A -.03  .04  .01  .23 -.04 -.02  |  .17  .51 (.85)
    S -.05 -.11 -.09  .10  .08  .06  |  .08  .37  .29 (.87)
    E -.08  .12 -.06 -.01  .01  .08  |  .19  .25  .34  .32 (.87)
    C  .19  .11 -.14 -.07  .04  .13  |  .32  .06 -.04  .13  .40
SR2 R  .06  .11 -.14 -.13  .04  .08  |  .75  .15  .18  .06  .40  .38  | (.89)
    I -.14  .15  .05  .06 -.01  .05  |  .15  .73  .34  .35  .42  .19  |  .20 (.89)
    A -.14  .07 -.01  .16 -.10  .03  |  .19  .43  .84  .20  .34 -.03  |  .23  .39 (.85)
    S  .00 -.03 -.06 -.06  .13  .01  |  .05  .27  .23  .83  .26  .07  |  .10  .31  .19 (.90)
    E -.12  .08  .03  .01 -.04  .05  |  .18  .25  .35  .27  .78  .38  |  .38  .33  .33  .26 (.87)
    C  .08  .08 -.12 -.10  .04  .12  |  .24  .02 -.09  .08  .33  .86  |  .31  .21 -.05  .05  .39 (.93)

Note: Correlations at or above r = .17 are significant at .05; at or above r = .23 are significant at .01. Convergent validity values are the monotrait-heteromethod correlations (e.g., GBA R with Self-Report 1 R). Cronbach's alphas for the self-report scales are along the diagonal in parentheses. R = Realistic, I = Investigative, A = Artistic, S = Social, E = Enterprising, C = Conventional.

Hypothesis 1 was further evaluated for congruence (i.e., how similar the RIASEC codes were between the GBA and self-report scales) by analyzing the level of congruency between the codes derived from each measure. Congruence between the RIASEC codes derived from the measures was evaluated using Brown and Gore's (1994) C index. Means and standard deviations for the C index are reported in Table 5. The range of congruence between the GBA and both self-report measures was 0-18, indicating that for some participants the GBA and the self-report scales generated the exact same RIASEC code, which results in a C index score of 18. The level of congruence between the GBA-derived RIASEC codes and the self-report scales was moderate (mean greater than 6 but less than 13), with the GBA showing higher congruence with the IIP-A (M = 9.07, SD = 3.64) than the IIP-B (M = 8.68, SD = 3.93). However, this difference was non-significant (t(139) = 1.11, p = .27). The two self-report scales were more congruent with each other than with the GBA, but the congruence was not above the baseline (M of 13 or more) for high congruence (M = 11.70, SD = 4.25). This evaluation of congruence was consistent with the findings demonstrated by the MTMM; the moderate mean level of congruence between the GBA and the self-report scales is not sufficient to establish construct validity of the new measure. Hypothesis 1 was not supported, as evidence did not support construct validity between the GBA and traditional self-report RIASEC measures, and the GBA did not produce similar scores to the self-report measures.

Table 5

Congruence of RIASEC Codes (n = 140)

Measures          M      SD     Range
C index
  GBA-IIP-A       9.19   3.58   0-18
  GBA-IIP-B       9.10   3.70   0-18
  IIP-A-IIP-B     9.88   4.09   0-18

Note: C index scores: <6 = low, 6-13 = moderate, >13 = high congruence. GBA = Game-based assessment; IIP-A = Interest Item Pool Version A; IIP-B = Interest Item Pool Version B (Armstrong et al., 2008).

Hypothesis 2 stated that the GBA would produce RIASEC codes for individuals which were congruent with the participants' current occupation. Congruence was again measured using the C index to determine the level of similarity between the RIASEC code for the individual and the code for the current job type. Findings demonstrated that the GBA and self-report measures produced results which differed from the participants' current occupations at a similar level. Means and standard deviations for the congruence scores between each of the measures of vocational interest are included in Table 6. The average C index for the GBA and job was again moderate (M = 9.10, SD = 3.79). Similar means were found for the self-report measures and the job codes: IIP-A (M = 9.19, SD = 3.58) and IIP-B (M = 9.88, SD = 4.09). Although the measures were not congruent to the extent predicted (i.e., all measures of congruence were below the established baseline of M = 13), on average all three scales moderately matched the RIASEC job codes. This indicates that all three scales were able to predict the RIASEC codes of the participants' jobs to generally the same extent. The results demonstrate partial support for Hypothesis 2, as there was a moderate level of congruence between the GBA code and job code. In addition, the differences between the vocational interest codes and job interest codes were similar across all the measures of vocational interest. The mean level of congruence between the GBA and job codes was not significantly different from the job congruence of the IIP-A (t(139) = .195, p = .85) or the IIP-B (t(139) = -1.51, p = .13).

Table 6

Congruence of Individual and Occupational Codes (n = 140)

Measures       M      SD     Range
C index
  Job-GBA      9.19   3.58   0-18
  Job-IIP-A    9.10   3.70   0-17
  Job-IIP-B    9.88   4.09   0-18

Note: GBA = Game-based assessment; IIP-A = Interest Item Pool Version A; IIP-B = Interest Item Pool Version B (Armstrong et al., 2008).

Hypothesis 3 stated that greater congruence between individual RIASEC codes and job codes would be significantly related to work outcomes. The expectation was that individuals whose assessed ideal job aligned more closely with their current job would report more positive work experiences. Findings indicated that the level of congruence between the GBA and job codes was not related to greater levels of positive work outcomes. Across the vocational interest measures, there was no evidence that congruence and work outcomes were related. As displayed in Table 7, correlations between work outcomes and congruence were small for all three measures. Only one relationship between congruence and outcomes was significant: the congruence level between job and IIP-A codes was positively related to job fit (r = .22, p = .01). Predictive validation was not found for these measures; however, the patterns of correlations across measures indicate that the GBA performed similarly to both self-report measures, as was seen with Hypothesis 2. Hypothesis 3 was not supported because the level of congruence between job and GBA did not produce significant relationships with turnover intention, job fit, job calling, or job satisfaction. The job congruency of the vocational interest measures was not strongly related to outcomes, but the different methods resulted in similar patterns of non-significant relationships with these variables.

Despite the lack of support for the study hypotheses, we were greatly encouraged by the results, which demonstrated similar behavior between the GBA measure and self-report measures of vocational interest. We discuss this similarity and the implications it may have for future research, and further explore what other factors may help explain these results.

Table 7

Congruence Correlations with Work Outcomes (n = 140)

Variables               1       2      3     4     5     6
1. Turnover Intention   -
2. Job Fit              -.78**  -
3. Job Calling          .02     .18*   -
4. Job Satisfaction     -.67**  .88**  .21*  -
5. C index Job-GBA      -.10    .04    .00   -.01  -
6. C index Job-IIP-A    -.10    .22*   .00   .15   -.02  -
7. C index Job-IIP-B    -.10    .20    .10   .13   .00   .37**

* Correlation is significant at the 0.05 level (2-tailed). ** Correlation is significant at the 0.01 level (2-tailed). GBA = Game-based assessment; IIP-A = Interest Item Pool Version A; IIP-B = Interest Item Pool Version B (Armstrong et al., 2008).

Discussion

In this study, the validity of vocational interests as measured by a GBA was compared to vocational interests measured through self-report. This was evaluated with student and online samples of participants who completed both the GBA and established measures of vocational interests (Armstrong et al., 2008). The congruency of scores from the two types of measures was evaluated with an MTMM matrix and with Brown and Gore's (1994) C index of congruency of vocational interest scores. The majority of hypotheses tested were not supported; however, partial support for some of the hypotheses provided more information about how GBAs should be conceptualized and validated.

Hypothesis 1 tested whether the GBA measure of vocational interests produced the same results as two traditional self-report measures. Convergence between the two types of measures was not found. This absence of convergence indicates that the different methods of assessment are not measuring the same things and therefore cannot be viewed as parallel forms. The GBA thus cannot be used in the same way as the self-report measures of vocational interest and be expected to perform in the same way. GBAs have different characteristics which make it unrealistic to expect them to function as replacements for the self-report measures, but this does not indicate that GBAs are not useful. Even though the GBA does not behave the same as the self-report measures, it is important to investigate how the GBA does perform and how this different performance might still be useful.

One key aspect and benefit of GBAs is that measurement is made through behavioral actions rather than self-reported items. Self-report assessments have been studied in regard to their susceptibility to biased responses due to participants responding in accordance with social desirability (Moorman & Podsakoff, 1992). This is of concern for organizational research, as participants are motivated in high-stakes situations such as selection to respond as they believe the organization wants them to respond (Donaldson & Grant-Vallone, 2002; Ryan & Ployhart, 2014). As GBAs do not measure responses through self-report, it is likely that there will be differences in scoring when the ability to respond according to social desirability is stripped away (Landers, 2015). This difference in measurement may limit the ability to find convergence between self-report and GBA measures in the current study, as traditional measures of vocational interests are based on self-report (Armstrong et al., 2008; Holland, 1985; Rounds, Su, Lewis, & Rivkin, 2010).

Holland's (1985) theory of vocational interests began with the use of self-report to determine such interests; however, the GBA used in the current study determined vocational interests through the measurement of personality variables. This is consistent with Holland's theory, in which he described vocational inventories as personality inventories (Holland, 1959, 1985). Holland referred to vocational interests as "the expression of personality in work" (Holland, 1985, p. 7). In line with this conceptualization of vocational interests, research has focused on how the Holland codes align with personality inventories such as the Big Five (Armstrong & Anthoney, 2009; Costa, McCrae, & Holland, 1985; Nauta, 2010; Wille, De Fruyt & Kozlowski, 2014). Therefore, the use of personality measures to extract vocational interest profiles may produce different results than the traditional method of self-report but is still in line with Holland's original conception of vocational interests as expressions of personality. In this way, the GBA may produce vocational interest profiles which are stripped of the social desirability of any type of work and in line with the personality of the individual, which may explain the partial support found for Hypothesis 2.

Hypothesis 2 investigated the ability of the GBA to produce vocational interest profiles similar to those of the participants' current job environment. On average, the GBA vocational interest profiles were moderately congruent with those of the job. Interestingly, the traditional self-report measures also produced moderate levels of congruence between individual and job RIASEC profiles. Neither type of measure produced high levels (above 13 on the C index) of congruence with the job profiles. Despite the lack of convergence in Hypothesis 1, the GBA and traditional measures behaved similarly in their level of congruence with the vocational interests of the environment. This finding highlights the issue of using the self-report measures as the gold standard by which to evaluate the validity of the GBA. In the absence of a true measure of vocational interest, the comparison was made using established measures. However, all samples have error which can affect results, including the samples used to develop the established measures of vocational interest. Both the traditional and GBA measures feature some degree of error when measuring vocational interests, but the similarity of the job congruence among these measures indicates that the GBA may still be a useful measure of this construct. Landers (2015) posited that many different forms of validation evidence need to converge for the validity of a GBA to be determined. In this study, the evidence gathered from the MTMM matrix was not sufficient to show validity, but the similarity of the job congruence indicates that the measure behaves similarly to current practice. Further evidence for the similarity between the GBA and established measures was found when evaluating the relationship between congruence and work outcomes.

Hypothesis 3 evaluated whether the level of job congruence of the GBA-derived RIASEC profiles was related to job satisfaction, job calling, turnover intention, and job fit. Support was not found for this hypothesis, as the level of congruence between the GBA and job was not significantly related to the work outcomes measured. Except for one significant relationship between the job congruence of the IIP-A and job fit, the level of congruence for both types of measures was not related to potential outcomes. However, these findings reinforce those of Hypothesis 2, as the new measure performed similarly to the self-report measures. Two possible explanations arise for the lack of relationships found. The first is that the absence of significant relationships may indicate that the level of congruence between individual and job interest codes is not related to work outcomes. However, such relationships have been documented at length in the person-job fit literature (Marcus & Wagner, 2015; Spokane, Meir, & Catalano, 2000; Tranberg, Slane & Ekeberg, 1993; Wille, Tracey, Feys & De Fruyt, 2014). The second possible explanation is that in this sample, student and online participants may not have been as motivated and reliable as intended. However, research into the motivation and validity of student (Greenberg, Gordon, Slade & Schmitt, 1987; Landers & Behrend, 2015; Ward, 1993) and online samples (Behrend, Sharek, Meade, & Wiebe, 2011; Buhrmester, Kwang, & Gosling, 2011; Peer, Brandimarte, Samat, & Acquisti, 2017) has shown the usefulness of these convenience samples for research purposes.

Any of these explanations is possible, as are other explanations which have not been accounted for. It is most likely that components of each of these affected the findings of the current study. The samples may have had inherent differences which limited the ability to find conclusive results at the outset. In addition, the measures used have at least some amount of error, which then affects the results and the conclusions which can be drawn.

Limitations

The current study was limited by the small sample size obtained. Larger samples may minimize the effect of error in sampling or measurement and allow for more confidence in the conclusions from the analyses. In addition, the use of a student population which contained mostly psychology majors may have affected the results. This is also true of the use of an online crowdsourced sample, which may have unknown variations in the population which affected the results. The current sample also featured a majority of Caucasian participants and as such may not generalize to a more diverse population.

Implications

The findings from the current study support the notion that there is value in the use of GBAs. The GBA performed similarly to current measures used to determine vocational interests in its ability to produce individual results congruent with the job environment and in the relationships found between congruence and outcomes. However, the lack of convergence in the MTMM matrix indicates that the GBA cannot be used as a parallel form of the self-report measures. The use of GBAs needs to be reconceptualized: rather than replacements for traditional methods, which would require validity evidence in line with a parallel form, GBAs need to be used in a way which references their similarities to and differences from the traditional measures in performance. In line with these differences in performance, validation efforts need to take into consideration that these measures do not perform the same as self-report measures.

Current validation strategies may not be best suited to evaluating these new assessments, and a broader shift may be needed to develop a method better suited to evaluating GBAs. Although such a change may seem daunting and unlikely, it would parallel events that have already occurred in the field of psychological research. In the 19th and early 20th centuries, there was a divisive argument among academics regarding the use of Bayesian statistics (Zyphur & Oswald, 2012). The methods were often ignored or directly denounced as ineffective (Andrews & Baguley, 2013; Pearson, 1920). However, Bayesian statistics have since experienced a shift in attention and acceptance that runs counter to the history of the methods in psychological research (Andrews & Baguley, 2013; Zyphur & Oswald, 2012). This drastic change could foreshadow a future change in the guidelines for establishing the validity of new measures such as GBAs. The well-established methods currently in use may prove ineffective for the new assessments, and a new validation strategy may better aid in the development and evaluation of such measures.

However, before such validation strategies develop, it remains important to validate the new assessments currently in use in organizations. The current study builds upon the small body of research into the validity of GBAs and gives further evidence that more validation efforts are needed. The lack of convergence, together with the similarity in patterns of results, leaves questions about the validity and use of GBAs open for future inquiry.

References

Andrews, M., & Baguley, T. (2013). Prior approval: The growth of Bayesian methods in psychology. The British Journal of Mathematical and Statistical Psychology, 66(1), 1-7.

Armstrong, P., Allison, W., & Rounds, J. (2008). Development and initial validation of brief public domain RIASEC marker scales. Journal of Vocational Behavior, 73(2), 287-299.

Armstrong, P., & Anthoney, S. (2009). Personality facets and RIASEC interests: An integrated model. Journal of Vocational Behavior, 75(3), 346-359.

Arthur, W. (2017). The impact of emerging technologies on selection models and research: Mobile devices and gamification as exemplars. In J. L. Farr & N. T. Tippins (Eds.), Handbook of employee selection (2nd ed., pp. 967-986). New York: Taylor & Francis/Psychology Press.

Barrett, P. (2018). The EFPA test-review model: When good intentions meet a methodological thought disorder. Behavioral Sciences, 8(1), 5.

Barrick, M., & Mount, M. (1991). The Big Five personality dimensions and job performance: A meta-analysis. Personnel Psychology, 44(1), 1-26.

Behrend, T., Sharek, S., Meade, D., & Wiebe, J. (2011). The viability of crowdsourcing for survey research. Behavior Research Methods, 43(3), 800-813.

Brkich, M., Jeffs, D., & Carless, S. A. (2002). A global self-report measure of person-job fit. European Journal of Psychological Assessment, 18(1), 43-51. doi:10.1027//1015-5759.18.1.43

Brown, S. D., & Gore, P. A. (1994). An evaluation of interest congruence indices: Distribution characteristics and measurement properties. Journal of Vocational Behavior, 45(3), 310-327.

Buhrmester, M., Kwang, T., & Gosling, S. (2011). Amazon's Mechanical Turk: A new source of inexpensive, yet high-quality, data? Perspectives on Psychological Science, 6(1), 3-5.

Campbell, D., & Fiske, D. (1959). Convergent and discriminant validation by the multitrait-multimethod matrix. Psychological Bulletin, 56(2), 81-105.

Chamorro-Premuzic, T. (2016). New talent signals: Shiny new objects or a brave new world? Industrial and Organizational Psychology, 9(3), 20.

Chapman, D. S., & Webster, J. (2003). The use of technologies in the recruiting, screening, and selection processes for job candidates. International Journal of Selection & Assessment, 11(2/3), 113-120. doi:10.1111/1468-2389.00234

Cohen, A. (1998). An examination of the relationship between work commitment and work outcomes among hospital nurses. Scandinavian Journal of Management, 14(1), 1-17.

Collmus, A., Armstrong, M., & Landers, R. (2016). Game-thinking within social media to recruit and select job candidates. In Social media in employee selection and recruitment: Theory, practice, and current challenges (pp. 103-124). doi:10.1007/978-3-319-29989-1_6

Costa, P. T., Jr., McCrae, R. R., & Holland, J. L. (1984). Personality and vocational interests in an adult sample. Journal of Applied Psychology, 69, 390-400.

Cut-e (2017). White paper: Ahead of the game. Best practice in games, gamification and game-based assessment. Retrieved from https://www.cut-e.com/online-assessment/gamification-in-recruitment/httpwwwcut-ecomgamification-white-paper/

Dale, S. (2014). Gamification. Business Information Review, 31(2), 82-90.

DiCerbo, K. E. (2014). Game-based assessment of persistence. Journal of Educational Technology & Society, 17(1), 17-28.

DiCerbo, K., Shute, V., & Kim, Y. (2016). The future of assessment in technology-rich environments: Psychometric considerations. Learning, Design, and Technology, 1-21.

Dik, B. J., Eldridge, B. M., Steger, M. F., & Duffy, R. D. (2012). Development and validation of the Calling and Vocation Questionnaire (CVQ) and Brief Calling Scale (BCS). Journal of Career Assessment, 20(3), 242-263.

Donaldson, S., & Grant-Vallone, E. (2002). Understanding self-report bias in organizational behavior research. Journal of Business and Psychology, 17(2), 245-260.

Greenberg, J., Gordon, M., Slade, L., & Schmitt, N. (1987). The college sophomore as guinea pig: Setting the record straight/Student guinea pigs: Porcine predictors and particularistic phenomena. Academy of Management Review, 12(1), 157.

Hamari, J., Koivisto, J., & Sarsa, H. (2014). Does gamification work? A literature review of empirical studies on gamification. Proceedings of the 47th Hawaii International Conference on System Sciences (HICSS), 3025-3034.

Hamdani, M., Valcea, S., & Buckley, M. (2016). The MTMM matrix approach: Implications for HRM research. Personnel Review, 45(6), 1156-1175.

Holland, J. L. (1959). A theory of vocational choice. Journal of Counseling Psychology, 6(1), 35-45.

Holland, J. L. (1985). Making vocational choices: A theory of vocational personalities and work environments (2nd ed.). Englewood Cliffs, NJ: Prentice-Hall.

Hunter, J. E., & Hunter, R. F. (1984). Validity and utility of alternative predictors of job performance. Psychological Bulletin, 96(1), 72-98.

Hutchinson, T. (2014). Assessing the congruence of worker and workplace using the proximities of their RIASEC types. Australian Journal of Career Development, 23(3), 133-139.

Jaffal, Y., & Wloka, D. (2015). Employing game analytics techniques in the psychometric measurement of game-based assessments with dynamic content. Journal of E-Learning and Knowledge Society, 11(3), 101-115.

Kim, Y., Almond, R., & Shute, V. (2015). Applying evidence-centered design for the development of game-based assessments in Physics Playground. International Journal of Testing, 1-22.

Kim, Y., & Shute, V. (2015). The interplay of game elements with psychometric qualities, learning, and enjoyment in game-based assessment. Computers & Education, 87, 340-356.

Landers, R. (2014). Developing a theory of gamified learning: Linking serious games and gamification of learning. Simulation & Gaming, 45(6), 768.

Landers, R. N. (2015). An introduction to game-based assessment: Frameworks for the measurement of knowledge, skills, abilities and other human characteristics using behaviors observed within videogames. International Journal of Gaming and Computer-Mediated Simulations, 7(4), iv-viii.

Landers, R., & Behrend, T. (2015). An inconvenient truth: Arbitrary distinctions between organizational, Mechanical Turk, and other convenience samples. Industrial and Organizational Psychology, 8(2), 142-164.

Lowman, G. H. (2016). Moving beyond identification: Using gamification to attract and retain talent. Industrial and Organizational Psychology, 9(3), 677-682. doi:10.1017/iop.2016.70

Marcus, B., & Wagner, U. (2015). What do you want to be? Criterion-related validity of attained vocational aspirations versus inventoried person-vocation fit. Journal of Business and Psychology, 30(1), 51-62.

McCarthy, J., Bauer, T., Truxillo, D., Anderson, N., Costa, A., & Ahmed, S. (2017). Applicant perspectives during selection: A review addressing "so what?," "what's new?," and "where to next?". Journal of Management, 43(6), 1693-1725.

Mislevy, R. J., Oranje, A. I., Bauer, M., DiCerbo, K., Von Davier, A., John, M., & Corrigan, S. (2015). Psychometrics and game-based assessment. In Technology and testing: Improving educational and psychological measurement (pp. 23-48). Taylor and Francis.

Montefiori, L. (2016). Game-based assessment: Face validity, fairness perception, and impact on employer's brand image. Assessment & Development Matters, 8(2), 19-22.

Moorman, R., & Podsakoff, P. (1992). A meta-analytic review and empirical test of the potential confounding effects of social desirability response sets in organizational behaviour research. Journal of Occupational and Organizational Psychology, 65(2), 131-149.

Morelli, N., Mahan, R., & Illingworth, A. (2014). Establishing the measurement equivalence of online selection assessments delivered on mobile versus nonmobile devices. International Journal of Selection and Assessment, 22(2), 124-138.

Nauta, M. M. (2010). The development, evolution, and status of Holland's theory of vocational personalities: Reflections and future directions for counseling psychology. Journal of Counseling Psychology, 57(1), 11-22. doi:10.1037/a0018213

Nieborg, D. B. (2004). America's Army: More than a game. In W. C. Kriz & T. Eberle (Eds.), Transforming knowledge into action through gaming and simulation. Munich: SAGSAGA.

Oostrom, J. K., & De Soete, B. (2016). Ethnic differences in perceptions of cognitive ability tests: The explanatory role of self-serving attributions. International Journal of Selection and Assessment, 24(1), 14-23. https://doi.org/10.1111/ijsa.12125

Pearson, K. (1920). The fundamental problem of practical statistics. Biometrika, 13(1), 1-16.

Peer, E., Brandimarte, L., Samat, S., & Acquisti, A. (2017). Beyond the Turk: Alternative platforms for crowdsourcing behavioral research. Journal of Experimental Social Psychology, 70, 153-163.

Rounds, J., Su, R., Lewis, P., & Rivkin, D. (2010). O*NET Interest Profiler Short Form psychometric characteristics: Summary and supporting evidence. Department of Labor O*NET Resource Center.

Rupp, A. A., Gushta, M., Mislevy, R. J., & Shaffer, D. W. (2010). Evidence-centered design of epistemic games: Measurement principles for complex learning environments. The Journal of Technology, Learning and Assessment, 8(4), 3-41.

Ryan, A. M., & Ployhart, R. E. (2014). A century of selection. Annual Review of Psychology, 65, 693-717. doi:10.1146/annurev-psych-010213-115134

Ryan, R. M., Rigby, C. S., & Przybylski, A. K. (2006). The motivational pull of video games: A self-determination theory approach. Motivation and Emotion.

Schmidt, F. L., & Hunter, J. E. (1998). The validity and utility of selection methods in personnel psychology: Practical and theoretical implications of 85 years of research findings. Psychological Bulletin, 124(2), 262-274. doi:10.1037/0033-2909.124.2.262

Shute, V. J., Wang, L., Greiff, S., Zhao, W., & Moore, G. (2016). Measuring problem solving skills via stealth assessment in an engaging video game. Computers in Human Behavior, 63, 106-117.

Spokane, A. R., Meir, E. I., & Catalano, M. (2000). Person-environment congruence and Holland's theory: A review and reconsideration. Journal of Vocational Behavior, 57(2), 137-187.

Susca, M. (2012). Why we still fight: Adolescents, America's Army, and the government gaming nexus. Global Media Journal, 12, 1-16.

Tracey, T. (2007). Moderators of the interest congruence-occupational outcome relation. International Journal for Educational and Vocational Guidance, 7(1), 37-45.

Tranberg, M., Slane, S., & Ekeberg, S. E. (1993). The relation between interest congruence and satisfaction: A meta-analysis. Journal of Vocational Behavior, 42, 253-264.

Ventura, M., & Shute, V. (2013). The validity of a game-based assessment of persistence. Computers in Human Behavior, 29(6), 2568-2572.

Wanous, J. P. (1997). Overall job satisfaction: How good are single-item measures? Journal of Applied Psychology, 82(2), 247-252.

Ward, E. (1993). Generalizability of psychological research from undergraduates to employed adults. The Journal of Social Psychology, 133(4), 513-519.

Wille, B., & De Fruyt, F. (2014). Vocations as a source of identity: Reciprocal relations between Big Five personality traits and RIASEC characteristics. Journal of Applied Psychology.

Wille, B., Tracey, T., Feys, M., & De Fruyt, F. (2014). A longitudinal and multimethod examination of interest-occupation congruence within and across time. Journal of Vocational Behavior, 84(1), 59-73.