A VALIDATION OF A GAME-BASED ASSESSMENT FOR THE MEASUREMENT OF VOCATIONAL INTEREST
A Thesis submitted to the faculty of San Francisco State University In partial fulfillment of the requirements for the Degree
Master of Science
In
Psychology: Industrial/Organizational Psychology
by
Hope Elizabeth Wear
San Francisco, California
May 2018

Copyright by Hope Elizabeth Wear
2018

CERTIFICATION OF APPROVAL
I certify that I have read A Validation of a Game-Based Assessment for the Measurement of Vocational Interest by Hope Elizabeth Wear, and that in my opinion this work meets the criteria for approving a thesis submitted in partial fulfillment of the requirements for the degree Master of Science in Psychology: Industrial/Organizational Psychology at San
Francisco State University.
Chris Wright, Ph.D.
Professor

A VALIDATION OF A GAME-BASED ASSESSMENT FOR THE MEASUREMENT OF VOCATIONAL INTEREST
Hope Elizabeth Wear San Francisco, California 2018
Game-based assessments (GBAs) are a new type of technologically-based assessment tool which allow for traditional selection concepts to be measured from gameplay behaviors (e.g., completing levels by following game rules). GBAs use game elements to create an immersive environment which changes how assessments are traditionally measured but retains the psychometric properties within the game to assess a variety of knowledge, skills and abilities. In this study we examined the validity of a GBA for use as a measure of RIASEC vocational interests from Holland (1985). Participants played the GBA as well as completed traditional measures of RIASEC interests. We compared the scores from participants for congruence across the different measures using a multitrait-multimethod matrix (MTMM). The results from this study will be used to provide validation evidence for the use of GBA as a method to measure vocational interests and to extend the RIASEC literature to the area of technological assessments.
I certify that the Abstract is a correct representation of the content of this thesis.

Date

ACKNOWLEDGEMENTS
First, I would like to thank my advisor, Dr. Diana Sanchez, for all her help and guidance throughout this project. I am grateful Dr. Sanchez came and joined us at San Francisco
State and took on a thesis project in her first year as a faculty member. In addition, I would like to thank Dr. Chris Wright for all his helpful insights and lessons about selection and assessments throughout the program. I would also like to thank the team at Arctic Shores,
Lara, Liam and Kirsty, who made this project possible and for all their help throughout the process. I would also like to thank my family for all their support over the years which allowed me to get to this point. Finally, I would like to thank my cohort members who have been supportive and helpful throughout this program. It was truly great spending the last two years with you all.
TABLE OF CONTENTS
List of Tables
List of Figures
Introduction
Game-Based Assessments
Psychometric Measures
Cognitive Ability Tests
Personality Tests
Applicant Reactions
Method
Results
Discussion
References

LIST OF TABLES
1. Elements of GBAs
2. Demographic Characteristics
3. Correlations of Demographic and Work Outcomes
4. MTMM Matrix for Vocational Interest Measures
5. Congruence of RIASEC Codes
6. Congruence of Individual and Occupational Codes
7. Congruence Correlations with Work Outcomes
A Validation of a Game-Based Assessment for the Measurement of Vocational Interest
Technology is changing the world of selection, hiring and assessment (Arthur,
2017; Chapman & Webster, 2003). The way HR practitioners view technological tools has shifted. In a 2016 survey, 75% of HR practitioners indicated they would consider using gamification or game-based assessment for screening and selection (Cut-E, 2017).
Technology has also changed how assessments are conducted as computer- and mobile-based assessments increase in usage (Arthur, 2017; Lowman, 2016; Morelli, Mahan &
Illingworth, 2014). Game-based methods (i.e., using games for training or evaluation) are
one example of how technology is changing the nature of work. The gamification and
game-based industry was predicted to be worth $5.5 billion by 2018 (Dale, 2014). Game-
based assessment (GBA), in particular, is being used more by companies to evaluate a
number of different aspects of the workplace, such as individual characteristics and job
skills (Landers, 2014). GBAs are defined as evaluations that use game elements to create
an immersive environment and allow the individual to interact with and demonstrate
certain knowledge or skills (Landers, 2015). Using a game to evaluate an individual
retains the psychometric properties (i.e., the ability of the assessment to measure qualities
such as personality characteristics) within the game to assess a variety of knowledge,
skills and abilities (Montefiori, 2016). The increased usage of GBAs in the workforce
increases the need for evidence that these new methods are valid and appropriate for such
uses. As these assessments are new, validation evidence is needed or there could be legal repercussions for companies that use these assessments in high-stakes contexts such as selection. Currently, the amount of research on the validity of GBAs in the workforce is limited and more research is needed (Arthur, 2017; Lowman, 2016; McCarthy et al., 2017). It is important to distinguish among the different ways in which games and game elements can be used.
Table 1
Elements of GBAs

Element      Definition
Rules        Define how the game responds to the actions (or inaction) of the player
Objectives   Define what the player is meant to be striving for; can be more structured or vague
States       The environment which arises from the connections between rules and objects

Note. Elements from Mislevy et al. (2015)
This is important because these methods function in different ways, and conclusions from research on one method cannot be assumed to generalize to the others. For example, GBAs are distinct from gamification (Cut-E, 2017; Landers,
2015). A commonly cited definition of gamification is “the use of game design elements in nongame contexts” (Deterding, Sicart, Nacke, O’Hara, & Dixon, 2011, p. 1). While there are similarities between game-based assessment and gamification, game-based assessments are administered within the context of a game which has typically been designed to measure knowledge, skills, and abilities. (For a review of gamification see
Hamari, Koivisto, & Sarsa, 2014). GBAs share components (Table 1) with traditional games such as rules, objectives, objects, and states. These shared elements are combined to create GBAs which are used by businesses to make decisions about the selection, development, and hiring of potential employees. How these elements are developed to measure important characteristics must be validated to determine whether the intended characteristics are in fact being measured.
While GBAs have increased in popularity in the workplace, the research into the validity and reliability of these measures has not led to conclusive evidence (Chamorro-
Premuzic, Winsborough, Sherman, & Hogan, 2016; Kim & Shute, 2015; Lowman, 2016).
Without clear evidence of validity, or lack thereof, the research has fallen behind the adoption in organizations. Due to the lack of conclusive evidence, it is important that more research be conducted to understand how GBAs can be used in the workplace. This
study aims to add to the existing research literature by evaluating the use of GBAs in a specific context (i.e., the measurement of vocational interests). This study examined the validity of a GBA that was developed for screening employees and job applicants across a variety of characteristics, including personality and cognitive ability. The current study examined whether this GBA can also be used to assess the vocational interests of
applicants. By evaluating the ability of the GBA to measure this concept, the current
study collected validation evidence for the use of GBA as a measure of vocational
interests.
This type of validation evidence is important because there are currently many companies that develop GBAs, including Arctic Shores, Revelian, Pymetrics, and Knack (Barrett, 2018; Lowman, 2016). These companies have been conducting research and gathering evidence for the useful applications of their GBAs (Barrett, 2018; Collmus, Armstrong, & Landers, 2016; Cut-E, 2017; Montefiori, 2016). As the market for these products increases, it is important that evidence supports their use in the workplace.
Games are not always an appropriate solution, which is why it must be determined when these tools are successful and when they are not. This information is important for the future of scientific inquiry into these assessments as well as necessary for those organizations that currently use GBAs. Organizations may invest a large amount of money into these assessments, which makes it crucial for there to be evidence of the best applications of GBAs. In addition, assessments are meant to measure very specific concepts and have best practices for use. However, organizations do not always follow the cautious standards for use which are dictated by researchers. Therefore, it is important to have scientific research which supports when the use and applications of these new assessments are appropriate to guide their use in the workplace. In order to review the existing literature on GBAs in the workplace, it is important to first discuss the research which has been conducted with GBAs outside of the workplace.
GBA Outside of Business
In modern research into GBAs, one of the early GBAs to be released was
America’s Army, a military training simulation video game which was developed by and for the United States Army (Lowman, 2016). The game provided potential recruits with a look at what was involved in combat scenarios and allowed them to practice skills without physically being at a training center (Nieborg, 2004). The ability for prospective recruits to complete the GBA with only a computer allowed for a wider reach and fewer resources needed to have applicants complete such simulations (Nieborg, 2004). The
GBA functioned as an assessment, training mechanism as well as a recruiting tool for the
Army as it brought increased attention to advertisements (Lowman, 2016). The GBA was reported as a success both for its ability to increase awareness and improve applicant perceptions and for its reduction of costs in the long term (Susca, 2012). The benefits of this GBA mirror the benefits of current GBAs which are used in a variety of organizations. Although this GBA was tailored to the Army's very specific needs, the body of research benefited from the information about its effectiveness.
Reliability and validity evidence is an important part of research, but this information is not always made available for games which are developed and used in the workplace. However, other industries have published more research on GBAs which can be used to gain insights into this new technology. One such industry is the educational sector. Games have been used extensively in education. This has resulted in a larger body of research into GBAs in education than in the workforce (Landers, 2014; DiCerbo,
2014). To inform the current research study, research developments from the educational sector are discussed to further understand how GBAs can be applied to workplace research. GBAs have been researched in education to measure knowledge, skills and abilities (KSAs) such as problem solving (Shute et al, 2016), persistence (DiCerbo, 2014;
Ventura & Shute, 2013), and specific knowledge of a topic (Kim, Almond, & Shute,
2015). GBAs have been the topic of research in education due to the belief that games are inherently motivating (Landers, 2014; Ryan, Rigby & Przybylski, 2006). It is believed
that the motivating aspects of games increase attention of students when learning concepts and completing tasks that are otherwise not perceived as enjoyable by students
(Ryan, Rigby & Przybylski, 2006; Shute et al, 2016). This research into games in education can help inform the research of I/O psychology into possible applications for
GBAs in the workplace. One such aspect from education research which can make an impact on GBAs in the workplace is the importance of GBA design.
Evidence-Centered Design (ECD) is a framework for the development of game- based assessments which is commonly found in the education literature (Shute et al,
2016). By following the framework of ECD, game-based assessment designers increase the validity of assessments by keeping focus on what is meant to be measured at each step of the design process (Shute et al, 2016). ECD has been most commonly included in research literature into the use of game-based assessments in the education sector for assessing knowledge, skills and abilities of students (DiCerbo, 2014; Kim & Shute,
2015). ECD is important because it focuses on creating valid assessments from the beginning, which informs game design. ECD guides GBA designers through a process which answers the questions of what, where, and how we are measuring, and how much we need to measure (Kim, Almond & Shute, 2015). GBAs, such as Physics Playground, which assessed students' nonverbal understanding of physics, have been developed using ECD for use with students (Kim & Shute, 2015). Pretest and posttest measurements suggest that the ECD framework allowed for a valid design and provided support for the increased motivation which stemmed from in-game rewards (Kim & Shute, 2015). A
greater focus on ECD or similar frameworks in the design of game-based assessments for workplace purposes may play a role in establishing the validity and reliability of these tools outside of the educational sector (Shute et al, 2016). Since the focus of the current study is on validation of GBAs, it is important to note that specific strategies and uses of GBAs have been validated in the education sector. Research into ECD in education has shown that thoughtful design and planning is integral to the creation of valid GBAs. Although research into applications in the workforce is limited, it is important to discuss how GBAs are used in organizations. Understanding how GBAs are used in the workplace provides insight into why validation of these measures is integral.
GBA within Businesses
GBAs are used for many different purposes within businesses. GBAs are created for the development and growth of employees (Chamorro-Premuzic, 2016; Lowman, 2016). Employee development GBAs are used to identify where employees are lacking in KSAs and provide employers with information on what areas need to be focused on for the development of the workforce (Barrett, 2018; Lowman, 2016). One example of a
GBA which is used for employee development is from the developer Revelian.
Revelian’s Theme Park Hero requires employees to manage a theme park while the game measures participants’ problem-solving ability, attention, spatial aptitude, and numerical processing ability. Following gameplay, the employer is given information about how the participant scored and in which aspects they did well. Another example of GBAs
in the workplace is for the training of new hires. GBAs used for training allow for the employees to practice important skills in the game environment. An example of this type of GBA, as previously mentioned, was called America’s Army. This GBA was used in the training of newly recruited soldiers before entering combat training (Lowman, 2016). The
GBA involved recruits playing simulations of missions similar to those they might experience in the field. A final application of GBAs is in the selection and hiring of new employees. These GBAs may measure psychometric variables, such as personality and cognitive ability, which have a long history of use in selection procedures
(Chamorro-Premuzic, 2016; Lowman, 2016). These GBAs are meant to determine the characteristics of applicants which can be used to estimate which candidates are the best for a job position. One example of GBAs for selection is from the company Pymetrics.
Pymetrics markets talent identification game-based assessments to employers (Chamorro-
Premuzic, 2016). Pymetrics assessments are tailored to the specific position and measure cognitive abilities and personality characteristics. Another company, Arctic Shores, develops GBAs for recruitment and selection. Its current offering includes three GBAs (Cosmic Cadet, Yellow Hook Reef, and Skyrise City) which measure psychometric variables ranging from personality and cognitive ability to resilience. These GBAs are accessed exclusively using mobile devices and are downloaded through the GooglePlay or Apple App Store. These are just a subset of the wide range of GBAs available for use in the workplace. With testimonial case studies on the merits of their products, such as a 100% increase in hire yield and “record diversity”
levels, GBA developers provide optimistic views on the benefits to be gained by using their assessments. Despite the promising reports coming from these applications of GBAs, it is imperative that peer-reviewed research further explore these applications in controlled studies.
The provided examples demonstrate the increasing application of GBAs and other game-based methods in the workforce. However, as mentioned, there is a critical need for more scientific evidence on the validity of using game-based methods for high-stakes procedures. To date, little evidence has been gathered on the valid application of GBAs for high-stakes decisions such as selection and promotion decisions. This study seeks to explore this question and to address this gap in research. Such research is necessary because organizations need to use practices with foundations in scientific evidence which are valid and legally defensible. This is critical for selection because employment decisions in the United States are subject to the Uniform Guidelines created by the Office of Personnel Management (OPM). The Uniform Guidelines outline the required validity evidence which is admissible for showing that a selection procedure is appropriate and does not discriminate (OPM, 1979). Following these guidelines is imperative for the ability to defend selection procedures in the event of litigation. The Uniform Guidelines from OPM are reflected in a set of principles published by the Society for
Industrial/Organizational Psychology (SIOP). These principles are used for the validation of selection procedures. They state that three types of validation evidence are acceptable to use for selection tools: “evidence of validity based on
relationships with measures of other variables, evidence based on content, and evidence based on the internal structure of the selection procedure” (Society for Industrial and
Organizational Psychology, 2003, p. 13). These guidelines emphasize the need for validation of selection measures and highlight the increased potential for litigation; not following them creates a high potential for litigation and loss. The high risk of litigation for organizations makes validation of the many game-based assessments on the market crucial. While there is some research evidence of the validity of GBAs in education and of specific GBAs in the workplace, there is not currently a robust literature of validation evidence for GBAs. The current study aims to be a part of the greater body of research which is needed to minimize the risk of litigation in the high-stakes avenues of selection and hiring.
GBA and Psychometric Measures
While the general characteristics being measured using GBAs were discussed briefly above, some of the more common variables currently measured using GBAs are discussed below in more detail. Many of the attributes currently measured through GBAs have long been measured in selection contexts.
However, this new method of measuring these characteristics has strengths and weaknesses when compared to the older measures.
Cognitive Ability Tests. GBAs have been designed to be comparable to many of the paper-and-pencil tests for psychometric constructs including cognitive ability, persistence, and personality measures (DiCerbo, Shute, & Kim, 2016; Kim & Shute, 2015; Montefiori, 2016). Cognitive ability tests (also referred to as general mental ability or intelligence tests) are a staple of personnel selection measures and have been found to have predictive validity with job performance at a .5 correlation (Hunter & Hunter, 1984;
Schmidt & Hunter, 1998). Research indicates these measures are consistent and valid; however, they also come with consistent drawbacks. Despite the strong predictive validity, cognitive ability tests have long been known to be subject to inequality among racial and ethnic minority test takers (Hunter & Hunter, 1984; Meyer, 2014). This can lead to a phenomenon known as adverse impact, which is defined in the Uniform Guidelines as “a selection rate for any race, sex, or ethnic group which is less than four-fifths (or eighty percent) of the rate for the group with the highest rate” (OPM, 1979, p. 9).
Cognitive ability tests can lead to adverse impact by providing disproportionately higher scores for majority members than minorities (Hunter & Hunter, 1984). Despite the high predictive validity with job performance, cognitive ability tests produce high levels of adverse impact and lead to under-representation of minorities. This disproportionate treatment is not only an ethical dilemma but is also illegal. Organizations with adverse impact due to cognitive ability measures are likely to face legal ramifications and financial loss. However, GBAs may provide some alleviation to this issue with cognitive ability.
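To make the four-fifths rule quoted above concrete, the check can be sketched in a few lines. The group labels and selection rates below are hypothetical, for illustration only:

```python
def adverse_impact_flags(selection_rates):
    """Flag groups whose selection rate falls below four-fifths (80%)
    of the rate of the group with the highest selection rate."""
    highest = max(selection_rates.values())
    return {group: rate / highest < 0.8
            for group, rate in selection_rates.items()}

# Hypothetical rates: group B is selected at half the rate of group A,
# so B is flagged for potential adverse impact.
rates = {"A": 0.60, "B": 0.30}
print(adverse_impact_flags(rates))  # {'A': False, 'B': True}
```

A flagged group does not by itself prove discrimination, but under the Uniform Guidelines it triggers the requirement to provide validity evidence for the selection procedure.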
GBAs provide a unique benefit that may help mitigate the issues raised by using cognitive ability assessments. GBAs of cognitive ability have the potential to minimize the adverse impact of traditional cognitive ability measures due to the way they
determine applicant characteristics (Chamorro-Premuzic, 2016; Montefiori, 2016). These GBAs measure cognitive ability through behavior rather than through test questions, which are susceptible to inequality. GBAs can also measure facets of intelligence such as reaction time and spatial ability (DiCerbo, 2014; Jaffal & Wloka,
2015). These measures of intelligence are less susceptible to adverse impact because they do not rely as heavily on reading level and vocabulary. Behaviorally based measurements of cognitive ability through GBAs may not only provide estimates of a factor which is highly correlated with performance but do so in a way which also minimizes adverse impact. This possibility highlights a benefit of the new technology and provides further reasoning for an increased focus on researching GBAs in the workplace. In addition, GBAs are able to measure personality traits in an indirect manner, which may mitigate some of the drawbacks of such assessments as well.
Personality Tests. Personality tests are a very common part of selection batteries
(Chamorro-Premuzic, 2016). The Big Five personality inventory (i.e., openness to experience, extraversion, agreeableness, conscientiousness, and emotional stability) is a personality test which is often used in recruitment and selection (Barrick & Mount, 1995). The Big Five personality test is often used in selection contexts because research has found conscientiousness to be a significant predictor of job performance (Barrick & Mount, 1995; Ventura & Shute, 2013). However, like cognitive ability, personality measures are subject to a drawback in the traditional measurement. Due to the self-report
nature of the traditional measures, personality inventories are susceptible to participants faking “good” to appeal to the employer (Montefiori, 2016; Lowman, 2016). A number of meta-analyses have concluded that self-report surveys are particularly susceptible to candidate faking (Viswesvaran & Ones, 1999). This faking reduces the reliability and validity of selection measures and therefore limits their ability to accurately predict performance. GBAs may help mitigate this issue through the measurement of personality via behavioral actions during the game. This minimizes the ability of applicants to respond to questions in a socially desirable manner (Ryan & Ployhart, 2014). In a recent study, participants were instructed to try to create the illusion of an assigned personality when completing a GBA. Participants had difficulty mimicking the qualities of the assigned personality through gameplay behaviors (Montefiori, 2016). This research suggests that GBAs may play a role in reducing faking in selection assessments. In addition to reducing faking, the personality measures contained inside GBAs do not involve asking participants direct questions, which may make participants feel more at ease with the personality inventories.
Applicant Reactions
Applicant reactions to selection measures can have an effect on the predictability and reliability of the tests. Although they are among the most common selection tools, cognitive ability tests are perceived unfavorably by applicants (Oostrom & De Soete, 2016). One reason for unfavorable applicant reactions is the length of traditional assessments. Long surveys can lead to applicants experiencing survey fatigue, in which applicants become unhappy and fatigued by the survey itself.
Survey fatigue can reduce the reliability and predictability of the selection measures. In a study of game-based assessments compared to traditional self-report measures, participants indicated that they perceived the GBA as better than self-report measures. In addition, participants considered companies who used the GBA more innovative than those which used traditional self-report measures (Montefiori, 2016).
Current Study
Validation is a key part in the process of developing a reliable and accurate selection tool (Schmidt & Hunter, 1998; Ventura & Shute, 2013). Practitioners need research to show whether these new tools will be able to provide the information claimed.
Such proof is vital and will open GBAs to more widespread adoption. In order for businesses and applicants to gain the benefits of GBAs over traditional measures, there must first be evidence that these benefits are valid and testable. This becomes more important as the responsibilities and power of the positions being selected for increase. The current study focused on the validation of a GBA called Skyrise City™. This GBA was evaluated for use as a measure of vocational interests based on Holland’s (1959, 1985) theory of vocational interests. This study evaluated the validity of a new technological measure of the RIASEC interests by comparing the GBA results to those of established measures. In addition, the degree to which positive outcomes associated with job congruence could be predicted using this new measure was also evaluated.
Vocational Interests and Congruence
Applicants may be screened for how well they fit the job they are applying for.
This fit, also referred to as congruence, is an indication of how closely an individual is suited to a potential position based on factors such as personality, vocational interests and skills. Vocational interests are the preferred types of work for an individual. Vocational interests can be measured using a set of codes which were developed by Holland (1959).
Holland’s codes are comprised of six different personality types: Realistic, Investigative,
Artistic, Social, Enterprising and Conventional (commonly referred to as RIASEC for short) (Holland, 1959, 1985; Nauta, 2010). Each individual is assigned a two- or three-letter code which corresponds to their strongest vocational preferences. The RIASEC types have been found to generalize among a variety of different groups including high school students, college students, and working adults (Holland, 1985; Nauta, 2010). When this code is compared to the code of the participant’s occupation, the RIASEC scores are used as a measure of vocational congruence (Hutchinson, 2014). Vocational congruence has been linked to several work outcomes such as job satisfaction (Holland, 1985; Marcus &
Wagner, 2015; Nauta, 2010), performance (Nauta, 2010; Spokane, Meir, & Catalano) and turnover intention (Holland, 1985; Tracey, 2000).
Validation. As discussed previously, the goal of the current study was to evaluate the effectiveness of a game-based assessment in indicating an individual’s Holland code without directly asking a series of questions as other RIASEC measures do (Armstrong, Allison & Rounds, 2008; Holland, 1985). In order to test the validity of this new method, three methods of measuring Holland’s codes were compared using a multitrait-multimethod matrix (MTMM) (Campbell & Fiske, 1959). The three methods were the GBA and the Interest Item Pool Form A and Form B (Armstrong et al, 2008). The MTMM allows for the evaluation of construct validity through convergent and divergent validity (Campbell & Fiske, 1959). The MTMM matrix was chosen to evaluate validity in lieu of other analyses (e.g., CFA, factor analysis) because gameplay behaviors, rather than the traditional items needed to conduct those other methods, served as the measurement. Although the MTMM matrix is used less often in recent literature, the method provides a robust evaluation of convergent and divergent validity
(Campbell & Fiske, 1959). Recent research has found evidence for the continued reliability of using MTMM for analyzing validity for assessments which are used in human resource management (Hamdani, Valcea & Buckley, 2016).
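As an illustration of how an MTMM matrix is assembled, the sketch below builds one from simulated scores for two traits measured by two methods. The trait names, sample size, and noise levels are invented for demonstration and are not drawn from this study:

```python
import numpy as np

# Simulated participants: two hypothetical traits (Realistic, Artistic),
# each measured by two methods (GBA, survey). In an MTMM matrix,
# same-trait/different-method correlations indicate convergent validity;
# different-trait correlations indicate divergent (discriminant) validity.
rng = np.random.default_rng(0)
n = 200
realistic = rng.normal(size=n)
artistic = rng.normal(size=n)
scores = np.column_stack([
    realistic + rng.normal(scale=0.5, size=n),  # Realistic via GBA
    artistic + rng.normal(scale=0.5, size=n),   # Artistic via GBA
    realistic + rng.normal(scale=0.5, size=n),  # Realistic via survey
    artistic + rng.normal(scale=0.5, size=n),   # Artistic via survey
])
mtmm = np.corrcoef(scores, rowvar=False)

convergent = mtmm[0, 2]  # Realistic: GBA vs. survey (expected to be high)
divergent = mtmm[0, 1]   # Realistic vs. Artistic, same method (expected low)
print(round(convergent, 2), round(divergent, 2))
```

In Campbell and Fiske's (1959) terms, construct validity is supported when the convergent (same-trait, cross-method) correlations clearly exceed the divergent (cross-trait) correlations, which is the pattern the full matrix is inspected for.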
Hypothesis 1: On average, the RIASEC code derived from the game-based assessment will be congruent with the scores from the paper-and-pencil assessments.
Congruence. The level of congruence between the vocational interests of an individual and an occupation adds a level of depth beyond the individual RIASEC code.
Holland (1985) asserted that those with different combinations of RIASEC interest codes, or types, require different environments which match their skills, personality, and competencies. One such way of seeking out congruent environments is through career choices. Because of this, it is believed that individuals will seek out careers in positions which match their interests when possible (Holland, 1959, 1985). While not all available positions will be completely congruent with an individual’s vocational interests, there is a degree of congruence between the job characteristics and the individual’s vocational interests. This level of congruence can be calculated using a measurement known as the C Index (Brown & Gore, 1994; Eggerth & Andrew, 2006). The C Index is calculated using the three highest Holland codes for an individual and an occupation. By weighting the codes based on their order, the C Index creates a congruence value which ranges from 0 (no congruence) to 18 (full congruence) (Eggerth & Andrew, 2006). In the present study, the C Index was used to assess the level of congruence between participants and their current occupation. For students, the level of congruence between participants and their preferred job was analyzed to account for the possible distance between vocational interest and job availability for the university students in our sample.
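The C Index computation described above can be sketched as follows. The position weights (3, 2, 1) and the 0–18 range follow the description above; the hexagonal letter-agreement scoring (3 for identical types, 2 for adjacent, 1 for alternate, 0 for opposite) is an assumption based on the common presentation of Brown and Gore's (1994) formula:

```python
HEXAGON = "RIASEC"  # Holland's hexagonal ordering of the six types

def letter_score(a, b):
    """Agreement between two RIASEC letters (assumed hexagonal scoring):
    3 = identical, 2 = adjacent, 1 = alternate, 0 = opposite."""
    d = abs(HEXAGON.index(a) - HEXAGON.index(b))
    return 3 - min(d, 6 - d)

def c_index(person_code, occupation_code):
    """Compare two three-letter codes position by position, weighting
    the first letter most heavily (3, 2, 1); yields a value from 0 to 18."""
    return sum(w * letter_score(p, o)
               for w, p, o in zip((3, 2, 1), person_code, occupation_code))

print(c_index("RIA", "RIA"))  # identical codes -> 18 (full congruence)
print(c_index("RIA", "SEC"))  # opposite type at every position -> 0
```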
Hypothesis 2: On average, the RIASEC code derived from the game-based
assessment will be congruent with the code of the participants’ occupation.
Hypothesis 2a: If the participant is not in their preferred role, the RIASEC code derived from the game-based assessment will be congruent with the code of the participant’s preferred occupation.
Work-Related Outcomes. As discussed previously, the level of congruence can have an impact on work outcomes such as job satisfaction, turnover intention, and job performance (Holland, 1985; Nauta, 2010; Spokane, Meir, & Catalano, 2000). However, the existing research which found relationships between congruence and outcomes was conducted using traditional measures of the Holland codes. In order to compare the validity of the GBA-derived RIASEC codes and traditional measures, it is also important to see whether the congruence level from the new measure predicts these work outcomes. In the current study, the relationship between congruence (as measured through the C index) and participants' job satisfaction, job fit, and turnover intention was investigated.
Hypothesis 3: On average, higher levels of congruence will be related to positive
work outcomes.
Hypothesis 3a: On average, higher levels of congruence will be positively related
to job satisfaction.
Hypothesis 3b: On average, higher levels of congruence will be positively related
to job fit.
Hypothesis 3c: On average, higher levels of congruence will be negatively related
to turnover intention.
Method
Materials and Apparatus
As part of the study, participants played the game-based assessment Skyrise City™, which required a mobile device. In order to take part in the study, participants downloaded the game from the Apple App Store or the Google Play store onto their mobile smartphone or tablet. Android devices required Android software version 4.1 or later; Apple devices required iOS 9.0 or later. All other survey measures were completed online through survey software.
Procedures
A within-subjects study was conducted with participants from a university and an online sample. Participants first completed an online consent form, then downloaded the mobile application or were directed to a mobile device on which they could play the game in the research lab. Participants were given an anonymized player key required to start the program. Participants played the game-based assessment, which took approximately 50 minutes to complete. Once the assessment was finished, participants completed the online survey, which included demographic questions and the measures described in the following section.
Measures
Demographics. Participants were asked to indicate their gender and ethnicity.
Participants were asked if they were students and if they were currently employed.
Students were asked their year of study (e.g., freshman) and major at the time of the study. Students were also asked why they chose their major.
Vocational Interests
Game-Based Assessment. A mobile game-based assessment created by Arctic
Shores was used to measure vocational interests. The game was developed to measure many facets of personality by recording paradata from participants as they play a game on a mobile smartphone or tablet. Vocational interests were gathered from the game-based assessment Skyrise City™. Through gameplay, the GBA gathered personality information about participants and produced an estimate of the participants' vocational interests based on Holland's (1985) model. The GBA consisted of 8 levels which involved completing a variety of tasks based on established research paradigms from many subdisciplines of psychology. Participants progressed through levels and gained rewards for completing them. The GBA took approximately 40 minutes to an hour for participants to complete.
Interest Item Pool. The Interest Item Pool (IIP; Armstrong et al., 2008) is a shortened version of the O*NET Interest Profiler (Lewis & Rivkin, 1999). The IIP was used to measure Holland's (1985) RIASEC interests. The IIP consists of 40 work activities; participants indicate how they would feel about performing each activity on a 7-point Likert-type scale ranging from 1 (Dislike a great deal) to 7 (Like a great deal). The IIP produces a mean score for each of the Holland code areas (Realistic, Investigative, Artistic, Social, Enterprising, and Conventional). The three largest of these means are combined in descending order to determine the RIASEC code for an individual (e.g., RIA). The alphas for each of the Holland codes ranged from .85 to .93.
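As a concrete illustration of this scoring rule (not code from the thesis), the step from six scale means to a three-letter code can be sketched in a few lines of Python; the mean values below are invented for illustration, not study data:

```python
# Mean IIP scores per Holland area (illustrative values only).
means = {"R": 3.2, "I": 5.8, "A": 5.1, "S": 4.4, "E": 2.9, "C": 3.0}

# Sort the six areas by mean score, descending, and keep the top three
# letters to form the individual's three-letter RIASEC code.
code = "".join(sorted(means, key=means.get, reverse=True)[:3])
print(code)  # -> IAS
```

Note that tied means would require a tie-breaking rule, which the scoring description above does not specify.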
Congruence
C index. The RIASEC codes derived from the different measures were compared using Brown and Gore's (1994) C index. The C index has a possible range of 0 (none) to 18 (perfect) (Brown & Gore, 1994). The C index evaluates the similarity of RIASEC profile codes based on the hexagon structure outlined in Holland's (1985) theory of vocational interests. The C index formula is C = 3(x1) + 2(x2) + 1(x3), in which each x is equal to 3, 2, 1, or 0 based on the distance between the corresponding pair of codes in the hexagon. A 3 is assigned if the codes are the same, 2 for adjacent types, 1 for alternate types, and 0 for opposite types. For example, an ISA individual in an SAE job environment would have the formula C = 3(1) + 2(2) + 1(1), for a score of 8. Holland recommended the C index because it considers the order of codes and is consistent with the theory (Holland, 1985).
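The computation just described is mechanical enough to express directly. The sketch below (our own illustration, assuming Holland's R-I-A-S-E-C hexagon ordering and three-letter codes; the helper names are ours) reproduces the worked example from the text:

```python
RIASEC = "RIASEC"  # hexagon ordering of Holland's six types

def position_score(a: str, b: str) -> int:
    """Score one pair of letters by hexagon distance:
    3 = identical, 2 = adjacent, 1 = alternate, 0 = opposite."""
    d = abs(RIASEC.index(a) - RIASEC.index(b))
    return 3 - min(d, 6 - d)  # wrap the distance around the hexagon

def c_index(person: str, environment: str) -> int:
    """Brown and Gore's (1994) C = 3(x1) + 2(x2) + 1(x3), range 0-18."""
    return sum(w * position_score(p, e)
               for w, p, e in zip((3, 2, 1), person, environment))

# Worked example from the text: an ISA person in an SAE environment.
print(c_index("ISA", "SAE"))  # -> 8
```

A person and environment with identical codes score 3(3) + 2(3) + 1(3) = 18; three fully opposite pairs score 0.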
Outcomes
Job Satisfaction. Job satisfaction was measured with a single item, "All in all, I am satisfied with my job," answered on a 5-point Likert-type scale ranging from 1 (Strongly Disagree) to 5 (Strongly Agree). In accordance with Wanous (1997), a one-item job satisfaction measure was deemed sufficient for measuring this concept.
Job Fit. Job fit was measured using a person-job fit scale from Brkich, Jeffs, and Carless (2002). The items were measured on a 5-point scale ranging from 1 (Strongly Disagree) to 5 (Strongly Agree). An example item is "I find my current job motivating".
The alpha reliability for this scale was .92.
Job Calling. The Calling and Vocation Questionnaire (CVQ) was used to measure the amount of calling participants felt toward their current career, or their career as a whole if they were not employed in a job they felt was part of their career (such as university students working part-time). The scale from Dik, Eldridge, Steger, and Duffy (2012) consists of 24 questions responded to on a 4-point scale ranging from 1 (Not at all true of me) to 4 (Absolutely true of me). An example item is "My career is an important part of my life's meaning". The alpha for this scale was .94.
Turnover Intention. Turnover intention was measured using a 3-item withdrawal intention scale from Cohen (1998), which was developed around the Mobley, Horner, and Hollingsworth (1977) model of employee turnover. The items measured whether participants were thinking of quitting, intending to search, and intending to quit, and were measured on a 5-point scale ranging from 1 (Strongly Disagree) to 5 (Strongly Agree). An example item from this scale is "I think a lot about leaving the organization". The alpha for this scale was .92.
Job Type. Participants were asked to indicate their current job title. If their job title was not within their field or career objectives, they were asked to also indicate their preferred job title. Participants were given categories and subsequent job titles with known RIASEC codes, so that by selecting a job title, participants indicated which type of vocation they work in according to the Holland codes.
Video Game Experience (VGE). The Video Game Experience scale from Sanchez, Langer, and Herring (2018) was used to measure how much experience participants had with video games. This was used to evaluate whether level of experience with video games had an effect on the results of the game-based assessment. The scale is measured on a 5-point Likert-type scale ranging from 1 (Strongly Disagree) to 5 (Strongly Agree). An example item from this scale was "I spend many hours each week playing video games". The alpha reliability for this scale was .95.
Results
Participants
Participants were recruited from a pool of students at a Western public university in the United States. After permission was obtained from the Institutional Review Board, 36 students (22.2% male, 77.8% female) participated in the study. The mean age of these student participants was 26 years (SD = 8 years). Of these students, 3.6% reported being enrolled as Freshmen, 6.4% as Sophomores, 14% as Juniors, 14.3% as Seniors, and 20.7% as Graduate Students; 27.8% self-identified as Asian, 27.8% as Hispanic or Latino, and 33.3% as White/Caucasian. Due to the slow recruitment of student participants, a second sample was recruited from an online research sourcing tool, Prolific. Through this recruitment platform, 104 individuals (51.9% male, 45.2% female) participated in the study. The mean age of these online participants was 30 years (SD = 10 years), with a range from 18 to 61 years. Of these participants, 2.9% self-identified as African American, 8.7% as Hispanic or Latino, 12.5% as Asian, and 69.2% as White/Caucasian. A summary of participant distribution for both samples is provided in Table 2.
Table 2
Demographic Characteristics

Variable                              Total (n = 140)   University (n = 36)   Prolific (n = 104)
Gender
  Female                              75 (53.6%)        28 (77.8%)            47 (45.2%)
  Male                                62 (44.3%)         8 (22.2%)            54 (51.9%)
  Other                                2 (1.4%)          0 (0.0%)              2 (1.9%)
  Unknown                              1 (0.7%)          0 (0.0%)              1 (1.0%)
Employment Status
  Employed Full-Time                  55 (39.3%)         1 (2.8%)             54 (51.9%)
  Employed Part-Time                  24 (17.1%)         3 (8.3%)             21 (20.2%)
  University Student, Not Employed    30 (21.4%)        18 (50.0%)            12 (11.5%)
  University Student, Employed        31 (22.1%)        14 (38.9%)            17 (16.3%)
Ethnicity
  Asian                               23 (16.4%)        10 (27.8%)            13 (12.5%)
  Black or African American            3 (2.1%)          0 (0.0%)              3 (2.9%)
  Hispanic or Latino/a                19 (13.6%)        10 (27.8%)             9 (8.7%)
  Native Hawaiian/Pacific Islander     1 (0.7%)          1 (2.8%)              0 (0.0%)
  Two or More Races                    8 (5.7%)          3 (8.3%)              5 (4.8%)
  White or Caucasian                  84 (60.0%)        12 (33.3%)            72 (69.2%)
  Other/Non-Disclosed                  2 (1.4%)          0 (0.0%)              2 (1.9%)
Age, M (SD)                           28.4 (9.7)        25.5 (8.1)            30.0 (10.0)

Study Correlations
Before testing the hypotheses, we reviewed the correlations between the continuous demographic variables (i.e., age and video game experience) and scores on the outcome variables examined in the study. Due to the large number of variables in this study (e.g., six variables for each of the vocational interest measures), correlations between variables measured for specific hypotheses are presented in tables later in the Results. For the first demographic measure, age was significantly related to turnover intention (r = -.20, p = .02), job fit (r = .25, p = .003), and job satisfaction (r = .24, p = .004), demonstrating that in our sample, older participants tended to report less intention to leave their current job, better fit with their current job, and higher satisfaction with their current job. This makes sense given that most individuals tend to move toward their ideal career as they grow older. Being closer to their ideal career would reasonably be associated with experiencing positive reactions to their job (i.e., fit and satisfaction) and feeling less inclined to leave that position.
For the second demographic measure, video game experience had a significant relationship with one variable, job calling (r = -.18, p = .03). Thus, a participant's video game experience was related to the sense of calling felt for their current position, with people with more video game experience reporting lower levels of job calling. One possible explanation for this finding is that video games are often viewed as a leisure activity. Those who are at a stage in their life where they do not perceive themselves to be in their career of choice (i.e., lower sense of calling toward their current career) may also have more time for leisure activities (e.g., students, part-time workers). Although the same could be said for those nearing the end of their career, such as individuals facing retirement, there was not a large representation of late-career individuals in the study samples.
Table 3
Correlations of Demographics and Work Outcomes (n = 140)

Variables                     1      2      3      4      5      6
1. Age                        -
2. Video Game Experience    -.12     -
3. Turnover Intention       -.20*   .11     -
4. Job Fit                   .25** -.10   -.78**   -
5. Job Calling              -.03   -.18*   .02    .18*    -
6. Job Satisfaction          .24** -.07   -.67**  .88**  .21*    -
7. GBA RIASEC Code           .06   -.16   -.01   -.03   -.07    .02

* Correlation is significant at the 0.05 level (2-tailed). ** Correlation is significant at the 0.01 level (2-tailed). GBA = game-based assessment; RIASEC = vocational interest code (realistic, investigative, artistic, social, enterprising, conventional).
There were also several significant relationships between the study outcome variables, with job fit being significantly related to turnover intention (r = -.78, p < .001), job calling (r = .18, p = .03), and job satisfaction (r = .88, p < .001), and job satisfaction further being significantly related to turnover intention (r = -.67, p < .001) and job calling (r = .21, p = .01). While these correlations were significant, the relationships were in different directions and of different strengths, indicating that these outcomes are likely not measuring the same construct. These findings are provided in Table 3.
Hypotheses Testing
Hypothesis 1 stated that the game-based assessment would produce RIASEC codes similar to those produced by the self-report measures. Findings demonstrated that the game-based format of the vocational interest measure did not converge with either self-report version. To evaluate the validity of the GBA as a measure of vocational interest, convergent and discriminant validity was analyzed using a MTMM matrix. The
MTMM matrix, shown in Table 4, displays the correlations for the six Holland codes (R,
I, A, S, E, C) across the three measures of vocational interests. As seen in Table 4, none of the scores for the codes converged with self-report 1 (IIP-A) or self-report 2 (IIP-B), respectively: Realistic (r = .08; r = .06), Investigative (r = .08; r = .15), Artistic (r = .01; r = -.01), Social (r = .10; r = -.06), Enterprising (r = .01; r = -.04), and Conventional (r = .13; r = -.04). Discriminant validity (mono-trait correlations greater than hetero-trait correlations) was also not found for the GBA. In contrast to our expectation, the two significant correlations found between the GBA method and the traditional self-report 1 method were between different traits, R-C (r = .19, p = .03) and S-A (r = .23, p = .005). Thus, no evidence was found that the GBA form of vocational interests functions similarly to the self-report measures of vocational interest.
Table 4
Multitrait-Multimethod Matrix for Vocational Interest Measures (n = 140)

             Game-Based (GBA)                 Self-Report 1 (IIP-A)              Self-Report 2 (IIP-B)
         R    I    A    S    E    C       R    I    A    S    E    C        R    I    A    S    E    C
GBA
  R      -
  I     .11   -
  A    -.10  .12   -
  S     .24  .16  .43   -
  E     .00  .01  .00  .17   -
  C     .09  .10 -.26 -.15  .11   -
SR 1
  R     .08  .04 -.07 -.04  .11  .03
  I    -.09  .08  .02  .06 -.03 -.04    .14 (.88)
  A    -.03  .04  .01  .23 -.04 -.02    .17  .51 (.85)
  S    -.05 -.11 -.09  .10  .08  .06    .08  .37  .29 (.87)
  E    -.08  .12 -.06 -.01  .01  .08    .19  .25  .34  .32 (.87)
  C     .19  .11 -.14 -.07  .04  .13    .32  .06 -.04  .13  .40
SR 2
  R     .06  .11 -.14 -.13  .04  .08    .75  .15  .18  .06  .40  .38    (.89)
  I    -.14  .15  .05  .06 -.01  .05    .15  .73  .34  .35  .42  .19     .20 (.89)
  A    -.14  .07 -.01  .16 -.10  .03    .19  .43  .84  .20  .34 -.03     .23  .39 (.85)
  S     .00 -.03 -.06  .06  .13  .01    .05  .27  .23  .83  .26  .07     .10  .31  .19 (.90)
  E    -.12  .08  .03  .01  .04  .05    .18  .25  .35  .27  .78  .38     .38  .33  .33  .26 (.87)
  C     .08  .08 -.12 -.10  .04  .12    .24  .02 -.09  .08  .33  .86     .31  .21 -.05  .05  .39 (.93)

Note: Correlations at or above r = .17 are significant at .05; at or above r = .23, at .01. Convergent validity values (same trait measured by different methods) lie on the diagonals of the off-diagonal method blocks. Cronbach's alphas for the scales are in parentheses along the main diagonal. R = Realistic, I = Investigative, A = Artistic, S = Social, E = Enterprising, C = Conventional.
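To make the discriminant-validity criterion concrete: each convergent (mono-trait, hetero-method) correlation should exceed the hetero-trait, hetero-method correlations involving either trait. A minimal Python sketch using the GBA/IIP-A values reported in the text (the helper name is ours, not part of the thesis analyses):

```python
# Convergent validities for GBA vs. IIP-A (same trait, different method),
# as reported in the text.
convergent = {"R": .08, "I": .08, "A": .01, "S": .10, "E": .01, "C": .13}

# The two significant hetero-trait, hetero-method correlations reported.
heterotrait = {("R", "C"): .19, ("S", "A"): .23}

def discriminant_ok(mono, hetero):
    """Discriminant validity holds when every convergent correlation
    exceeds the cross-trait correlations involving either trait."""
    return all(mono[a] > r and mono[b] > r for (a, b), r in hetero.items())

print(discriminant_ok(convergent, heterotrait))  # -> False
```

The check fails here because, for example, the R-C cross-trait correlation (.19) exceeds the convergent correlation for R (.08), mirroring the conclusion in the text.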
Hypothesis 1 was further evaluated for congruence (i.e., how similar the RIASEC codes were between the GBA and self-report scales) by analyzing the level of congruency between the codes derived from each measure. Congruence between the RIASEC codes derived from each measure was evaluated using Brown and Gore's (1994) C index. Means and standard deviations for the C index are reported in Table 5. The range for the congruence between the GBA and both self-report measures was 0-18. This range indicates that for some participants the GBA and the self-report scales generated the exact same RIASEC code, which would result in a C index score of 18. The level of congruence between the GBA-derived RIASEC codes and the self-report scales was moderate (mean greater than 6 but less than 13), with the GBA showing higher congruence with the IIP-A (M = 9.07, SD = 3.64) than the IIP-B (M = 8.68, SD = 3.93). However, this difference was non-significant (t(139) = 1.11, p = .27). The two self-report scales were more congruent with each other than with the GBA, but the congruence was not above the baseline (M of 13 or more) for high congruence (M = 11.70, SD = 4.25). This evaluation of congruence was consistent with the findings demonstrated by the MTMM; the moderate mean level of congruence between the GBA and self-report scales is not sufficient to establish the construct validity of the new measure. Hypothesis 1 was not supported, as the MTMM provided no evidence of construct validity between the GBA and traditional self-report RIASEC measures and the GBA did not produce scores similar to the self-report measures.
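The t(139) comparison above is a standard paired-samples t test on per-participant congruence scores. A stdlib-only sketch (the function is our illustration; the congruence scores below are invented, not the study data):

```python
import math

def paired_t(x, y):
    """Paired-samples t statistic: mean difference over its standard error."""
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    mean_d = sum(d) / n
    var_d = sum((v - mean_d) ** 2 for v in d) / (n - 1)  # sample variance
    return mean_d / math.sqrt(var_d / n)

# Hypothetical per-participant C index scores (0-18) for GBA vs. IIP-A
# and GBA vs. IIP-B (illustrative only).
c_iipa = [10, 12, 8, 9, 14, 7]
c_iipb = [9, 11, 8, 10, 12, 7]

t = paired_t(c_iipa, c_iipb)  # compare against a t distribution with n-1 df
```

The resulting statistic would be evaluated against a t distribution with n - 1 degrees of freedom (139 in the study).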
Table 5
Congruence of RIASEC Codes (n = 140)
Measures M SD Range
C index
GBA-IIP-A 9.19 3.58 (0-18)
GBA-IIP-B 9.10 3.70 (0-18)
IIP-A-IIP-B 9.88 4.09 (0-18)
Note: C index scores (<6 = low, 6-13 = moderate, 13+ = high congruence). GBA = Game-based assessment, IIP-A = Interest Item Pool Version A, IIP-B = Interest Item Pool Version B (Armstrong et al., 2008)
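Under one reading of the low/moderate/high cutoffs used throughout these results (below 6 = low, 6 up to 13 = moderate, 13 and above = high; the exact boundary handling is our assumption, not stated in the thesis), classifying a C index score is a one-liner:

```python
def congruence_level(c: int) -> str:
    """Classify a C index score (0-18) by the reported cutoffs.
    Counting 13 itself as 'high' is our assumption."""
    return "low" if c < 6 else "moderate" if c < 13 else "high"

print(congruence_level(9))  # -> moderate
```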
Hypothesis 2 stated that the GBA would produce RIASEC codes for individuals which were congruent with the participants' current occupation. Congruence was again measured using the C index to determine the level of similarity between the RIASEC code for the individual and the code for the current job type. Findings demonstrated that the GBA and self-report measures produced results which differed from the participants' current occupations to a similar degree. Means and standard deviations for the congruence scores between each of the measures of vocational interest are included in Table 6. The average C index for the GBA and job was again moderate (M = 9.10, SD = 3.79). Similar means were found for the self-report measures and the job codes: IIP-A (M = 9.19, SD = 3.58), IIP-B (M = 9.88, SD = 4.09). Although the measures were not congruent to the extent that was predicted (i.e., all measures of congruence were below the established baseline of M = 13), on average all three scales moderately matched the RIASEC job codes. This indicates that all three scales were able to predict the RIASEC codes of the participants' jobs to generally the same extent. The results demonstrate partial support for Hypothesis 2, as there was a moderate level of congruence between the GBA code and job code. In addition, the differences between the GBA vocational interest codes and job interest codes were similar across all the measures of vocational interest. The mean level of congruence between the GBA and job codes was not significantly different from the job congruence of the IIP-A (t(139) = .195, p = .85) or the IIP-B (t(139) = -1.51, p = .13).
Table 6
Congruence of Individual and Occupational Codes (n = 140)
Measures M SD Range
C index
Job-GBA 9.19 3.58 (0-18)
Job-IIP-A 9.10 3.70 (0-17)
Job-IIP-B 9.88 4.09 (0-18)
Note: GBA = Game-based assessment, IIP-A = Interest Item Pool Version A, IIP-B = Interest Item Pool Version B (Armstrong et al., 2008)
Hypothesis 3 stated that greater congruence between individual RIASEC codes and job codes would be significantly related to work outcomes. The expectation was that individuals whose current jobs more closely aligned with the assessments' indication of their ideal job would report more positive work experiences. Findings indicated that the level of congruence between the GBA and job codes was not related to greater levels of positive work outcomes. Across the vocational interest measures, there was no evidence that congruence and work outcomes were related. As displayed in Table 7, correlations between work outcomes and congruence were small for all three measures. Only one relationship between congruence and outcomes was significant: the congruence level between job and IIP-A codes was positively related to job fit (r = .22, p = .01). Predictive validation was not found for these measures; however, the patterns of correlations across measures indicate that the GBA performed similarly to both self-report measures, as was seen with Hypothesis 2. Hypothesis 3 was not supported because the level of congruence between job and GBA codes did not produce significant relationships with turnover intention, job fit, job calling, or job satisfaction. The job congruency of the vocational interest measures was not strongly related to outcomes, but the different methods resulted in similar patterns of nonsignificant relationships with these variables.
Despite the lack of support for the study hypotheses, we were greatly encouraged by the results, which demonstrated similar behavior between the GBA measure and self-report measures of vocational interest. We discuss this similarity and the implications it may have for future research. We also further explore what other factors may help explain these results.
Table 7
Congruence Correlations with Work Outcomes (n = 140)
Variables 1 2 3 4 5 6
1. Turnover Intention -
2. Job Fit -.78** -
3. Job Calling .02 .18* -
4. Job Satisfaction -.67** .88** .21* -
5. C index Job-Game -.10 .04 .00 -.01 -
6. C index Job-IIP-A -.10 .22* .00 .15 -.02 -
7. C index Job-IIP-B -.10 .20 .10 .13 .00 .37**
* Correlation is significant at the 0.05 level (2-tailed). ** Correlation is significant at the 0.01 level (2-tailed). GBA = Game-based assessment, IIP-A = Interest Item Pool Version A, IIP-B = Interest Item Pool Version B (Armstrong et al., 2008)
Discussion
In this study, the validity of vocational interests as measured by a GBA was compared to vocational interests measured through self-report. This was evaluated with student and online samples of participants who completed both the GBA and established measures of vocational interests (Armstrong et al., 2008). The congruency of scores from the two types of measures was evaluated with an MTMM matrix and through the C index measure of congruency of vocational interest scores from Brown and Gore (1994). The majority of hypotheses tested were not supported; however, partial support for some of the hypotheses provided more information about how GBAs should be conceptualized and validated.
Hypothesis 1 tested whether the GBA measure of vocational interests produced the same results as two traditional self-report measures. Convergence between the two types of measures was not found. This absence of convergence indicates that the different methods of assessment are not measuring the same constructs and therefore cannot be viewed as parallel forms. The GBA thus cannot be used in the same way as the self-report measures of vocational interest and be expected to perform in the same way.
However, GBAs have characteristics which make it unrealistic to expect them to function as replacements for the self-report measures, and this does not indicate that GBAs are not useful. Even though the GBA does not behave the same as the self-report measures, it is important to investigate how the GBA does perform and how this different performance might still be useful.
One key aspect and benefit of GBAs is that measurement is made through behavioral actions rather than self-reported items. Self-report assessments have been studied in regard to their susceptibility to biased responding in accordance with social desirability (Moorman & Podsakoff, 1992). This is of concern for organizational research, as participants in high-stakes situations such as selection are motivated to respond as they believe the organization wants them to respond (Donaldson & Grant-Vallone, 2002; Ryan & Ployhart, 2014). As GBAs do not measure responses through self-report, it is likely that there will be differences in scoring when the ability to respond according to social desirability is stripped away (Landers, 2015). This difference in measurement may limit the ability to find convergence between self-report and GBA measures in the current study, as traditional measures of vocational interests are based on self-report (Armstrong et al., 2008; Holland, 1985; Rounds, Su, Lewis, &
Rivkin, 2010).
Holland's (1985) theory of vocational interests began with the use of self-report to determine such interests; however, the GBA used in the current study determined vocational interests through the measurement of personality variables. This is consistent with Holland's theory, in which he described vocational inventories as personality inventories (Holland, 1959, 1985). Holland referred to vocational interests as "the expression of personality in work" (Holland, 1985, p. 7). In line with this conceptualization of vocational interests, research has focused on how the Holland codes align with personality inventories such as the Big Five (Armstrong & Anthoney, 2009; Costa, McCrae, & Holland, 1985; Nauta, 2010; Wille, De Fruyt & Kozlowski, 2014). Therefore, the use of personality measures to extract vocational interest profiles may produce different results than the traditional method of self-report but is still in line with Holland's original conception of vocational interests as expressions of personality. In this way, the GBA may produce vocational interest profiles which are stripped of the social desirability of any type of work and in line with the personality of the individual, which may explain the partial support found for Hypothesis 2.
Hypothesis 2 investigated the ability of the GBA to produce vocational interest profiles similar to that of the participants’ current job environment. On average, the GBA vocational interest profiles were moderately congruent with those of the job.
Interestingly, the traditional self-report measures also produced moderate levels of congruence between individual and job RIASEC profiles. Neither measure of vocational interest produced high levels (above 13 on the C index) of congruence with the job profiles. Despite the lack of convergence in Hypothesis 1, the GBA and traditional measures behaved similarly in their level of congruence with the vocational interests of the environment. This finding highlights the issue of using the self-report measures as the gold standard by which to evaluate the validity of the GBA. In the absence of a true measure of vocational interest, the comparison was made using established measures. However, all samples have error which can affect results. This includes the samples used to develop the established measures of vocational interest. Both the traditional and GBA measures feature some degree of error when measuring vocational interests, but the similarity of the job congruence among these measures indicates that the GBA may still be a useful measure of this construct. Landers (2015) posited that many different forms of validation evidence need to converge for the validity of a GBA to be determined. In this study the evidence gathered from the MTMM matrix was not sufficient to show validity, but the similarity of the job congruence indicates that the measure is behaving similarly to current practice. Further evidence for the similarity between the GBA and
established measures was found when evaluating the relationship between congruence and work outcomes.
Hypothesis 3 evaluated whether the level of job congruence of the GBA derived
RIASEC profiles was related to job satisfaction, job calling, turnover intention and job fit. Support was not found for this hypothesis as the level of congruence between the
GBA and job codes was not significantly related to the work outcomes measured. Except for one significant relationship between the job congruence of the IIP-A and job fit, the level of congruence for both types of measures was not related to potential outcomes.
However, these findings reinforce that of hypothesis 2, as the new measure performed
similarly to the self-report measures. Two possible explanations arise for the lack of relationships found. The first is that the absence of significant relationships may indicate that level of congruence between individual and job interest codes is not related to work
outcomes. However, such relationships have been documented at length in the person-job
fit literature (Marcus & Wagner, 2015; Spokane, Meir, & Catalano, 2000; Tranberg,
Slane & Ekeberg, 1993; Wille, Tracey, Feys & De Fruyt, 2014). The second possible explanation is that in this sample, student and online participants may not have been as motivated and reliable as intended. However, research into the motivation and validity of
student (Greenberg, Gordon, Slade & Schmitt, 1987; Landers & Behrend, 2015; Ward,
1993) and online samples (Behrend, Sharek, Meade, & Wiebe, 2011; Buhrmester,
Kwang, & Gosling, 2011; Peer, Brandimarte, Samat, & Acquisti, 2017) has shown the usefulness of these convenience samples for research purposes.
Each of these explanations is possible, as are other explanations which have not been accounted for. It is most likely that components of each affected the findings of the current study. The samples may have had inherent differences which limited the ability to find conclusive results at the outset. In addition, the measures used have at least some amount of error, which then affects the results and conclusions which can be drawn.
Limitations
The current study was limited by the small sample size obtained. Larger samples may minimize the effect of error in sampling or measurement and allow for more confidence in the conclusions from the analyses. In addition, the use of a student population which contained mostly psychology majors may have affected the results. This is also true of the use of an online crowdsourced sample, which may have
unknown variations in the population which affected the results. The current sample also
featured a majority of Caucasian participants and as such may not be as generalizable to a
more diverse population.
Implications
The findings from the current study support the notion that there is value in the use of GBAs. The GBA performed similarly to current measures used to determine vocational interests, both in its ability to produce individual results congruent with the job environment and in the relationships found between congruence and outcomes. However, the lack of convergence in the MTMM matrix indicates that the GBA cannot be used as a parallel form of the self-report measures. GBAs need to be reconceptualized: rather than serving as replacements for traditional methods, which would require validity evidence in line with a parallel form, GBAs need to be used in a way which acknowledges their similarities to and differences from the traditional measures. In line with these differences in performance, validation efforts need to take into consideration that these measures do not perform the same as self-report measures.
Current validation strategies may not be best suited for evaluating these new assessments, and a greater shift may be needed to develop a method which is better for evaluating GBAs. Although this kind of change may seem daunting and unlikely, such a change may parallel events which have already occurred in the field of psychological research. In the 19th and early 20th centuries, there was a divisive argument among academics in regard to the use of Bayesian statistics (Zyphur & Oswald, 2012). The methods were often ignored or directly denounced as ineffective (Andrews & Baguley, 2013; Pearson, 1920). However, Bayesian statistics have since experienced a shift in attention and acceptance which runs counter to the history of the methods in psychological research (Andrews & Baguley, 2013; Zyphur & Oswald, 2012). This drastic change could presage a future change in the guidelines for establishing the validity of new measures such as GBAs. The well-established methods which are currently in use
may prove ineffective for the new assessments and a new validation strategy may better aid in the development and evaluation of such measures.
However, before such validation strategies develop, it remains important to validate the new assessments that are already in use in organizations. The current study builds upon the small body of research into the validity of GBAs and provides further evidence that more validation efforts are needed. The lack of convergence, together with the similarity in patterns of results, leaves questions about the validity and use of GBAs open for future inquiry.
References
Andrews, M., & Baguley, T. (2013). Prior approval: The growth of Bayesian methods in
psychology. The British Journal of Mathematical and Statistical Psychology,
66(1), 1-7.
Armstrong, P., & Anthoney, S. (2009). Personality facets and RIASEC interests: An
integrated model. Journal of Vocational Behavior, 75(3), 346-359.
Armstrong, P., Allison, W., & Rounds, J. (2008). Development and initial validation of
brief public domain RIASEC marker scales. Journal of Vocational Behavior,
73(2), 287-299.
Arthur, W. (2017). The impact of emerging technologies on selection models and
research: Mobile devices and gamification as exemplars. In J. L. Farr, & N. T.
Tippins (Eds.), Handbook of employee selection (2nd ed., pp. 967-986). New
York: Taylor & Francis/Psychology Press.
Barrett, P. (2018). The EFPA test-review model: When good intentions meet a
methodological thought disorder. Behavioral Sciences, 8(1), 5.
Barrick, M., & Mount, M. (1991). The Big Five personality dimensions and job
performance: A meta-analysis. Personnel Psychology, 44(1), 1-26.
Behrend, T., Sharek, S., Meade, D., & Wiebe, J. (2011). The viability of crowdsourcing
for survey research. Behavior Research Methods, 43(3), 800-813.
Brkich, M., Jeffs, D., & Carless, S. A. (2002). A global self-report measure of person-job
fit. European Journal of Psychological Assessment, 18(1), 43-51.
doi:10.1027//1015-5759.18.1.43
Brown & Gore (1994). An evaluation of interest congruence indices: Distribution
characteristics and measurement properties. Journal of Vocational Behavior,
45(3), 310-327.
Buhrmester, M., Kwang, T., & Gosling, S. (2011). Amazon's Mechanical Turk: A new
source of inexpensive, yet high-quality, data? Perspectives on Psychological
Science, 6(1), 3-5.
Campbell, D., & Fiske, D. (1959). Convergent and discriminant validation by the
multitrait-multimethod matrix. Psychological Bulletin, 56(2), 81-105.
Chamorro-Premuzic, T. (2016). New talent signals: Shiny new objects or a brave new
world? Industrial and Organizational Psychology, 9(3), 20.
Chapman, D. S., & Webster, J. (2003). The use of technologies in the recruiting,
screening, and selection processes for job candidates. International Journal of
Selection & Assessment, 11(2/3), 113-120. doi:10.1111/1468-2389.00234
Cohen, A. (1998). An examination of the relationship between work commitment and
work outcomes among hospital nurses. Scandinavian Journal of Management,
14(1), 1-17.
Collmus, A., Armstrong, M., & Landers, R. (2016). Game-thinking within social media to
recruit and select job candidates. In Social media in employee selection and
recruitment: Theory, practice, and current challenges (pp. 103-124).
doi:10.1007/978-3-319-29989-1_6
Costa, P. T., Jr., McCrae, R. R., & Holland, J. L. (1984). Personality and vocational
interests in an adult sample. Journal of Applied Psychology, 69, 390-400.
Cut-e (2017). White paper: Ahead of the game. Best practice in games, gamification and
game-based assessment. Retrieved from https://www.cut-e.com/online-assessment/gamification-in-recruitment/httpwwwcut-ecomgamification-white-paper/
Dale, S. (2014). Gamification. Business Information Review, 31(2), 82-90.
DiCerbo, K. E. (2014). Game-based assessment of persistence. Journal of Educational
Technology & Society, 17(1), 17-28.
DiCerbo, K., Shute, V., & Kim, Y. (2016). The future of assessment in technology-rich
environments: Psychometric considerations. Learning, Design, and Technology,
1-21.
Dik, B. J., Eldridge, B. M., Steger, M. F., & Duffy, R. D. (2012). Development and
validation of the Calling and Vocation Questionnaire (CVQ) and Brief Calling
Scale (BCS). Journal of Career Assessment, 20(3), 242-263.
Donaldson, S., & Grant-Vallone, I. (2002). Understanding self-report bias in
organizational behavior research. Journal of Business and Psychology, 17(2),
245-260.
Greenberg, J., Gordon, M., Slade, L., & Schmitt, N. (1987). The college sophomore as
guinea pig: Setting the record straight/Student guinea pigs: Porcine predictors
and particularistic phenomena. Academy of Management Review, 12(1), 157.
Hamari, J., Koivisto, J., & Sarsa, H. (2014). Does gamification work? A literature review
of empirical studies on gamification. Proceedings of the 47th Hawaii
International Conference on System Sciences (HICSS), 3025-3034.
Hamdani, M., Valcea, S., & Buckley, M. (2016). The MTMM matrix approach:
Implications for HRM research. Personnel Review, 45(6), 1156-1175.
Holland, J. L. (1959). A theory of vocational choice. Journal of Counseling Psychology,
6(1), 35-45.
Holland, J. L. (1985). Making vocational choices: A theory of vocational personalities and
work environments (2nd ed.). Englewood Cliffs, NJ: Prentice-Hall.
Hunter & Hunter (1984). Validity and utility of alternative predictors of job performance.
Psychological Bulletin, 96(1), 72-98.
Hutchinson, T. (2014). Assessing the congruence of worker and workplace using the
proximities of their RIASEC types. Australian Journal of Career Development,
23(3), 133-139.
Jaffal, Y., & Wloka, D. (2015). Employing game analytics techniques in the
psychometric measurement of game-based assessments with dynamic content.
Journal of E-Learning and Knowledge Society, 11(3), 101-115.
Kim, Y., Almond, R., & Shute, V. (2015). Applying evidence-centered design for the
development of game-based assessments in physics playground. International
Journal of Testing, 1-22.
Kim, Y. & Shute, V. (2015). The interplay of game elements with psychometric qualities,
learning, and enjoyment in game-based assessment. Computers & Education, 87,
340-356.
Landers, R. (2014). Developing a theory of gamified learning: Linking serious games
and gamification of learning. Simulation & Gaming, 45(6), 768.
Landers, R. N. (2015). An introduction to game-based assessment: Frameworks for the
measurement of knowledge, skills, abilities and other human characteristics using
behaviors observed within videogames. International Journal of Gaming and
Computer-Mediated Simulations, 7(4), iv-viii.
Landers, R., & Behrend, T. (2015). An inconvenient truth: Arbitrary distinctions
between organizational, Mechanical Turk, and other convenience samples.
Industrial and Organizational Psychology, 8(2), 142-164.
Lowman, G. H. (2016). Moving beyond identification: Using gamification to attract and
retain talent. Industrial and Organizational Psychology, 9(3), 677-682.
doi:10.1017/iop.2016.70
Marcus, B., & Wagner, U. (2015). What do you want to be? Criterion-related validity
of attained vocational aspirations versus inventoried person-vocation fit.
Journal of Business and Psychology, 30(1), 51-62.
McCarthy, J., Bauer, T., Truxillo, D., Anderson, N., Costa, A., & Ahmed, S. (2017).
Applicant perspectives during selection: A review addressing “so what?,” “what’s
new?,” and “where to next?”. Journal of Management, 43(6), 1693-1725.
Mislevy, R. J., Oranje, A. I., Bauer, M., DiCerbo, K., Von Davier, A., John, M., &
Corrigan, S. (2015). Psychometrics and game-based assessment. In Technology
and Testing: Improving Educational and Psychological Measurement (pp. 23-48).
Taylor and Francis.
Montefiori, L. (2016). Game-based assessment: Face validity, fairness perception, and
impact on employer’s brand image. Assessment & Development Matters, 8(2),
19-22.
Moorman, R., & Podsakoff, P. (1992). A meta-analytic review and empirical test of the
potential confounding effects of social desirability response sets in organizational
behaviour research. Journal of Occupational and Organizational Psychology,
65(2), 131-149.
Morelli, N., Mahan, R., & Illingworth, A. (2014). Establishing the measurement
equivalence of online selection assessments delivered on mobile versus
nonmobile devices. International Journal of Selection and Assessment, 22(2),
124-138.
Nauta, M. M. (2010). The development, evolution, and status of Holland’s theory of
vocational personalities: Reflections and future directions for counseling
psychology. Journal of Counseling Psychology, 57(1), 11-22.
doi:10.1037/a0018213
Nieborg, D. B. (2004). America's Army: More than a game. In W. C. Kriz & T. Eberle
(Eds.), Transforming knowledge into action through gaming and simulation.
Munich: SAGSAGA.
Oostrom, J. K., & De Soete, B. (2016). Ethnic differences in perceptions of cognitive
ability tests: The explanatory role of self-serving attributions. International
Journal of Selection and Assessment, 24(1), 14-23.
https://doi.org/10.1111/ijsa.12125
Pearson, K. (1920). The fundamental problem of practical statistics. Biometrika, 13(1),
1-16.
Peer, E., Brandimarte, L., Samat, S., & Acquisti, A. (2017). Beyond the Turk:
Alternative platforms for crowdsourcing behavioral research. Journal of
Experimental Social Psychology, 70, 153-163.
Rounds, J., Su, R., Lewis, P., & Rivkin, D. (2010). O*NET Interest Profiler Short Form
psychometric characteristics: Summary and supporting evidence. U.S. Department
of Labor, O*NET Resource Center.
Rupp, A. A., Gushta, M., Mislevy, R. J., & Shaffer, D. W. (2010). Evidence-centered
design of epistemic games: Measurement principles for complex learning
environments. The Journal of Technology, Learning, and Assessment, 8(4), 3-41.
Ryan, A. M., & Ployhart, R. E. (2014). A century of selection. Annual Review of
Psychology, 65, 693-717. doi:10.1146/annurev-psych-010213-115134
Ryan, R. M., Rigby, C. S., & Przybylski, A. K. (2006). The motivational pull of video
games: A self-determination theory approach. Motivation and Emotion.
Schmidt, F. L., & Hunter, J. E. (1998). The validity and utility of selection methods in
personnel psychology: Practical and theoretical implications of 85 years of
research findings. Psychological Bulletin, 124(2), 262-274.
doi:10.1037/0033-2909.124.2.262
Shute, Wang, Greiff, Zhao, & Moore. (2016). Measuring problem solving skills via
stealth assessment in an engaging video game. Computers in Human Behavior,
63, 106-117.
Spokane, Meir, & Catalano (2000). Person-environment congruence and Holland's
theory: A review and reconsideration. Journal of Vocational Behavior, 57(2),
137-187.
Susca, M. (2012). Why we still fight: Adolescents, America's Army, and the government
gaming nexus. Global Media Journal, 12, 1-16.
Tracey, T. (2007). Moderators of the interest congruence-occupational outcome
relation. International Journal for Educational and Vocational Guidance, 7(1),
37-45.
Tranberg, M., Slane, S., & Ekeberg, S. E. (1993). The relation between interest
congruence and satisfaction: A meta-analysis. Journal of Vocational Behavior, 42, 253-264.
Ventura, & Shute. (2013). The validity of a game-based assessment of persistence.
Computers in Human Behavior, 29(6), 2568-2572.
Wanous, J. P. (1997). Overall job satisfaction: How good are single-item measures?
Journal of Applied Psychology, 82(2), 247-252.
Ward, E. (1993). Generalizability of psychological research from undergraduates to
employed adults. The Journal of Social Psychology, 133(4), 513-519.
Wille, B., & De Fruyt, F. (2014). Vocations as a source of identity: Reciprocal
relations between Big Five personality traits and RIASEC characteristics.
Journal of Applied Psychology.
Wille, B., Tracey, T., Feys, M., & De Fruyt, F. (2014). A longitudinal and multimethod
examination of interest-occupation congruence within and across time. Journal
of Vocational Behavior, 84(1), 59-73.