Model-based Reliability and Validity of Measurement Models

Using Structural Equation Modeling

Leila Karimi

Ph.D.

2015

To

“The hands of my mum who showed me how the impossible can become possible with hard work, and the big heart of my dad who always gave me the courage to follow my dreams”

ABSTRACT

Structural Equation Modeling (SEM) has a long and interesting history and it continues to evolve, providing exciting research opportunities. This study considers the roots of SEM and model-based reliability, and the developments in these areas in the context of measurement models. Looking to the future, the research provides important new applications of model-based reliability in bifactor models, using Covariance-based SEM (CB-SEM), and in reflective-formative measurement models, using Partial Least Squares SEM (PLS-SEM). In addition, Bentler’s covariate-dependent reliability is applied for the first time, both for reliability assessment and for the detection of common method bias.

The thesis considers three important research studies involving work ability, work organisational assessment and a survey of social desirability, wellness, drinking habits and emotional intelligence. These studies are used to demonstrate the above new developments in model-based reliability. The contribution of the research and directions for future research are discussed separately for each study and in general.


ACKNOWLEDGEMENTS

I would like to express my sincere appreciation to my supervisor and mentor, Associate Professor Denny Meyer, for her continual support, encouragement, patience and kindness throughout the life of this PhD. I would also like to extend my gratitude to Professor Peter Bentler, University of California, Los Angeles (UCLA), who introduced me to a new era of research.

Thank you also to my associate supervisors Professor Philip Taylor from Monash University and Associate Professor Christine Critchley from Swinburne University for their support through different stages of this project. Thanks to the Business, Work & Ageing Centre for Research (BWA) at Swinburne University for sharing the database on WAS, and to Dr Jodi Oakman from La Trobe University for generously sharing the paramedics data, Dr James Gaskin from Brigham Young University, Professor Joerg Henseler from the University of Twente and Professor Christian M. Ringle from the Hamburg University of Technology (TUHH) for introducing me to the world of Partial Least Squares (PLS).

Special thanks to my lovely family and friends, particularly my brothers ‘Puya’ and ‘Hamid’ for their unconditional love and support, which kept me strong during the difficult times. I am so lucky to have them in my life. Last but not least, thanks to my caring, supportive partner ‘Arron’ for putting up with ‘crazy me’ in the last few stressful months of wrapping up this project.


DECLARATION

This is to declare that the examinable outcome:

• contains no material which has been accepted for the award to the candidate of any other degree or diploma, except where due reference is made in the text of the examinable outcome;
• to the best of the candidate’s knowledge, contains no material previously published or written by another person, except where due reference is made in the text of the examinable outcome.

Signature

Date


TABLE OF CONTENTS

1 INTRODUCTION TO THE THESIS ...... 12

1.1 Introduction ...... 12

1.2 Study Structure ...... 14

1.3 Summary ...... 18

2 THE HISTORY OF THE EVOLUTION OF SEM IN PSYCHOLOGY ...... 20

2.1 First Trend: Exploratory Factor Analysis...... 22

2.2 Second Trend: Confirmatory Factor Analysis (CFA) ...... 24

2.3 Third Trend: Factor Analysis of SEM (FASEM) ...... 25

2.4 Current Developments in SEM ...... 31

2.5 Conclusion ...... 33

3 THE EVOLUTION OF MODEL-BASED RELIABILITY ESTIMATES ...... 35

3.1 Introduction ...... 35

3.2 Classical Test Theory and Coefficient Alpha ...... 36

3.3 Major Problems with Using a Coefficient Alpha Reliability Analysis ...... 42

3.4 Unidimensional Model-based Reliability ...... 44

3.5 Recent Developments ...... 46

3.6 Summary ...... 56

4 THE VALIDITY OF BIFACTOR VERSUS HIGHER-ORDER MEASUREMENT MODELS ..... 59

4.1 Bifactor Model of WOAQ...... 61

4.2 Summary ...... 64

5 THE VALIDITY OF FORMATIVE MEASUREMENT MODELS VERSUS REFLECTIVE MODELS...... 65

5.1 Differences between Formative and Reflective Models ...... 66

5.2 Applications of Formative Models ...... 72

5.3 Developing a Framework for Distinguishing Reflective-Formative Models ...... 74


5.4 Measurement Model Misspecification in Organisational Psychology Literature ...... 79

5.5 Summary and Conclusion ...... 84

6 STUDY 1: MODEL-BASED RELIABILITY, VALIDITY AND CROSS VALIDITY OF BIFACTOR MODEL FOR WOAQ ...... 86

6.1 Rationale and Objectives ...... 88

6.2 Method ...... 98

6.3 Summary ...... 106

7 STUDY 1: RESULTS ...... 108

7.1 Results: Study of Nurses-Validation of Bifactor Model of WOAQ ...... 108

7.2 Results: Study of Paramedics-Cross Validation of Bifactor Model WOAQ ...... 120

8 STUDY 1: DISCUSSION ...... 125

8.1 Discussion: Study of Nurses-Validation of Bifactor Model of WOAQ ...... 125

8.2 Discussion: Study of Paramedics-Cross Validation of Bifactor Model of WOAQ ...... 129

8.3 Strengths and Limitations ...... 130

8.4 Summary and Conclusion ...... 132

9 STUDY 2: APPLICATIONS OF COVARIATE-DEPENDENT RELIABILITY...... 134

9.1 Rationale and Objectives ...... 134

9.2 Method ...... 143

10 STUDY 2: RESULTS ...... 147

10.1 Results of Application for Reliability Assessments – The study of WOAQ ...... 147

10.2 Model Fit Evaluation ...... 150

10.3 Application in Demonstrating CMB using Social Desirability ...... 156

11 STUDY 2: DISCUSSION ...... 169

11.1 Discussion: Application in Reliability Assessment of WOAQ ...... 170


11.2 Discussion: Application in Demonstrating CMB ...... 172

11.3 Strengths ...... 174

11.4 Limitations and Directions for Future Research ...... 175

12 STUDY 3: MODEL-BASED RELIABILITY AND VALIDITY OF REFLECTIVE-FORMATIVE MODEL OF WAS USING PLS-SEM ...... 177

12.1 Rationale and Objectives ...... 177

12.2 Method ...... 201

13 STUDY 3: RESULTS ...... 213

13.1 Results of Model Fit Evaluation ...... 213

13.2 Comparison of the Misspecified Models with the Correctly Specified WAS Model ...... 221

14 STUDY 3: DISCUSSION ...... 225

14.1 Implications for Work Ability Assessments...... 229

14.2 Limitations and Directions for Future Research ...... 230

15 SUMMARY ...... 233

15.1 Study 1: Model-Based Reliability, Validity and Cross Validity of the Bifactor Model for WOAQ ...... 233

15.2 Study 2: Applications of Covariate-dependent Reliability ...... 235

15.3 Study 3: Model-based Reliability and Validity of Reflective-formative Model of WAS ...... 236

15.4 Thesis contributions to SEM ...... 238

15.5 Summary ...... 242

16 APPENDICES ...... 243

16.1 PUBLISHED ARTICLES ...... 244

16.2 Validity and model-based reliability of the Work Organisation Assessment Questionnaire (WOAQ) among nurses ...... 244

16.3 Structural Equation Modeling in Psychology: The History, Development and Current Challenges ...... 257


16.4 Cross-validation of the Work Organization Assessment Questionnaire across gender: A study of Australian Health Organization ...... 268

16.5 EVALUATING A HIGHER-ORDER MISSPECIFIED REFLECTIVE MODEL OF WAS USING CB-SEM...... 283

16.6 DEFINITIONS OF IMPORTANT TERMS ...... 296

16.7 THE WOAQ AND ITS SUBFACTORS ITEMS...... 300

16.8 The WAS questionnaire ...... 302

16.9 Appendix F. List of items used in construction of WAS ...... 323

16.10 Ethics clearance ...... 326

16.11 A List of Articles Included in the Review ...... 333

17 REFERENCES ...... 355


LIST OF TABLES

Table 7.1 Descriptive Statistics of the Demographic Variables ...... 109

Table 7.2 Subscales and WOAQ Items ...... 110

Table 7.3 Item Characteristics of WOAQ ...... 111
Table 7.4 Completely Standardized Maximum Likelihood (ML) Solutions of the Higher-order Model and the Bifactor Model ...... 117

Table 7.5 Summary of Model Fit Statistics of the CFA Models of WOAQ ...... 118

Table 7.6 The Reliability Coefficients of WOAQ among Nursing Sample (n=312) . 119

Table 7.7 Characteristics of Paramedic Participants ...... 121
Table 7.8 Summary of Model Fit Statistics for the Baseline Bifactor Model of WOAQ across Gender ...... 121

Table 7.9 Invariance Testing Across Gender for the Bifactor Model of WOAQ ...... 123
Table 10.1 The Descriptive Characteristics of the Main Study Constructs and Parameters (n=1255) ...... 148

Table 10.2 Nursing and Paramedic Demographic Characteristics ...... 149

Table 10.3 Mean Age Differences between Nursing and Paramedic ...... 150
Table 10.4 Summary of Model Fit Statistics for the Baseline Bifactor Model of WOAQ across Organisations ...... 151
Table 10.5 Summary of Model Fit Statistics of the Bifactor Models of WOAQ (n=1257) ...... 151
Table 10.6 WOAQ Reliability Statistics for Nursing (as reported in Chapter 7) and Paramedic Organisations ...... 152

Table 10.7 WOAQ Reliability Statistics across Organisations (n=1257) ...... 153
Table 10.8 Invariance Testing Across Organisations for the Bifactor Model of WOAQ ...... 154
Table 10.9 Summary of the Demographic Characteristics of the Participants (n=341) ...... 156
Table 10.10 Comparison of the Covariate-dependent and Covariate-free Reliability Coefficients of the Scales after Including CMB ...... 158


Table 10.11 Summary of Fit Indices of Comparison Models ...... 166
Table 10.12 Standardised Factor Loadings for Different Models Compared to Baseline ...... 167

Table 12.1 Items of the Work Ability Index ...... 184
Table 13.1 Quality Criteria of the Reflective-formative WAS First-order Constructs using PLS-SEM ...... 215
Table 13.2 Intercorrelation Analysis and the Square Roots of AVE of First-order Constructs of Reflective-formative PLS-SEM Model † ...... 217
Table 13.3 The Standardised Mean Coefficients of the Second-order Formative Constructs of Reflective-formative PLS-SEM Model (n=5000 bootstrap) ...... 218
Table 13.4 Results for Third-order Formative Constructs of Reflective-formative WAS (n=5000 bootstrap samples) ...... 219
Table 13.5 Comparing the Standardized Path Coefficients of Misspecified and Correctly Specified WAS Models ...... 222
Table 13.6 Comparing the Model-based Reliability Coefficients of a Misspecified Reflective-reflective WAS (CB-SEM) with the Correctly Specified Reflective-formative Model of WAS ...... 224
Table 16.1 The Standardised Path Parameter Estimates for the Misspecified Reflective WAS Model Using the CB-SEM Procedure ...... 285
Table 16.2 The Parameter Estimates for First-order Reflective Model Using the CB-SEM Procedure ...... 286

Table 16.3 Intercorrelation Analysis and the Square Roots of AVE for Subfactors ...... 288
Table 16.4 Structural Model Results for Second-order Reflective Constructs of Reflective PLS-SEM Model (n=5000 bootstrap) ...... 290
Table 16.5 Structural Model Results for Higher-order Reflective Constructs of Reflective PLS-SEM Model (n=5000 bootstrap samples) ...... 291
Table 16.6 Structural Model Results for Second-order Reflective Constructs of Full Formative PLS-SEM Model (n=5000 bootstrap) ...... 294
Table 16.7 Structural Model Results for Higher-order Reflective Constructs (n=5000 bootstrap samples) ...... 295


LIST OF FIGURES

Figure 1.1. The study structures in this thesis ...... 17
Figure 2.1. A pseudo path diagram and timeline for some of the developments in SEM modeling and SEM model structures ...... 21

Figure 2.2. One of Wright's first path diagrams for genetic modeling ...... 26
Figure 3.1. A pseudo path diagram and timeline for some of the developments in the historical review of the conceptualisation and estimation of model-based reliability ...... 38

Figure 3.2. Demonstrating Omega reliability coefficients for WOAQ...... 49

Figure 3.3. A unidimensional construct with four indicators ...... 53
Figure 3.4. A covariate-dependent construct with four indicators and two covariates ...... 54

Figure 4.1. Higher-order vs. bifactor model of WOAQ ...... 63

Figure 5.1. First-order reflective model ...... 68

Figure 5.2. First-order formative model ...... 69

Figure 5.3. Higher-order reflective-reflective measurement model ...... 70

Figure 5.4. Higher-order formative-formative measurement model ...... 71
Figure 5.5. The developed framework for assessing formative vs. reflective measurement models ...... 78
Figure 6.1. A demonstration of the bifactor model of WOAQ with phantom variables for calculating omega coefficients on the right-hand side ...... 105

Figure 7.1. The proposed bifactor model of WOAQ vs. higher-order ...... 114
Figure 9.1. Covariate-dependent reliability assessment with the bifactor model of WOAQ across the nursing and paramedic organisations ...... 136
Figure 10.1. The effects of latent factor bias (social desirability) on the reliability of the constructs ...... 157

Figure 10.2. The proposed model for evaluating CMB/CMV...... 159


Figure 10.3. Model 1: Baseline model when all the study constructs are correlated without controlling for CMV and CMB ...... 161
Figure 10.4. Model 2: Constrained equal loadings from CMV to the study indicators ...... 163

Figure 10.5. Model 3. Free loadings from CMV to the study indicators ...... 165

Figure 12.1. Multidimensional work ability model...... 188

Figure 12.2. WAI scores: Australia and Finland...... 190

Figure 12.3. The correctly specified reflective-formative model of WAS...... 195

Figure 12.4. The misspecified reflective-reflective model of WAS ...... 196

Figure 12.5. The misspecified formative-formative model of WAS ...... 197
Figure 12.6. Step one: constructing the first-order sub-constructs of both personal and organisational capacities of the reflective-formative model of WAS using PLS path modeling ...... 208
Figure 12.7. Step two: building the second-order formative constructs (organisational and personal capacities) for the reflective-formative model of WAS ...... 209
Figure 12.8. Step three: the scores of the first-order latent factors are used as the manifests of the second-order factors (i.e. organisational and personal capacities), forming the higher-order construct (WAS) ...... 210
Figure 13.1. The final model of reflective-formative WAS development using PLS path modeling ...... 220
Figure 16.1. The standardised path parameter estimates of the misspecified reflective WAS model ...... 284

Figure 16.2. The reflective model of WAS using PLS-SEM...... 289

Figure 16.3. The reflective WAS development using PLS path modeling ...... 291
Figure 16.4. The model building process for the full formative model of WAS using PLS-SEM ...... 293
Figure 16.5. The full formative model of WAS using PLS-SEM ...... 295


1 INTRODUCTION TO THE THESIS

So here it is, my final version of the thesis. A thesis which was the best journey of my life. One which I fell in love with and developed over the years. A thesis that made me learn a lot and taught me to never stop learning. A thesis that was by my side for seven years, through all the ups and downs.

1.1 Introduction

The journey started with an initial interest in exploring the SEM-based validation of measures for formative constructs. At the start I was lost. I felt like I was walking in the dark with no hope of finding the light. But as time passed and the more I immersed myself in the subject, the more it all started to make sense. My first realisation was that constructs are not reflective ‘by default’, as the majority of scholars assume. A real-life example exists right in front of us: socioeconomic status (SES). The three main components of SES are income, education and occupational status. Although these components may be correlated, they measure different things, which means SES cannot be a reflective construct, as assumed by some scholars. I grew up in a middle-class family in a highly populated, developing country, where your social status is often defined by your parents’ income or occupational status. This experience showed me that a parent’s income and occupational status do not always depend on their level of education. This is a perfect example of a formative construct.
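This intuition can be sketched in a few lines of simulated data. The sketch below is purely illustrative and not part of the thesis analyses: three hypothetical SES indicators are generated to be only weakly correlated, so a reflective reliability index (coefficient alpha) comes out low, while a weighted formative composite remains perfectly meaningful. The variable names and weights are assumptions made for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# Hypothetical SES indicators: related, but not interchangeable
education = rng.normal(0, 1, n)
income = 0.3 * education + rng.normal(0, 1, n)
occupation = 0.2 * education + 0.2 * income + rng.normal(0, 1, n)
items = np.column_stack([education, income, occupation])

# Coefficient alpha treats the indicators as reflective (interchangeable)
# measures of one underlying factor; here it comes out low
k = items.shape[1]
alpha = k / (k - 1) * (1 - items.var(axis=0, ddof=1).sum()
                       / items.sum(axis=1).var(ddof=1))

# A formative construct is instead *defined* by its indicators:
# dropping one of them changes the meaning of the composite
weights = np.array([0.4, 0.4, 0.2])   # illustrative weights, not estimated
ses = items @ weights
print(f"alpha = {alpha:.2f}")
```

With this data-generating process alpha lands around 0.5, well below the conventional 0.7 threshold for a reflective scale, yet nothing is wrong with SES as a formative index; low inter-item correlation is simply not a defect of formative indicators.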

I then reviewed the literature to see how big the problem of model misspecification actually is and, as I expected, I found that it was big enough to do something about. It is very difficult to evaluate formative models using the conventional Covariance-Based Structural Equation Modeling (CB-SEM) procedure in standard statistical software, mainly because of model identification problems. It is therefore no wonder that the majority of scholars consider their constructs to be reflective ‘by default’. In my search for a solution I discovered, and became familiar with, Partial Least Squares SEM.

Further into my research on formative models, I met the distinguished Professor Peter M. Bentler at UCLA. After attending his statistics classes and frequent discussions, SEM application began making more sense than ever before. During one of our meetings, Peter mentioned Covariate-Dependent Reliability, prompting my second realisation.

The more I read and thought about it, the more I realised the potential applications of this approach. Why is it that some measurement scales do not have an acceptable level of reliability in all situations or populations? Is an IQ test derived within a European context applicable to a remote Aboriginal community? Can the reliability of the IQ test that is associated with ethnicity be separated from the reliability that is independent of ethnicity? Clearly, the reliability of any measure can be influenced by covariates or confounding variables. However, this issue is rarely addressed, perhaps because researchers do not know how to account for it. With permission from Professor Bentler, I started evaluating the application of Covariate-dependent Reliability. I was warned that this was a risky exercise, given the novelty of the topic and the lack of previous literature. Nevertheless, the importance of this topic pushed me out of my comfort zone and stretched the boundaries of my thesis.

Through my investigation of Covariate-dependent Reliability, I came to understand its application in demonstrating Common Method Bias (CMB), caused by factors such as social desirability. Not all students who fill in surveys about their drinking behaviours or emotional intelligence tell us the truth. I came to realise that it was possible to test for CMB in the following way: if CMB is treated as a covariate, and the reliability of the measurement scales is evaluated with and without it, then any change in the reliability is evidence that CMB exists. Moreover, the effect of the CMB can be extracted.
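The with/without comparison described above can be mimicked with simulated data. The following is a toy illustration of the logic only, not Bentler’s covariate-dependent reliability estimator: items are generated to load on a trait plus an observed social-desirability score, and a conventional reliability coefficient is computed before and after partialling the covariate out of the items. All names and parameter values are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000

trait = rng.normal(0, 1, n)   # the construct of interest
sd = rng.normal(0, 1, n)      # observed social-desirability covariate

# Four items contaminated by the method factor (social desirability)
items = 0.7 * trait[:, None] + 0.5 * sd[:, None] + rng.normal(0, 0.6, (n, 4))

def cronbach_alpha(x):
    """Coefficient alpha for an n-by-k item matrix."""
    k = x.shape[1]
    return k / (k - 1) * (1 - x.var(axis=0, ddof=1).sum()
                          / x.sum(axis=1).var(ddof=1))

alpha_raw = cronbach_alpha(items)   # reliability with CMB included

# Partial the covariate out of each item (simple OLS per item), recompute
sd_c = sd - sd.mean()
beta = (items * sd_c[:, None]).mean(axis=0) / sd_c.var()
alpha_free = cronbach_alpha(items - np.outer(sd_c, beta))

print(f"with CMB: {alpha_raw:.3f}  without CMB: {alpha_free:.3f}")
```

Here reliability drops once the covariate is removed, because part of the apparent internal consistency was shared method variance rather than the trait; that change is exactly the signal used to argue that CMB is present.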

Finally, after learning about the importance of bifactor models for multidimensional modeling, I was encouraged to also explore this neglected area. For the complex multidimensional Work Organisation Assessment Questionnaire (WOAQ), bifactor analysis provided a significant improvement in the measurement model.

1.2 Study Structure

The main theme of this journey is the testing of model-based reliability and validity of measurement models using SEM. I have pursued three new developments in this area in the following three studies:

1) The model-based reliability and validity evaluation of bifactor measurement models, with special focus on the Work Organisation Assessment Questionnaire (WOAQ), comparing the results with a second-order model.


2) The application of the newly developed theory of Bentler’s Covariate-dependent Reliability, not only for reliability assessment but also for demonstrating Common Method Bias (CMB); and

3) Evaluating model-based reliability and validity in reflective-formative models, using Partial Least Squares SEM. Using this procedure, the existing misspecified model of the Work Ability Scale (WAS) will be compared with a correctly specified model to highlight the impacts of model specification errors.

A summary of these three studies is presented in Figure 1.1. Covariance-Based SEM is used for the first two studies in the context of reflective measurement models, and PLS-SEM is used for the third study in the context of reflective-formative measurement models. The theory for these methods is covered in Chapters 2-5. In Study one (Chapters 6 to 8) the validation of a bifactor model for the Work Organisation Assessment Questionnaire (WOAQ) will be demonstrated in a health setting. Nothing like this has been carried out before using the WOAQ. In addition, the model-based reliability coefficients (omega total, omega hierarchical and omega subscale) will be computed and compared with the conventional coefficient alpha. The first part of the study was conducted with a sample of community nurses. The second part of this study concerns the cross validation of the bifactor model with a paramedic sample (across gender), to find out whether the bifactor model of the WOAQ has similar properties in another very different population within the health sector.
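As background to that comparison, the difference between coefficient alpha and a model-based omega coefficient can be shown with a small worked example. This is a generic sketch under assumed factor loadings, not the WOAQ analysis itself: omega total is formed from the loadings and residual variances of a one-factor (congeneric) model, while alpha is computed from the model-implied covariance matrix.

```python
import numpy as np

# Congeneric one-factor model: items measure the factor with unequal
# strength, which is exactly where alpha and omega diverge.
# Loadings are assumed for illustration; in practice they come from a
# fitted CFA.
loadings = np.array([0.9, 0.8, 0.6, 0.4])
resid_var = 1 - loadings**2          # standardised items

# Model-based reliability: omega total
omega = loadings.sum()**2 / (loadings.sum()**2 + resid_var.sum())

# Coefficient alpha from the model-implied covariance matrix
sigma = np.outer(loadings, loadings) + np.diag(resid_var)
k = len(loadings)
alpha = k / (k - 1) * (1 - np.trace(sigma) / sigma.sum())

print(f"omega total = {omega:.3f}, alpha = {alpha:.3f}")
# -> omega total = 0.782, alpha = 0.761
```

Because the loadings are unequal (tau-equivalence fails), alpha slightly understates the reliability that the model-based omega recovers; with equal loadings the two coefficients coincide.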


In Study two (Chapters 9 – 11), two applications of covariate-dependent reliability will be demonstrated empirically. This is the first time that applications of covariate-dependent reliability have been undertaken. One of these applications demonstrates reliability evaluation in the context of Common Method Bias (CMB). This application demonstrates that, if CMB exists, then the reliability of the measurements will change when CMB is treated as a covariate.

In Study three (Chapters 12 – 14), an empirical example of fitting a reflective-formative measurement model using Partial Least Squares SEM is presented step by step. To the best of the researcher’s knowledge, no clear guideline or procedure for fitting a reflective-formative model has previously appeared in the Partial Least Squares SEM literature. This model is fitted for a work ability measure (WAS), allowing the testing of both validity and reliability. The reflective-formative model is compared with a misspecified reflective-reflective model, demonstrating the errors that occur as a result of misspecification.


[Figure 1.1 flowchart: model-based reliability and validity, divided into CB-SEM (reflective models, Chapters 2-4) and PLS-SEM (reflective-formative models, Chapter 5); Study 1: bifactor and higher-order model validation of WOAQ (Chapters 6-8; Part I: study of nurses, Part II: study of paramedics); Study 2: applications of covariate-dependent reliability (Chapters 9-11; Part I: reliability assessment with nurses and paramedics, Part II: common method bias with students); Study 3: reflective-formative model validation of WAS (Chapters 12-14; Part I: fitting hierarchical models with formative constructs, Part II: validation and model-based reliability of WAS).]

Figure 1.1. The study structures in this thesis. Note: CB-SEM: Covariance-based Structural Equation Modeling; PLS-SEM: Partial Least Squares Structural Equation Modeling; WOAQ: Work Organisation Assessment Questionnaire; WAS: Work Ability Scale.


Chapter 2 provides an overview of SEM and Chapter 3 covers model-based reliability, along with new developments, current gaps and applications in these areas. Chapter 4 compares bifactor and second-order models and Chapter 5 compares reflective and formative models, presenting an overview of the history of misspecification for formative models. The misspecification rates for formative and reflective models are assessed for two top journals in Organisational Psychology over a nine-year period (2006-2014). A solution to this model specification problem is provided by way of a decision flowchart for distinguishing formative from reflective models. Chapters 6 to 8 cover the validity, cross validity and model-based reliability of the Work Organisation Assessment Questionnaire (WOAQ), using both bifactor and second-order factor models. Chapters 9 to 11 cover two new applications of covariate-dependent reliability, and also demonstrate how CMB can be detected and measured using this approach. Chapters 12 to 14 concentrate on the validity and reliability assessments of reflective-formative models as opposed to misspecified reflective-reflective models using PLS-SEM. Finally, all three studies are summarised in Chapter 15.

1.3 Summary

In spite of the importance of multidimensional model-based reliability measurement, there are few empirical studies of model-based reliability coefficients. In addition, bifactor models and measures with formative constructs have received less attention in the literature than higher-order and reflective models. Either scholars do not recognise the importance of these topics, or the appropriate statistical software is not readily available for performing the analysis. This research is designed to fill some theoretical and methodological gaps in this area. Moreover, for the first time, this study demonstrates the practical implications of Bentler’s (2014) newly introduced theory of covariate-dependent reliability for reliability and common method bias assessment.


2 THE HISTORY OF THE EVOLUTION OF SEM IN PSYCHOLOGY

Structural equation modeling (SEM) is one of the major research tools that is rapidly growing in popularity. SEM is a statistical technique for testing and examining measurement models and causal relations, using a combination of statistical data and qualitative causal assumptions (Pearl, 2000). SEM techniques are a major component of applied multivariate statistical analysis, which are widely used by researchers in different disciplines.

In a broader sense, SEM represents a series of cause-effect relationships between variables in a composite testable model (Shipley, 2000). It extends conventional multivariate statistical analysis by accounting for measurement error and by examining goodness-of-fit more thoroughly. The SEM technique has grown out of methods such as path analysis and factor analysis.

SEM has attracted attention primarily because it lends itself to effectively studying problems or models that cannot be easily investigated using other approaches. In this chapter, a history of the original roots of SEM in psychology will be traced, followed by a discussion of the current developments in SEM. The structure of the chapter is based on the path diagram presented in Figure 2.1. The idea of showing the history of SEM using a graph originated from a personal conversation with Professor Peter Bentler in 2012. The researcher was inspired by the idea and extensively developed the graph to include all the major developments in SEM.


[Figure 2.1 timeline diagram, 1900-2000: developments including classical test theory, factor analysis, principal components, path analysis, regression, linear simultaneous equations, confirmatory factor analysis, reliability, MIMIC, FASEM, Bentler/Weeks, modern test theory, PLS, formative models, multilevel models, bootstrapping, mixture models, meta analysis, nonlinear FA and GLLAMM.]

Figure 2.1. A pseudo path diagram and timeline for some of the developments in SEM modeling and SEM model structures. Acknowledgement: Special thanks to Professor Peter Bentler (personal communication, 2012), for his inspiration and input into the development of the diagram.

2.1 First Trend: Exploratory Factor Analysis1

Exploratory factor analysis (EFA) has made an important contribution, especially in the social sciences, by addressing the needs and interests of various disciplines. The primary roots of SEM in psychology can be traced to the work of Pearson (1901) on orthogonal least squares. Pearson’s theory was not fully appreciated at the time, but it later became a foundation for principal component analysis and correlation matrix analysis (Hotelling, 1933). Spearman (1904), an English psychologist, also contributed substantially. Spearman is commonly regarded as the pioneer of factor analysis, based on his work on the relationships between multiple correlated measures of cognitive performance. Factor analysis is a statistical procedure used to identify clusters or groups of related items (called factors).

Using factor analytic data, Spearman postulated his original two-factor model for ability and intelligence, highlighting the theory-testing nature of the method. Spearman found that children’s scores on different subjects were correlated. Spearman then extended his theory, proposing three types of factors: a general factor (g), relating to all activities; a specific factor (s), relating to a specific mental activity; and a group factor, common to some of the variables but not all. Other scholars gradually adopted this approach to factor analysis (e.g. Mosier, 1939; Lawley, 1940; Guttman, 1952; Anderson and Rubin, 1956).

1 Shown in blue in Figure 2.1.


2.1.1 Multiple factor analysis

Spearman’s two-factor theory was widely criticised. Thomson (1916, 1935) strongly criticised the sampling-theory approach to abilities in the early stages of the theory’s development, claiming that the analysis considered only a sample of possible abilities and was therefore incomplete.

However, the biggest critic of the Spearman model was Wilson (1928a, 1928b, 1929), a famous mathematician who made a significant contribution to the development of factor analysis. In different papers, using different examples, Wilson highlighted indeterminacy issues: the lack of uniqueness in the variable (g) of Spearman’s theory, and the identifiability problem in the variance-covariance parameters of factor analysis. A number of scholars (such as Irwin, 1935; Thomson, 1935) added to Wilson’s work by further developing an understanding of indeterminacy (Steiger & Schönemann, 1978).

In the early stages of Spearman’s development of factor analysis, some scholars (e.g. Wilson, 1929; Thomson, 1935) suggested that factor indeterminacy might seriously affect the ultimate purpose of the model, making this a very important theoretical issue between 1928 and 1939. However, attention moved away from factor indeterminacy after Wolfle (1940) wrote his objective historical review of factor analysis, until 1955, when factor indeterminacy again attracted attention.

The Spearman two-factor method was also criticised on the grounds that it was not appropriate for situations involving more than one group factor. In 1931, Thurstone identified this as one of the serious limitations of Spearman’s method, mainly because psychological problems usually involve two or more group factors (Thurstone, 1935). This limitation led to an interest in multiple-factor analysis to supplement Spearman’s model, whereby group factors were identified after extracting a general factor (e.g., Holzinger, 1941). In multiple-factor analysis there are no restrictions on the number of general factors or the number of group factors (Thurstone, 1935).

Thurstone (1947) further developed his multiple-factor analysis using the centroid method of factoring a correlation matrix, a pragmatic compromise to the computationally burdensome principal axis method. This method of factor analysis attracted further attention in the 1960s (Bentler, 1968; McDonald, 1970). However, from the early 1980s explicit optimisation functions, such as least squares, maximum likelihood (ML) and minimum chi-square, became more popular.

Thurstone (1947), with later contributions from Cattell (1978), developed the foundations for the concept of factor rotation. Other scholars extended Thurstone’s work by proposing practical solutions for rotation. The most popular rotation methods included the Varimax orthogonal rotation, which forced factors to be uncorrelated (Kaiser, 1958), and various oblique rotation methods (Jennrich & Sampson, 1966; Jennrich & Clarkson, 1980), which allowed the factors to be correlated. As a result of these developments, exploratory multiple-factor analysis became popular during this period.

2.2 Second Trend: Confirmatory Factor Analysis (CFA)

There is obviously a connection between exploratory and confirmatory factor analysis methods. However, other statistical theory apart from exploratory factor analysis has also made a significant contribution to the development of confirmatory factor analysis (Bentler, 1986). These theories include analyses for higher order factors.

Although Thurstone (1947) seems to be acknowledged for proposing the mathematical foundation of second-order factor analysis, it was Jöreskog in 1970 who wrote an equation including first- and second-order factors as a single model, and it was Bentler in 1976 who offered a complete and general structure for higher-order factors.

The problem of rotating factor solutions was avoided when confirmatory factor analysis (CFA) was introduced. In CFA, the factors and parameter loadings are specified before analysis starts, transforming the problem into one of identifying a model’s parameters from observed moments (Matsueda, 2012).

CFA was introduced originally by Tucker (1955). It was further developed following the introduction of an ML approach to factor analysis (Lawley, 1940; Anderson & Rubin, 1956). Finally, it was Jöreskog (1969) who developed the first computer software programs for CFA estimation using ML.

2.3 Third Trend: Factor Analysis of SEM (FASEM)

Real progress in the evolution of SEM was produced by the integration of the earlier SEM developments in psychometrics, sociology, econometrics, and biometry (Bentler, 1986). The factor analysis of structural equation modeling (FASEM) and the resulting linear structural relations (LISREL) software were the main outcomes of this integration. At the time, simultaneous equation and path analysis methods were the main new contributors to FASEM and LISREL.

2.3.1 Path Analysis

Sewall Wright was one of the first scholars to use path analysis in medical science, applying it in his studies in the 1920s. Path analysis was one of the primary procedures used to determine a causal structure. Wright used observed variables to develop a correlation matrix, and drew path diagrams indicating direct and indirect effects.


Path analyses led Wright to develop the Multiple Indicators Multiple Causes (MIMIC) model among others (Matsueda, 2012). Figure 2.2 presents an early path analysis by Wright (1920) indicating path modeling of heredity and environment in shaping the piebald pattern of guinea-pigs.

Figure 2.2. One of Wright's first path diagrams for genetic modeling. Source: Wright, Sewall (1920). The relative importance of heredity and environment in determining the piebald pattern of guinea-pigs. Proceedings of the National Academy of Sciences, 6, 320-332.

2.3.2 Simultaneous equation and errors-in-variables models in economics

The development of SEM in econometrics can perhaps be attributed to Frisch and Waugh (1933), Haavelmo (1943), and Koopmans (1945). Frisch (1934), the founder of the Econometric Society and the journal Econometrica, coined the term “econometrics” and developed many of the identification principles in SEM. The advances made by Haavelmo (1943), another economist, together with Mann and Wald, led to work on SEM at the Cowles Commission (1952). This resulted in Haavelmo solving the major problems of identification, estimation, and testing in SEM.

Koopmans et al. (1945) made some empirical advances on Haavelmo’s model. However, according to Matsueda (2012), it was Klein (1950) who made the most significant contribution to the empirical application of simultaneous equation models using Keynesian economic models, culminating with the 15-equation Klein-Goldberger model estimated by limited-information methods (Klein & Goldberger, 1955). Other scholars made further contributions to the model (e.g. Anderson & Rubin, 1949; Zellner, 1962; Zellner & Theil, 1962).

Frisch (1934) first created an errors-in-variables model and then a graphical presentation of regression coefficients (the method of bunch maps) which was proposed as a tool to discover underlying structures, often obtaining approximate bounds for relationships. According to Hendry and Morgan (1989), Frisch treated observed variables as fallible indicators of latent variables, examining the interrelationships among all latent and observed variables to distinguish true relations from confluent relations.

Frisch’s errors-in-variables model was ignored until the early 1970s when Zellner became interested and demonstrated the use of generalised least squares (GLS) and Bayesian approaches in estimating a model with a fallible endogenous predictor. Later, Goldberger (1971) showed that GLS was equivalent to ML only when errors were normally distributed with known variances. He also showed that when error variances were unknown, an iterated GLS converged to ML.


According to Bentler (1986), Goldberger was one of the first researchers to realise the need to integrate some SEM-related ideas into other disciplines (Goldberger, 1971; Goldberger & Duncan, 1973). This integration was one of the turning points in the evolution of SEM in the 1970s.

2.3.3 FASEM

FASEM is a generic acronym for factor analysis (FA) structural equation modeling (SEM), a major development of the 1970s and 1980s. FASEM was first used by Bentler (1986) to refer to conceptual approaches to continuous variables in SEM.

The Conference on Structural Equation Models in 1970 contributed greatly to the integration of SEM disciplines. The conference was an interdisciplinary forum of economists, sociologists, psychologists, statisticians, and political scientists, and the academic papers were published in the volume Structural Equation Models in the Social Sciences, edited by Goldberger and Duncan in 1973.

According to Bentler (1986), the major achievements in the 1970s and 1980s can be categorised into three sections: structural concepts, statistical theory and practical development. The two key papers published in this period were written by Hauser and Goldberger (1971) and Jöreskog (1973). Hauser and Goldberger’s (1971) examination of unobservable variables is an exemplar of cross-disciplinary integration, drawing on path analysis and moment estimators from Wright, as well as work by sociologists. It also incorporates factor-analytic models from psychometrics, efficient estimation, and Neyman-Pearson hypothesis testing from statistics and econometrics. Hauser and Goldberger used limited-information estimation to gain a better understanding of structural equations estimated by ML. Jöreskog (1973) presented an ML framework for estimating the parameters of these SEM models, developed a computer program for empirical applications, and showed how the general model could be applied to a myriad of important substantive models.

2.3.4 Nonlinear SEM

The turning point in the application of SEM in psychology dates back to the 1970s and 1980s, primarily through the work of Bentler and, more particularly, his development of the EQS SEM software (Matsueda, 2012). Using such analytical software for evaluating SEMs allows researchers to make better use of their data and to study the empirical applications of some of the methods proposed by certain scholars in the literature (Bentler, 1986). During the 1980s some researchers paid attention to nonlinear SEMs, which helped to extend the overall scope of SEM. Some important developments in nonlinear latent variable SEM, particularly those for categorical data, emerged in the 1980s, mainly in the works of Bock and Aitkin (1981), Mislevy (1984) and Muthén (1984).

2.3.5 Formative models

The first appearance of formative measures probably dates back to the Berkson error model for radiation epidemiology studies described below. In the 1950s, the U.S. carried out nuclear testing in the state of Nevada. Due to the sudden increase in thyroid disease in surrounding areas, a major epidemiological study was carried out at the University of Utah to evaluate the effects of radiation on health. The researchers found that the main exposure to radiation came from milk and vegetable consumption. Based on that finding, the people in the study who had similar milk intake were assigned to the same dose group. Because the effect of radiation on the thyroid cannot be observed directly, Berkson designed a method in which the true exposure to radiation (true score) was a function of the amount of food consumption (observed score) with some degree of uncertainty (measurement error).

In Classical Test Theory, a true score, with its measurement error, forms an observed variable, while in the Berkson error model it is the opposite in that the true score is equal to the observed score plus measurement error (Carroll, Ruppert, Stefanski, & Crainiceanu, 2006). The Berkson measurement error concept has become the cornerstone of what is today known as formative models. Although the concept of formative measures was introduced by Berkson in 1950, it did not attract attention until the late 1960s. Influenced by principal component and composite-like ideas, attention to using formative measures in SEM has since increased. The biggest surge in the use of formative models in certain situations occurred in the early 2000s. Many scholars (e.g. Blalock, 1971; Diamantopoulos and Winklhofer, 2001; Jarvis, MacKenzie & Podsakoff, 2003; Petter, Straub & Rai, 2007) have alerted researchers to the relevance of formative models in specific situations; however, this fact has unfortunately been underemphasised in the literature.

2.3.6 Multiple-indicators multiple-causes model (MIMIC)

One of the techniques implemented by Wright in the 1920s using path analysis is similar to what is now known as a MIMIC model (Matsueda, 2012). The main advancement in MIMIC was achieved through the works of Jöreskog, Hauser, and Goldberger in the 1970s. They introduced ML as the estimation method for over-identified MIMIC models.

The release of the LISREL statistical software by Jöreskog in the 1970s produced the greatest advancement in estimating MIMIC models. LISREL is still popular among scholars because of its ability to incorporate factor analysis, path analysis, and FASEMs into a general covariance structure model (Jöreskog and Sörbom, 2001; Matsueda, 2012). Using the MIMIC model, identification and estimation of formative models has become feasible in some circumstances.

2.4 Current Developments in SEM

Some of the most current developments in SEM include multilevel models, generalised linear latent and mixed modeling (GLLAMM), partial least squares (PLS) and SEM-based meta-analysis.

2.4.1 Multilevel models

Multilevel models can be estimated using multiple indicator measurement models in SEM. Using this approach, separate models for within-group and between-group covariances are considered. Further, by using a multiple group analysis, the parameters can be estimated simultaneously for both levels (Muthén, 1994). Although this estimation approach can be implemented in almost any SEM software, it is generally available only for a few specific models.

2.4.2 GLLAMM

As mentioned above, the use of multilevel models is limited to specific models and cannot be applied to all models. In response to this limitation, a more advanced and general estimation method, GLLAMM, was introduced by Rabe-Hesketh, Skrondal & Pickles (2004), and further developed by Skrondal and Rabe-Hesketh (2004).

GLLAMM has three main components: a generalised linear model, a structural equation model for latent variables, and distributional assumptions for these latent variables (Matsueda, 2012). The generalised linear model is capable of analysing all types of data: continuous, ordinal, dichotomous and discrete. The GLLAMM program is now part of the Stata program. Many of the GLLAMM models can also be analysed by Mplus, a powerful software package developed by Muthén and Muthén (2004).

2.4.3 PLS

The roots of PLS, as well as graphical models, can be traced to the work of Herman Wold in 1977 (Geladi, 1988). Wold's PLS modeling was enhanced by the idea of principal component analysis as well as Jöreskog's LISREL software program.

Originally, PLS was developed to solve the problem of multicollinearity in multiple regression. According to Wold (1979), PLS regression was an appropriate estimation method for complex models with undeveloped theoretical backgrounds. The original application of PLS was more for predictive models (Barclay, Higgins, & Thompson, 1995). Later, as an alternative to Jöreskog’s covariance-based SEM approach, Wold introduced SEM based on PLS. Because PLS-based SEM has fewer underlying restrictions, such as normally distributed data and a large sample size, it is known as soft modeling. Despite its less restrictive nature, PLS-based SEM did not become as popular as covariance-based SEM. The main reason for this was a lack of software for model estimation.

However, since 1984, and especially from the early 2000s, more user-friendly software has been introduced for the estimation of PLS-based SEM, adding to the popularity of the method. Software such as PLS-GUI (Li, 2005), Visual PLS (Fu, 2006a), PLS-Graph (Chin, 2004), SmartPLS (Ringle et al., 2005), SPAD-PLS (Test & Go, 2006) and XLSTAT (Addinsoft, 2008) are among the recent developments (Morales, 2011).


There has been much debate among scholars about the application of PLS and its lack of a goodness-of-fit test. These issues are discussed in Chapter 3.

2.4.4 SEM-based meta-analysis

The concept of SEM-based meta-analysis was introduced by Cheung (2008) to integrate SEM results from different studies, bringing meta-analytic studies within the scope of SEM. Although Cheung's proposed approach added a new and important methodological development in SEM, it is not yet fully incorporated into the current popular SEM software, limiting its further application in practice.

2.5 Conclusion

SEM is rapidly growing in popularity as a major research tool in psychology. The early foundation of SEM can be traced back to factor analysis, principal component analysis, regression and path analysis. It started in various disciplines such as psychometrics, sociology, econometrics and biometry. The Interdisciplinary Conference on Structural Equation Models in 1970 greatly influenced the integration of SEM work in these disciplines. The work of Bentler and, especially, his development of EQS in the 1980s, was another turning point for the application of SEM in psychology. Since then, SEM has rapidly developed. In particular, MIMIC models were developed for the fitting of formative models in some circumstances. Other recent developments such as PLS, GLLAMM and multilevel models have extended the application of SEM techniques further.

Although this area is progressing rapidly, there is a risk that the technique will be misused due to its complexity or a lack of knowledge among psychological researchers. Some of the most controversial debates relate to model-based reliability, model misspecification (formative vs. reflective) and the use of Partial Least Squares SEM (vs. covariance-based SEM). These three issues are described in more detail in the following three chapters to highlight their importance to researchers.


3

THE EVOLUTION OF MODEL-BASED RELIABILITY ESTIMATES

3.1 Introduction

The literature on reliability has developed following the introduction of classical test theory in the 1900s. Coefficient alpha (α) has been widely utilised as a coefficient of internal consistency for tests (measurement scales) where overall scores are generated from the summation of test items (Bollen, 1989; Miller, 1995). Despite its popularity, the application of coefficient alpha as a reliability estimate has been contentious and has been subjected to numerous criticisms by scholars such as Green and Hershberger (2000), Green and Yang (2009a) and Sijtsma (2009a). Some scholars argue that coefficient alpha has been commonly misinterpreted as a measure of test homogeneity or unidimensionality (Green & Yang, 2009b; Miller, 1995). Other scholars such as Miller (1995) as well as Rogers, Schmitt, and Mullins (2002) claim that coefficient alpha may not be suitable for multidimensional composites. From a differing viewpoint, others claim that the conventional coefficient alpha leads to the overestimation or underestimation of true reliability (Raykov, 1997, 1998; Miller, 1995). According to Raykov (1998) and Bentler (2009), coefficient alpha is correctly estimated only when there is no correlation between error terms and the assumption of essential tau equivalency is met. This term is explained below.

Due to the limitations of coefficient alpha, over the past decades attention has shifted to a model-based internal consistency coefficient for measuring test score reliability. Some scholars have embraced Structural Equation Modeling (SEM) approaches for the estimation of model-based reliability as an alternative to coefficient alpha to improve the reporting of psychometric internal consistency (Sijtsma, 2009b). Unlike classical test theory, which considers true-score variance, the SEM approach to model-based reliability focuses on the composition of the true score. This means that the true-score variance is partitioned into variance components, allowing the researcher to consider the importance of the different variance components that contribute to test-score reliability (Sijtsma, 2009b).

For the purpose of this chapter the evolution of model-based reliability estimates will be explored. The focus will initially be mainly on model-based reliability assuming a unidimensional reliability coefficient (Jöreskog, 1971; McDonald, 1985; Bentler, 2007); however, reliability coefficients for multi-dimensional and bifactor models will also be considered, in the form of the Omega, Omega total, Omega hierarchical and Omega subscale reliability coefficients (McDonald, 1978, 1999; Zinbarg, Revelle, Yovel, & Li, 2005; Reise, Bonifay, & Haviland, 2012). The newer theory of covariate-dependent and covariate-free reliability of Bentler (2014) will also be discussed. The above-mentioned model-based reliability coefficients are estimated using CB-SEM. For completeness, the chapter will also briefly discuss composite reliability (CR) using PLS-SEM. The application of composite reliability in scale models, involving formative constructs, will be elaborated upon in a later chapter.

3.2 Classical Test Theory and Coefficient Alpha

Constructs or latent variables are commonly used to classify or group similar behaviours or attributes. However, constructs in psychology are usually measured indirectly, through tests, surveys, or tasks. Designing such measurement instruments (scales) for measuring constructs is challenging. The test developer must deal with many measurement problems. The study of measurement problems, including the extent to which they influence the measurements and methods for dealing with these problems, has evolved into a specialised discipline known as Test Theory. Test Theory “provides a general framework for viewing the process of instrument development” (Crocker & Algina, 1986, p. 7).

Historically the roots of Test Theory were developed mainly by psychologists from Europe and the United States. In Europe, the early development of Test Theory dates back to the mid 1800s with the work of Wilhelm Wundt, Ernst Weber, Gustav Fechner and their colleagues in Germany. In Great Britain, scientists including Sir Francis Galton, Charles Darwin and Karl Pearson were among the main scholars who significantly contributed to the development of Test Theory.

[Figure 3.1 appears here as a pseudo path diagram and timeline spanning the early roots (ca. 1900) to recent developments (ca. 2000). It traces early internal consistency measures (Kuder and Richardson, 1937; Hoyt, 1941; Guttman, 1945) leading to coefficient alpha (Cronbach, 1951); covariance-based SEM (CB-SEM) measures, namely unidimensional reliability (Heise & Bohrnstedt, 1970; Jöreskog, 1971), latent variable model reliability rho (Bentler, 2007), multidimensional Omega reliability (McDonald, 1978), Omega hierarchical, Omega subscales and Omega total (McDonald, 1999; Zinbarg, Revelle, Yovel, & Li, 2005; Reise, Bonifay, & Haviland, 2012), and covariate-dependent and covariate-free reliability (Bentler, 2014); and, under Partial Least Squares SEM (PLS-SEM), composite reliability (ρc) (Werts, Linn & Jöreskog, 1974).]

Figure 3.1. A pseudo path diagram and timeline for some of the developments in the historical review of the conceptualisation and estimation of model-based reliability.

Between 1905 and 1908, the French psychologists Alfred Binet and Theophile Simon established an example of psychological assessment that has stood the test of time. They successfully created an intelligence (IQ) test to measure the level of intelligence in children. Empirical test analysis and the advanced concept of norms can be attributed to the work of Binet and are still used by modern test developers.

Early in the 20th century, American scientists made some progress in developing Test Theory further. In 1904, E. L. Thorndike published the first textbook on Test Theory, and James McKeen Cattell acknowledged the significance of norms and errors in observations. The founding of the Psychometric Society in the 1930s promoted further advancements in the establishment of Test Theory. Through the journals Psychometrika and Educational and Psychological Measurement, more opportunities were available for scholars to exchange ideas and theories in this field.

In 1869, Galton's study of students from Cambridge University showed that mental abilities could be distributed as a normal curve, allowing the application of statistical techniques to psychological test data. Karl Pearson’s work on the computational formula of the correlation coefficient followed. The procedure known as factor analysis was originally developed from an advanced set of correlational procedures designed by Charles Spearman, later becoming one of the most popular statistical procedures for assessing the validity of measurement instruments.

The importance of Test Theory in research and evaluation is well recognised. In order to achieve accurate, comparable outcomes, it is crucial for researchers to adhere to the principles of Test Theory for developing or testing measurement instruments and to evaluate the accuracy and sensitivity of these tools before utilising them for research purposes.


The reliability coefficient is defined as “… the degree to which individuals’ deviation scores, or z scores, remain relatively consistent over repeated administration of the same test or alternate test forms” (Crocker & Algina, 1986, p. 105). In Test Theory, different types of reliability are introduced: ‘Test-retest’, ‘Parallel Forms’ and ‘Internal Consistency’.

‘Parallel Forms’ reliability is based on creating two scales which provide composite scores for measuring the same construct (Nunnally & Bernstein, 1994). This reliability measure is calculated as the correlation between the composite scores of the two scales. However, although this is a good procedure for identifying sources of error variance (Nunnally & Bernstein, 1994), it is hardly used in the psychology literature.

‘Test-retest’ reliability is more commonly used. Instead of creating two different scales and comparing the results as in parallel forms reliability, the consistency of the responses over different time points are considered. Random measurement errors are one of the main sources of inconsistency in the responses of individuals over time.

However, given the often limited time interval between test and retest, the accuracy of the procedure has been criticised in the literature (Nunnally & Bernstein, 1994).

‘Internal Consistency’ reliability is less complex than parallel forms and test-retest, in that a single scale is measured at only one time point. Two popular procedures for estimating internal consistency reliability are ‘split-half’ and ‘Cronbach’s alpha’ (hereafter called coefficient alpha).

In the ‘split-half’ procedure, the scale is split into two parts and the correlation between the two parts is computed. The stronger the positive correlation between the two parts of the scale, the better the internal consistency of the scale. There are a few limitations with the split-half procedure. Firstly, there is no clear procedure or justification for splitting the scale into halves. Secondly, for time-limited testing, such as ability or IQ measurement, with items arranged from easy to hard, the reliability estimates may be upwardly biased (Cronbach, 1960). Due to these limitations, coefficient alpha was introduced by Cronbach (1951) as the average reliability of all possible split-half estimates for estimating internal consistency.
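The split-half procedure described above can be illustrated with a short computational sketch in Python (using numpy). The item responses are simulated for illustration only (they are not from the thesis studies), and the odd-even split is just one of the many possible splits; the final step applies the standard Spearman-Brown correction so that the half-length correlation reflects the full-length scale.

```python
import numpy as np

rng = np.random.default_rng(0)
# Simulated responses: 200 respondents, 6 items driven by one common factor
# (illustrative data only).
f = rng.normal(size=(200, 1))
items = 0.7 * f + 0.5 * rng.normal(size=(200, 6))

# Split the scale into odd- and even-numbered halves (one of many possible splits).
half1 = items[:, ::2].sum(axis=1)
half2 = items[:, 1::2].sum(axis=1)

# Correlation between the half scores, then Spearman-Brown step-up to full length.
r_half = np.corrcoef(half1, half2)[0, 1]
split_half_reliability = 2 * r_half / (1 + r_half)
print(round(split_half_reliability, 3))
```

Because the result depends on which split is chosen, averaging over all possible splits leads naturally to coefficient alpha, as discussed next.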

Coefficient alpha was first cited in Cronbach’s famous article in Psychometrika (1951). Other scholars (e.g. Kuder and Richardson, 1937; Miller, 1995) are credited with the further development of this measure. In particular, Kuder and Richardson generated variance estimates for this measure using the mean of a series of reliability coefficients calculated from a single study using a random split of items. Later, Hoyt (1941) proposed a conservative estimation procedure for assessing the reliability of a scale based on an analysis of variance decomposition of the data. This estimation procedure delivers similar results to the KR20, described below, but underestimates reliability.

As explained above, the coefficient alpha formula was proposed as the mean of all possible split-half coefficients for a particular scale:

$$\alpha = \frac{n}{n-1}\left(1 - \frac{\sum_{i=1}^{n} s_i^2}{s_x^2}\right) \quad \text{Equation 3.1}$$

where the number of items is $n$, the estimated variance of item $i$ is $s_i^2$ and the estimated variance of the scale ($X$) is $s_x^2$. A value closer to one suggests a scale with better internal consistency.
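As a computational illustration of Equation 3.1, the sketch below (Python with numpy; the item responses are simulated, not from the thesis data) computes coefficient alpha from the item variances and the variance of the summed scale:

```python
import numpy as np

def coefficient_alpha(items: np.ndarray) -> float:
    """Coefficient alpha (Equation 3.1): items is a respondents x items array."""
    n = items.shape[1]                         # number of items
    item_vars = items.var(axis=0, ddof=1)      # s_i^2 for each item
    scale_var = items.sum(axis=1).var(ddof=1)  # s_x^2 for the summed scale
    return (n / (n - 1)) * (1 - item_vars.sum() / scale_var)

# Simulated 1-factor data for illustration (300 respondents, 5 items).
rng = np.random.default_rng(1)
f = rng.normal(size=(300, 1))
items = 0.8 * f + 0.6 * rng.normal(size=(300, 5))
print(round(coefficient_alpha(items), 3))
```

Note that the function only summarises observed variances; it makes none of the model-based distinctions discussed later in this chapter.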


3.3 Major Problems with Using a Coefficient Alpha Reliability Analysis

Based on Cronbach's approach (1951, pp. 331-332), there are some essential assumptions that should be considered before applying coefficient alpha to evaluate internal consistency. Unfortunately, these assumptions are ignored by many researchers, raising concerns regarding the validity of coefficient alpha results for internal consistency evaluation. Three assumptions should be considered before using coefficient alpha: “essential tau equivalency”, “uncorrelated errors” and “uni-dimensionality”.

Essential Tau Equivalency assumes that each item makes an equal contribution of variance to the true scale variance (Green and Yang, 2009a). However, equal factor loadings are seldom found in a scale and, moreover, the majority of scales are multidimensional with unequal variances explained by each dimension. Thus, the Essential Tau Equivalency assumption is often violated (Sijtsma, 2009). When the assumption is violated, a negatively biased reliability coefficient is possible (Green & Yang, 2009; Sijtsma, 2009). As a result, the alpha coefficient often underestimates true reliability (Green & Yang, 2009; Sijtsma, 2009).

Uncorrelated Errors assumes no correlation between the item errors ($e_i$) when the $i$th item ($x_i$) is expressed as a linear function of the factor ($f$):

$$x_i = \lambda_i f + e_i \quad \text{Equation 3.2}$$

This assumption is also commonly invalid (for details see Green & Yang, 2009; Sijtsma, 2009; Bentler, 2009). Violating the uncorrelated errors assumption leads to several problems. For example, Bentler (2009) explains that violating the assumption results in overestimated alpha coefficients because of unwanted systematic variance.


Other scholars (e.g. Green & Yang, 2009a; Sijtsma, 2009) argue that violating the assumption can lead to either overestimation or underestimation of coefficient alpha.

Due to the common violation of coefficient alpha’s assumptions, the use of coefficient alpha is criticised by several researchers (Bentler, 2009; Green & Hershberger, 2000; Green & Yang, 2009; Sijtsma, 2009). This has led to several improvements being suggested. These suggestions include reporting the greatest lower bound (glb) for coefficient alpha as a measure of internal consistency (Sijtsma, 2009), and Bentler’s (1972) dimension-free lower bound for reliability (blb)1.

These recommendations are not always appropriate for computing reliability coefficients, for several reasons. The blb and glb are specified at a population level (lower bound reliability) and assume no sampling variability in the covariances. As stated by Bentler (2009), “in practice, sample covariance and correlation matrices must be used in the computation instead of their population counterparts, which are essentially never available” (p. 141).

In addition, they also assume uncorrelated error terms and/or no known dimension for the factor model. Thus, in the presence of correlated errors or strong theoretical and empirical knowledge on the dimensionality of the model, it does not seem appropriate to use the blb or glb.

Therefore, coefficient alpha and the above blb- or glb-related measures are not appropriate when:

1 Bentler’s dimension-free lower bound reliability (ρblb) was proposed by Bentler (1972) and makes no assumption about the number of factors. Under the same assumptions glb and blb are equal.


a) the assumptions of using coefficient alpha are violated,

b) the dimensionality of the measurement model is already established (as unidimensional or multidimensional), and

c) the model fits the data well.

In many situations a model-based reliability measure is preferable to coefficient alpha and the above blb- or glb-related measures. This leads to the unidimensional and multidimensional model-based reliability estimates which will be discussed in the next section. These are based on sample covariance matrices and have weaker assumptions than coefficient alpha.

3.4 Unidimensional Model-based Reliability

In response to coefficient alpha’s limitations, and within the setting of confirmatory factor analysis, the analysis of congeneric measures was introduced by Jöreskog (1971) to calculate the unidimensional model-based reliability coefficient ρ11. This reliability coefficient is perhaps one of the earliest proposals for assessing the reliability of 1-factor models which does not require equal item reliabilities (Gerbing & Anderson, 1988). Using Maximum Likelihood (ML) estimation, ρ11 can be estimated in SEM using the following formula when item residuals (ei) are assumed independent and k items with loadings λi are included in a scale:

$$\rho_{11} = \frac{\left(\sum_{i=1}^{k} \lambda_i\right)^2}{\left(\sum_{i=1}^{k} \lambda_i\right)^2 + \sum_{i=1}^{k} \mathrm{Var}(e_i)} \quad \text{Equation 3.3}$$


The assumptions of essential tau equivalency and equal variance among items are less important for this coefficient. The reliability coefficient will not be affected by large differences between item variances. However, in the presence of equal factor loadings and item variances, ρ11 is equal to coefficient alpha.
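Equation 3.3 is straightforward to evaluate once a 1-factor CFA has been fitted. The sketch below (Python with numpy; the loading and error-variance values are hypothetical ML estimates, for illustration only) computes ρ11 from such estimates:

```python
import numpy as np

def rho_11(loadings, error_variances):
    """Jöreskog's unidimensional reliability (Equation 3.3),
    assuming independent item residuals."""
    lam = np.asarray(loadings, dtype=float)
    theta = np.asarray(error_variances, dtype=float)
    true_var = lam.sum() ** 2          # (sum of loadings) squared
    return true_var / (true_var + theta.sum())

# Hypothetical ML estimates from a fitted 1-factor CFA (illustrative values).
loadings = [0.8, 0.7, 0.6, 0.75]
error_variances = [0.36, 0.51, 0.64, 0.44]
print(round(rho_11(loadings, error_variances), 3))
```

Note that unequal loadings pose no problem here, which is precisely the advantage over coefficient alpha noted above.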

Similarly to the reliability coefficient ρ11 proposed by Jöreskog (1971), the ρt coefficient of Zimmerman (1972), defined below, is useful for estimating the model-based reliability of 1-factor models when the assumption of equal factor loadings across all items is not met or when error terms are correlated (McDonald, 1978; Raykov, 2001). However, as with ρ11, when we have a unidimensional construct with equal factor loadings and error variances for all items, with no correlation between the residuals, the numerical value of coefficient ρt will be equivalent to that of coefficient alpha (Raykov & Shrout, 2002).

$$\rho_t = \frac{\left(\sum_{i=1}^{k}\lambda_i\right)^{2}}{\left(\sum_{i=1}^{k}\lambda_i\right)^{2} + \sum_{i=1}^{k}\mathrm{Var}(e_i) + 2\sum_{1\le i<j\le k}\mathrm{Cov}(e_i, e_j)} \qquad \text{Equation 3.4}$$
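The denominator of Equation 3.4 is the total of the residual covariance matrix, since that total equals the sum of the residual variances plus twice the sum of the residual covariances. A hypothetical sketch in Python/NumPy (illustrative values only):

```python
import numpy as np

def rho_t(loadings, error_cov):
    """Zimmerman's (1972) rho_t (Equation 3.4). error_cov is the full
    residual covariance matrix, so error_cov.sum() equals
    sum Var(e_i) + 2 * sum_{i<j} Cov(e_i, e_j)."""
    true_var = np.sum(loadings) ** 2
    return true_var / (true_var + error_cov.sum())

# Hypothetical estimates: the residuals of items 1 and 2 correlate
lam = np.array([0.8, 0.7, 0.6])
theta = np.array([[0.36, 0.10, 0.00],
                  [0.10, 0.51, 0.00],
                  [0.00, 0.00, 0.64]])
print(round(rho_t(lam, theta), 3))  # 0.721
```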

Akin to coefficient alpha, the above-mentioned methods assume a one-factor model for a set of items and are therefore not suitable for instruments and scales that are multidimensional. However, it should be noted that the unidimensional rho (ρ) reliability coefficients mentioned previously can also be interpreted as quantifying the proportion of variance due to the most reliable single dimension in a multidimensional space (Bentler, 2007).

In order to address the need for multidimensional reliability measures, the Omega (ω) procedure was developed using multidimensional measurement models fitted using SEM. McDonald's (1978) coefficient omega (ω) is defined below in the context of a 2-factor model with loadings λij for the ith item on the jth factor (ηj) and errors ei for the ith item, i = 1, 2, …, k. It represents the ratio of the true variance to the observed variance for this measurement model.

$$\omega = \frac{\mathrm{Var}\!\left(\sum_{i=1}^{k}\sum_{j=1}^{2}\lambda_{ij}\eta_j\right)}{\mathrm{Var}\!\left(\sum_{i=1}^{k}\sum_{j=1}^{2}\lambda_{ij}\eta_j + \sum_{i=1}^{k} e_i\right)} \qquad \text{Equation 3.5}$$

More recent developments have addressed more complex measurement models, such as the bifactor models (Gignac, 2013; Reise et al., 2012) described below. This is an under-investigated area in psychometrics, in which a general factor exists alongside sub-factors (Revelle & Zinbarg, 2008; Zinbarg, Revelle, Yovel, & Li, 2005).

3.5 Recent Developments

3.5.1 Omega Hierarchical and Omega Subscale for Bi-factor Models

Using the same Omega formula described above, Omega hierarchical (ωh) and Omega subscale (ωs) together estimate the degree of proficiency of a test measure in assessing the reliability of a hierarchical or bifactor model. Omega hierarchical (ωh) is applicable for assessing the reliability of only the general factor loadings of a bifactor model. More specifically, it is a measure of the variance in total scores that arises from the general factor running across all the items (Reise, Bonifay & Haviland, 2013).

The degree of reliability of the proposed subscale scores can then be evaluated after controlling for the variance generated by the general factor. This procedure creates reliability measures known as Omega subscale (ωs) for each subscale (Reise et al., 2012), using the same Omega formula for each subscale. Reise et al. (2012) advocate reporting these reliability indices for all subscales (see Figure 3.2 for an example). Reporting the Omega subscales is also very useful in bifactor models when the plausibility of the subscales is of special interest. Omega hierarchical and Omega subscale can be easily estimated using the R psych package (Revelle, 2013) and AMOS. In addition, calculating confidence intervals for the omega reliability coefficients yields more useful estimates.

A bifactor model with 5 subscales is illustrated in Figure 3.2. The extent to which multidimensionality affects both the general factor and subscale scores can be appraised more accurately when the corresponding ωh and ωs values are reported in the case of bifactor models (for more details on bifactor models, please see Chapter 4).

The formulae for Omega hierarchical (ωh) and Omega subscale (ωs) are provided below for k items (i = 1, 2, …, k) contributing to a general factor with loadings λgi and P subscales Sj (j = 1, 2, …, P), with loadings λSj,i for the items i = 1, 2, …, Sj belonging to subscale Sj.

$$\omega_h = \frac{\left(\sum_{i=1}^{k}\lambda_{gi}\right)^{2}}{\left(\sum_{i=1}^{k}\lambda_{gi}\right)^{2} + \left(\sum_{i=1}^{S_1}\lambda_{S_1 i}\right)^{2} + \left(\sum_{i=1}^{S_2}\lambda_{S_2 i}\right)^{2} + \cdots + \left(\sum_{i=1}^{S_P}\lambda_{S_P i}\right)^{2} + \sum_{i=1}^{k}\mathrm{Var}(e_i)} \qquad \text{Equation 3.6}$$

$$\omega_{s1} = \frac{\left(\sum_{i=1}^{S_1}\lambda_{S_1 i}\right)^{2}}{\left(\sum_{i=1}^{S_1}\lambda_{gi}\right)^{2} + \left(\sum_{i=1}^{S_1}\lambda_{S_1 i}\right)^{2} + \sum_{i=1}^{S_1}\mathrm{Var}(e_i)} \qquad \text{Equation 3.7}$$


$$\omega_{s2} = \frac{\left(\sum_{i=1}^{S_2}\lambda_{S_2 i}\right)^{2}}{\left(\sum_{i=1}^{S_2}\lambda_{gi}\right)^{2} + \left(\sum_{i=1}^{S_2}\lambda_{S_2 i}\right)^{2} + \sum_{i=1}^{S_2}\mathrm{Var}(e_i)} \qquad \text{Equation 3.8}$$

and so on, where the items i = 1, 2, …, S1 all belong to the S1 subscale, the items i = 1, 2, …, S2 all belong to the S2 subscale, etc. Combining these reliabilities, the total reliability of the P-factor measurement model is obtained using Omega total (ωt):

$$\omega_t = \frac{\left(\sum_{i=1}^{k}\lambda_{gi}\right)^{2} + \left(\sum_{i=1}^{S_1}\lambda_{S_1 i}\right)^{2} + \cdots + \left(\sum_{i=1}^{S_P}\lambda_{S_P i}\right)^{2}}{\left(\sum_{i=1}^{k}\lambda_{gi}\right)^{2} + \left(\sum_{i=1}^{S_1}\lambda_{S_1 i}\right)^{2} + \cdots + \left(\sum_{i=1}^{S_P}\lambda_{S_P i}\right)^{2} + \sum_{i=1}^{k}\mathrm{Var}(e_i)} \qquad \text{Equation 3.9}$$
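The omega coefficients above can likewise be computed from a fitted bifactor model's unstandardised estimates. The following Python/NumPy sketch implements Equations 3.6 to 3.9 with hypothetical loadings for a 6-item instrument with two subscales (two subscales are used only for brevity):

```python
import numpy as np

def _sq(x):
    """Square of the summed loadings: (sum lambda)^2."""
    return np.sum(x) ** 2

def omega_h(gen, subs, err):
    """Equation 3.6: gen = general-factor loadings (all k items),
    subs = list of group-factor loading arrays, err = residual variances."""
    denom = _sq(gen) + sum(_sq(s) for s in subs) + np.sum(err)
    return _sq(gen) / denom

def omega_sub(gen_j, sub_j, err_j):
    """Equations 3.7-3.8, using only the items of subscale j."""
    return _sq(sub_j) / (_sq(gen_j) + _sq(sub_j) + np.sum(err_j))

def omega_t(gen, subs, err):
    """Equation 3.9: total reliability of the P-factor model."""
    true = _sq(gen) + sum(_sq(s) for s in subs)
    return true / (true + np.sum(err))

# Hypothetical bifactor model: 6 items, 2 subscales of 3 items each
gen = np.array([0.6, 0.6, 0.5, 0.5, 0.4, 0.4])
s1, s2 = np.array([0.4, 0.3, 0.3]), np.array([0.5, 0.4, 0.3])
err = np.array([0.48, 0.55, 0.66, 0.50, 0.68, 0.75])

print(round(omega_h(gen, [s1, s2], err), 3))
print(round(omega_sub(gen[:3], s1, err[:3]), 3))
print(round(omega_t(gen, [s1, s2], err), 3))
```

Note that ωh is always less than or equal to ωt, since its numerator omits the variance carried by the group factors.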



Figure 3.2. Demonstrating Omega reliability coefficients for WOAQ

Note: ωh =Omega hierarchical; ωs = Omega subscale

3.5.2 Covariate-dependent and Covariate-free Reliability

Bentler’s approach to covariate-dependent and covariate-free reliability, based on coefficient rho, is the next new development to be discussed. To establish an acceptable level of reliability in measurement instruments, there should be positive intercorrelations among indicators, especially when these indicators are supposed to represent a single latent construct (Nunnally, 1978; Zeller & Carmines, 1980).

Reliability is usually quantified with a reflective measurement model, which implies that the latent factor generates the systematic responses to a set of items (Bollen & Lennox, 1991). In such a model, since error and specific scores are confounded, an increase in error variance or specific item variance implies a decrease in internal consistency (Green & Yang, 2009).

More recently, Bentler (2014) introduced the concepts of covariate-dependent and covariate-free reliability, which partition total reliability into a part based on external covariates and a part which is unaffected by such covariates. The following material on covariate-dependent reliability is adapted from either personal conversations with Bentler (2012, 2013) or Bentler (2014). Only the practical application of this concept is assessed in this study.

Suppose that the covariance matrix for a given set of p variables Xi can be modelled as Σ = Σc + Ψ, where Σc is the part of the covariance matrix that contains all influences of the latent common factors on the observed variables and Ψ is the covariance matrix of the error (unique or residual) variation. In a confirmatory factor model with Σ = ΛΦΛ′ + Ψ, Σc = ΛΦΛ′ represents the factor-implied covariances of the variables.

In this case the reliability coefficient rho, describing the internal consistency of the sum score $X = \sum_{i=1}^{p} X_i$, is calculated using a unit-weighting vector $\mathbf{1}$ as:

$$\rho_{XX} = 1 - \frac{\mathbf{1}'\Psi\mathbf{1}}{\mathbf{1}'\Sigma\mathbf{1}} \qquad \text{Equation 3.10}$$

where $\mathbf{1}'\Psi\mathbf{1}$ is the sum of the unique or error variances associated with the X-variables, and $\sigma_x^2 = \mathbf{1}'\Sigma\mathbf{1}$ is the sum of all the elements in the model-reproduced covariance matrix of the X-variables. Clearly, ρXX represents the proportion of construct-based variance relative to the total variance of the sum score (see Figure 3.3).

Now suppose a model contains latent variables that are influenced by a set of covariates that we may call Z-variables. For simplicity, we consider only models that have a single latent factor, say F. Then, assuming that the covariates Z predict F, i.e., that the model contains one or more ZF paths, the covariate-free rho is:

$$\rho_{XX}^{\perp Z} = \frac{\Delta\,(\mathbf{1}'\Lambda)^{2}}{\mathbf{1}'\Sigma\mathbf{1}} \qquad \text{Equation 3.11}$$

where Δ is the variance of the residuals in the regression of F on the Z-variables and $\mathbf{1}'\Lambda$ is the sum of the factor loadings in the unstandardised solution.

The covariate-dependent rho can then be defined as $\rho_{XX}^{(Z)} = \rho_{XX} - \rho_{XX}^{\perp Z}$. Equivalently, it can be calculated as:

$$\rho_{XX}^{(Z)} = \frac{(\gamma'\varphi\gamma)\,(\mathbf{1}'\Lambda)^{2}}{\mathbf{1}'\Sigma\mathbf{1}} \qquad \text{Equation 3.12}$$

where γ′ is the row vector of regression coefficients of F on the Z covariates and φ is the covariance matrix of the Z’s. In EQS, F may be called F1 and the residual in the regression of F on Z may be called D1. Then γ′φγ can be most simply computed as var(F1) − var(D1), where var(F1) comes from the model-reproduced covariance matrix and var(D1) is a parameter estimate obtained from the residual variance of the model (Bentler, 2006, 2014; personal communications, 2012, 2013).
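Under the single-factor set-up above, all three rho coefficients can be computed directly from the model parameters. The sketch below uses hypothetical values (the variable names are mine, not EQS notation) and confirms that Equations 3.11 and 3.12 sum to Equation 3.10:

```python
import numpy as np

# Hypothetical single-factor model: 4 indicators, 2 covariates Z
lam = np.array([0.8, 0.7, 0.6, 0.5])        # unstandardised loadings (Lambda)
psi = np.diag([0.36, 0.51, 0.64, 0.75])     # unique variances (Psi)
gamma = np.array([0.4, 0.3])                # regression of F on Z (gamma)
phi = np.array([[1.0, 0.2], [0.2, 1.0]])    # covariance of the Z's (phi)
delta = 0.5                                 # residual variance of F given Z

var_f = gamma @ phi @ gamma + delta         # var(F1) = gamma' phi gamma + var(D1)
sigma = var_f * np.outer(lam, lam) + psi    # model-implied Sigma = Lambda Phi Lambda' + Psi

total = sigma.sum()                         # 1' Sigma 1
rho = 1 - psi.sum() / total                 # Equation 3.10
rho_free = delta * lam.sum() ** 2 / total   # Equation 3.11 (covariate-free)
rho_dep = (gamma @ phi @ gamma) * lam.sum() ** 2 / total  # Equation 3.12

print(round(rho, 3), round(rho_free, 3), round(rho_dep, 3))
```

Because the factor variance is exactly γ′φγ + Δ in this model, the covariate-free and covariate-dependent parts add up to the total rho.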


3.5.2.1 Interpretation of Covariate-dependent Reliability. Assuming a group covariate, Bentler (2014; personal communications, 2012, 2013) defines the covariate- dependent reliability as “… a measure of the effect of group differences on the trait being measured relative to total variation, while the covariate-free reliability is a measure of the reliable individual difference variance freed from any mean differences due to the covariate(s)”.

Traditionally, reliability is viewed as a measure of stable individual difference variation (if the data come from individuals) relative to total variation. But in Bentler’s view (personal communications, 2013), “… any individual score might be influenced by other sources, including group and other individual differences”. For example, if nurses measure a wound size using a wound-measurement device, how much of the accuracy in the measurement is due to individual differences, and how much is due to factors such as level of experience or training? In this case accuracy/reliability in wound measurement is measured with many indicators, and the latent factor (the true score) is the trait of interest, with the researcher obtaining reliability measures in the usual way.

A path diagram of four indicators, V1-V4, as measures of a construct (F1), shown in Figure 3.3, can be used to illustrate the standard and covariate-dependent reliability measures discussed above. In the standard case, one may assume the unidimensional model described below.



Figure 3.3. A unidimensional construct with four indicators

The coefficients λ1 to λ4 are constants representing the strengths of the effect of F1 on the various indicators (observed variables V1–V4); typically these are called factor loadings. Denoting the factor F1 as F and the random measurement errors as E1 to E4, the diagram corresponds to the following measurement equations:

$$V_1 = \lambda_1 F + E_1$$
$$V_2 = \lambda_2 F + E_2$$
$$V_3 = \lambda_3 F + E_3$$
$$V_4 = \lambda_4 F + E_4 \qquad \text{Equation 3.13}$$

Standard internal consistency reliability coefficients attempt to provide the proportion of variance in the scale (sum) score V1+V2+V3+V4 that is due to F. There is no further variance partitioning.

Covariate-dependent reliability is illustrated in Figure 3.4, in which two covariates (V5 and V6) that predict the latent factor F are added to the model.


Figure 3.4. A covariate-dependent construct with four indicators and two covariates

By path tracing, one can determine from Figure 3.4 that the variance of F1 (F) can be partitioned into the variance due to the covariates V5 and V6, plus the variance due to the residual D1. The former yields the covariate-dependent factor variance, while the latter represents the part of the variance of F1 that is covariate-free. These variances are then used in the model-based reliability formulae given above for covariate-free and covariate-dependent reliability. However, coefficient alpha can also be partitioned in this way.

3.5.2.2 Covariate-dependent and Covariate-free Partition of Coefficient Alpha.

Coefficient alpha, previously given in Equation 3.1, represents an estimate of the reliability of $X = \sum_{i=1}^{p} X_i$. Partitioning coefficient alpha into a part due to covariates and a part unaffected by covariates requires another approach. As presented by Bentler (2014; personal communications, 2012 and 2013), the joint covariance matrix of the covariates (Zi) and the variables of interest (Xi) can be presented as:


$$\Sigma = \begin{pmatrix} \Sigma_{xx} & \Sigma_{xz} \\ \Sigma_{zx} & \Sigma_{zz} \end{pmatrix} \qquad \text{Equation 3.14}$$

where Σxx is the covariance matrix of the original p variables X, Σzz is the covariance matrix of the set of q covariates (Z-variables), and Σxz gives their joint covariances.

In order to calculate a covariate-dependent alpha coefficient, the computations essentially require regressing X on Z. It is well known in the regression literature that such a regression partitions the covariance matrix Σxx into two parts: the part $\Sigma_{xz}\Sigma_{zz}^{-1}\Sigma_{zx}$ predictable from Z and the part $\Sigma_{xx} - \Sigma_{xz}\Sigma_{zz}^{-1}\Sigma_{zx}$ not predictable from Z, that is,

$$\Sigma_{xx} = \left(\Sigma_{xx} - \Sigma_{xz}\Sigma_{zz}^{-1}\Sigma_{zx}\right) + \Sigma_{xz}\Sigma_{zz}^{-1}\Sigma_{zx} \qquad \text{Equation 3.15}$$

As a consequence, $\bar{\sigma}_{ij}$, the average covariance in Σxx, can also be partitioned as

$$\bar{\sigma}_{ij} = \bar{\sigma}_{ij}^{\perp Z} + \bar{\sigma}_{ij}^{(Z)} \qquad \text{Equation 3.16}$$

where $\bar{\sigma}_{ij}^{\perp Z}$ is the average off-diagonal element of the first right-hand term in Equation 3.15 and $\bar{\sigma}_{ij}^{(Z)}$ is the corresponding average for the second right-hand term. Substituting Equation 3.16 into the defining formula for alpha given in Equation 3.1, we have

$$\alpha = \frac{p^{2}\bar{\sigma}_{ij}}{\sigma_x^{2}} = \frac{p^{2}\left(\bar{\sigma}_{ij}^{\perp Z} + \bar{\sigma}_{ij}^{(Z)}\right)}{\sigma_x^{2}} = \frac{p^{2}\bar{\sigma}_{ij}^{\perp Z}}{\sigma_x^{2}} + \frac{p^{2}\bar{\sigma}_{ij}^{(Z)}}{\sigma_x^{2}} = \alpha^{\perp Z} + \alpha^{(Z)} \qquad \text{Equation 3.17}$$

Hence, coefficient alpha can be partitioned into two additive parts, where one part is free of the covariates and the other part is covariate-dependent. Two major applications of this procedure will be discussed in Chapters 9 to 11. The first concerns the effect of a covariate on the reliability of a scale. The second concerns applying this method to demonstrate the effect of Common Method Bias (CMB) on reliability.
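The partition in Equations 3.15 to 3.17 translates directly into code. The following Python/NumPy sketch (the function name and the simulated data are illustrative only) splits alpha into its covariate-free and covariate-dependent parts from sample covariance matrices:

```python
import numpy as np

def alpha_partition(sxx, sxz, szz):
    """Split coefficient alpha into covariate-free and covariate-dependent
    parts (Equations 3.15-3.17). sxx: p x p covariance of the X items;
    szz: q x q covariance of the covariates Z; sxz: p x q cross-covariances."""
    p = sxx.shape[0]
    pred = sxz @ np.linalg.solve(szz, sxz.T)   # part of Sigma_xx predictable from Z
    resid = sxx - pred                          # part not predictable from Z
    off = ~np.eye(p, dtype=bool)                # off-diagonal mask
    total_var = sxx.sum()                       # variance of the sum score
    a_free = p ** 2 * resid[off].mean() / total_var   # alpha^(perp Z)
    a_dep = p ** 2 * pred[off].mean() / total_var     # alpha^(Z)
    return a_free, a_dep

# Hypothetical simulated data: 4 items all influenced by one covariate
rng = np.random.default_rng(1)
z = rng.normal(size=(500, 1))
x = 0.5 * z + rng.normal(size=(500, 4))
a_free, a_dep = alpha_partition(np.cov(x, rowvar=False),
                                np.cov(x, z, rowvar=False)[:4, 4:],
                                np.cov(z, rowvar=False).reshape(1, 1))
print(round(a_free, 3), round(a_dep, 3))
```

Because the two covariance parts in Equation 3.15 sum exactly to Σxx, the two alpha parts sum exactly to the alpha of Equation 3.17.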

3.5.3 Composite Reliability using PLS

All of the above-mentioned model-based reliability assessments require the use of reflective measurement models and covariance-based SEM (CB-SEM). However, CB-SEM is not the only appropriate method for assessing model-based reliability. Partial Least Squares SEM (PLS-SEM) provides an alternative approach. CB-SEM uses Maximum Likelihood (ML) estimation, while PLS-SEM uses partial least squares estimation. PLS-SEM has fewer underlying restrictions than CB-SEM, which usually requires normally distributed data and large sample sizes. The composite reliability measure obtained using PLS-SEM will be fully explored in Study 3 (Chapters 12–14).

3.6 Summary

In this chapter the history of model-based reliability using SEM was critically explored. When Cronbach’s famous article on his coefficient alpha was published in Psychometrika in 1951, a single general coefficient for assessing internal consistency and reliability became available. Since then, the alpha coefficient has been widely used by researchers in many fields. However, it has recently been criticised by several researchers (Bentler, 2009; Green & Hershberger, 2000; Green & Yang, 2009; Sijtsma, 2009), resulting in recommendations for improvements. Although these recommendations may be useful, there are other methods (such as model-based reliability) that should also be considered.


Measures of model-based reliability calculated using SEM include one-factor model coefficients such as rho or ρ11 (Jöreskog, 1971), multi-factor model coefficients such as McDonald’s (1978) Omega (ω), and coefficients such as Omega hierarchical (ωh), Omega subscale (ωs) and Omega total (ωt) for bifactor models (Revelle et al., 2009). Finally, the covariate-free and covariate-dependent reliability coefficients of Bentler (2014) are recent practical methods developed to examine the effects of covariates on the internal consistency of scales using SEM.

Two major recent developments in model-based reliability measurement were discussed in more detail in this chapter: a) Omega hierarchical and Omega subscale, with a focus on their application in bifactor models, and b) the covariate-dependent and covariate-free reliability coefficients of Bentler (2014). A third development, the PLS-SEM procedure for computing composite reliability, will be introduced in a later chapter.

Unfortunately, software for calculating these new model-based reliabilities has not routinely been available to scholars, and despite the importance of multidimensional model-based reliability measurement, there is a lack of empirical studies in which these coefficients are estimated. Either scholars in the disciplines do not recognise the importance of model-based reliability coefficients based on latent constructs over the classical alpha coefficient, or the appropriate statistical software is still not readily available. For example, except for EQS (Bentler, 2006), which calculates the model-based reliability rho, and the R psych package (Revelle, 2013), which computes omega hierarchical and omega subscales, most packaged software (e.g., SPSS) only provides the classical alpha coefficient.


Model-based reliability estimation provides a more accurate representation of the true relative magnitude of systematic variance to total variance in a scale or instrument. Therefore, once an SEM model fits its proposed constructs and measured variables well, a more accurate representation of its reliability can be obtained using model-based reliability measures.

The following chapters present the application of these recent developments in practice, with a special focus on bifactor and reflective-formative models. Chapters 4–6 lay the theoretical groundwork for these applications.


4

THE VALIDITY OF BIFACTOR VERSUS HIGHER-ORDER MEASUREMENT MODELS

This chapter opens with an introduction to bifactor models. The chapter then considers the application of the bifactor model in an organisational study using the Work Organisation Assessment Questionnaire (WOAQ).

Bifactor Models

Constructs are often operationalised as multidimensional units (Diamantopoulos, 2010; Edwards & Bagozzi, 2000). When a number of dimensions or related attributes form a latent factor, it is considered multi-dimensional. In a multi-dimensional construct, dimensions can be conceptualised under an overall concept or a second-order (higher-order) construct (Law, Wong, & Mobley, 1998). In second-order constructs, two levels of constructs exist: the first-order level constructs composed of indicators and the second-order level constructs composed of first-order constructs (Jarvis et al., 2003). Such models are known as hierarchical (higher-order) models.

The majority of researchers in the behavioural sciences use higher-order modeling by default to evaluate multidimensionality. However, this is not the only procedure for evaluating multidimensionality, and it may not always be the best way to evaluate a multidimensional model. The use of other approaches, such as bifactor (direct hierarchical) modeling, is not commonly found in the literature (Gignac, 2007; Reise, Moore, & Haviland, 2010). In a bifactor model, all latent variables are modelled as first-order constructs, in which first-order factors are nested within a general factor (Gignac, 2007; Gustafsson & Balke, 1993; Holzinger & Swineford, 1937).


Perhaps the early roots of the development of bifactor (nested factors) models can be traced back to the work of Holzinger and Swineford (1937) (for a full history of SEM development, see Karimi & Meyer, 2014). However, the bifactor approach is not well appreciated in the literature, although there are some important advantages in using this model as an alternative to conventional higher-order modeling (Gignac, 2007, 2013).

As discussed by Gignac (2007), “the advantages are that bifactor (nested factors) models: (a) tend to be associated with non-negligibly higher level of model fit; (b) allow for statistical significance testing for all parameter estimates, and (c) allow for less ambiguous interpretations of the factor loadings and the narrow factors ‘nested’ within the higher-order factor(s)” (p. 40). As asserted, one can achieve a better model fit using bifactor modeling.

Imposing fewer restrictions on parameter estimates (as opposed to the number of restrictions required in conventional CFA procedures) improves the validity of the reported results in bifactor models. In addition, the bifactor model provides some evidence regarding the plausibility of the subfactors and the extent of their contribution in a practical sense.

However, the bifactor procedure is not without disadvantages. One of the main limitations of nested factor models is that they have fewer degrees of freedom, which may lead to model identification problems (Gignac, 2007). However, this problem can be managed simply by constraining some of the parameters in the model (Gignac, 2007, 2013). In this study all the latent variable variances are constrained to 1.0 in order to achieve an identified model.

It is evident that, if the aim of the proposed model is to present both multidimensionality and a general single factor at the same time, then a bifactor model is an appropriate procedure (Reise et al., 2010). Using a bifactor model not only demonstrates the contribution of the items to a general factor (broad construct) but also provides information on the item contributions to sub-dimensions (narrow constructs) (Reise et al., 2010).

4.1 Bifactor Model of WOAQ

The Work Organisation Assessment Questionnaire (WOAQ) was developed as part of a risk assessment procedure for stress-related exposures inherent in the manufacturing sector. For a widely-used measure like the WOAQ, using a bifactor model is deemed to be appropriate for several reasons.

First, having a broad or macro-level assessment (using a general factor) helps to form an overall picture of the organisation. Conversely, being able to assess the organisation at a narrow or micro level (using subfactors) has practical implications, in that specific problematic areas can be identified and addressed. Evaluating the plausibility of subfactors is very important in such contexts, making a direct hierarchical model for the WOAQ a good choice.

Second, as highlighted in recent studies (e.g. Wynne-Jones, Varnava, Buck, Karanika-Murray, Griffiths, Phillips, & Main, 2009), the latent structure of the WOAQ did not demonstrate a good fit in non-manufacturing sectors, suggesting that conventional models are inadequate. Model fit is often a problem for the WOAQ when conventional second-order models are considered. In the context of risk assessment in organisations, a tool like the WOAQ presents the overall work condition as a general single factor. Additionally, it adds further benefit by highlighting the different subsections of work organisation characteristics. Thus, evaluating the plausibility of subfactors is very important in such contexts, suggesting that a direct hierarchical model for the WOAQ would be an appropriate choice.

One of the aims of study 1, therefore, is to compare a bifactor (nested factor) model with a conventional second-order (higher-order) model of WOAQ. This is done in a health setting. This study is expected to open up some empirical and methodological avenues for further developments in this area.

A higher-order model (or full mediation model) and a bifactor model (partial mediation model) of the WOAQ can be distinguished statistically (Gignac, 2007, 2008, 2013; Yung, Thissen, & McLeod, 1999) and diagrammatically, as illustrated below.


Model 1: Higher-order model of WOAQ

Model 2: Bifactor model of WOAQ

Figure 4.1. Higher-order vs. bifactor model of WOAQ


4.2 Summary

The distinction between a bifactor and a higher-order measurement model was the focus of this chapter. It is evident that a bifactor model has superiority over a higher-order model when the aim of validating a measurement model is to present not only the multidimensionality and plausibility of the subfactors but also the underlying general factor of the scale on its own. A comprehensive measure of the WOAQ, using a bifactor model, offers multiple benefits. Firstly, it demonstrates the contribution of the items to a general factor of the WOAQ. Secondly, it provides information on the item contributions to subscales and indicates the relative importance of the subscales. This procedure has practical implications in organisational studies as it provides researchers and practitioners with both a broad and a detailed picture of the WOAQ in a given setting. The general factor of the WOAQ highlights whether any problems exist within the organisation; if so, the subscales of the WOAQ highlight the more critical points that need attention.

In Chapters 6 to 8, a bifactor model of WOAQ will be validated and cross validated across gender in a nursing and paramedics setting. The results will be compared with a higher-order model.


5

THE VALIDITY OF FORMATIVE MEASUREMENT MODELS VERSUS REFLECTIVE MODELS

By default, many researchers use reflective models, usually without precise evaluation of the model (Diamantopoulos & Winklhofer, 2001). The ensuing model misspecification may cause two types of error (Type I and Type II). Recently, researchers in information systems (IS), leadership, management and marketing have highlighted problems of misspecification in measurement model construction (Diamantopoulos & Winklhofer, 2001; Jarvis, MacKenzie & Podsakoff, 2003; Podsakoff, Shen, & Podsakoff, 2006).

As a result of this type of misspecification, some of the findings in the literature may be misleading (Jarvis et al., 2003; MacKenzie et al., 2005; Petter, Straub, & Rai, 2007). As mentioned by Jarvis et al. (2003), construct misspecification issues can lead to “serious consequences for the theoretical conclusions drawn from the model” (p. 212). The extent of this misspecification problem has been studied in several areas but never in organisational psychology (Diamantopoulos & Winklhofer, 2001; Jarvis et al., 2003; Petter et al., 2007).

Therefore the four key aims of this chapter are to:

a) distinguish between reflective and formative SEM models;

b) review some of the literature to identify the extent of possible SEM measurement model misspecification problems in the area of organisational psychology;

c) present an empirical example of misspecification using the Work Ability Scale (WAS); and

d) propose a framework for distinguishing formative from reflective models.

It is hoped that the findings will assist researchers to distinguish which types of measurement models to use for their research. This chapter is organised into four sections based on the above aims.

5.1 Differences between Formative and Reflective Models

In 1973, the Swedish statistician Karl Jöreskog combined the factor analytic work of two psychometricians, Charles Spearman and Louis Thurstone, with the path analysis work of the geneticist Sewall Wright to develop what is now known as SEM (Cunningham, 2008). For nearly half a century, the SEM technique and the computer program LISREL, which resulted from Jöreskog's work, have aided the mapping of interrelated constructs in broad areas of study. More programs have since been developed (AMOS, EQS, Mplus), extending the scope and simplifying the application of this technique.

SEM distinguishes between two different measurement models: reflective and formative. When indicators are affected by a latent variable, reflective models are appropriate. Yet in many settings, indicators may be considered as the cause of latent variables, making formative models more appropriate.

By default, most researchers assume that models are reflective, although many scholars (e.g. Blalock, 1971; Bollen, 1984; Diamantopoulos and Winklhofer, 2001; Jarvis et al., 2003; Petter et al., 2007) have alerted researchers to the relevance of formative models in some specific situations. This advice is unfortunately ignored in much of the research literature.


The following section compares reflective and formative models conceptually.

5.1.1 First-order Reflective and Formative Models

In classical test theory, indicators (items) are considered to be dependent on a latent variable, in which case:

$$x_i = \lambda_i \xi + \delta_i \qquad \text{Equation 5.1}$$

where $x_i$ (the ith indicator) is defined by the latent variable ξ, the measurement error $\delta_i$, and the expected coefficient $\lambda_i$.

Such measures can be called reflective, in that the items are indicators of a latent factor (Fornell & Bookstein, 1982). Such models provide the trigger for reliability evaluation and common/confirmatory factor analysis (Bollen, 1989; Long, 1983; Nunnally, 1978). A simple first-order reflective measurement model is represented in Figure 5.1, in which the latent variable ξ is conceptualised as the common cause of three items or indicators, identified as x₁, x₂, and x₃.


$$x_1 = \lambda_1 \xi + \delta_1$$
$$x_2 = \lambda_2 \xi + \delta_2$$
$$x_3 = \lambda_3 \xi + \delta_3$$

Figure 5.1. First-order reflective model

Conversely, based on the nature of the model, the indicators might cause the construct (Bollen & Lennox, 1991). When the construct is formed from its indicators, a formative model is suggested (Fornell & Bookstein, 1982). Equation 5.2 presents an example of a formative model in which a weighted sum of the indicators, $\sum_i \lambda_i x_i$, represents the construct ξ with an error ζ:

$$\xi = \sum_i \lambda_i x_i + \zeta \qquad \text{Equation 5.2}$$

Figure 5.2 presents an example of a first-order formative model in which the causal action flows from the indicators (x₁, x₂, x₃) to the composite variable ξ.


$$\xi = \lambda_1 x_1 + \lambda_2 x_2 + \lambda_3 x_3 + \zeta \qquad \text{Equation 5.3}$$

Figure 5.2. First-order formative model
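The conceptual difference between Equations 5.1 and 5.2 can be seen in simulated data: reflective indicators share a common cause and so must intercorrelate, whereas formative indicators form the construct and need not correlate at all. A hypothetical Python/NumPy sketch (all values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

# Reflective (Equation 5.1): the latent variable xi generates each indicator
xi = rng.normal(size=n)
x_refl = np.column_stack([l * xi + rng.normal(scale=0.6, size=n)
                          for l in (0.8, 0.7, 0.6)])

# Formative (Equation 5.2): independent indicators jointly form the construct
x_form = rng.normal(size=(n, 3))
xi_form = x_form @ np.array([0.5, 0.3, 0.2]) + 0.3 * rng.normal(size=n)

r_refl = np.corrcoef(x_refl, rowvar=False)
r_form = np.corrcoef(x_form, rowvar=False)
print(r_refl.round(2))  # substantial off-diagonal correlations
print(r_form.round(2))  # off-diagonal correlations near zero
```

This is why internal consistency indices such as alpha or rho are meaningful for reflective scales but not for formative ones.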

5.1.2 Higher-order Reflective and Formative Models

The reflective and formative models specified in Equations 5.1 and 5.2 are examples of first-order reflective and formative measurement models. However, constructs are often operationalised as multidimensional units (Diamantopoulos, 2010; Edwards & Bagozzi, 2000). When a number of dimensions or related first-order constructs form a latent factor, it is considered a multi-dimensional construct. In a multi-dimensional construct, dimensions can be conceptualised under an overall concept or a second-order construct (Law, Wong, & Mobley, 1998), or both, as was seen in the bifactor model in the last chapter.

In second-order constructs, two levels of constructs exist: the first-order level with indicators and the second-order level with first-order constructs (Jarvis et al., 2003). As illustrated in Figure 5.3, a reflective-reflective higher-order model has a reflective model for each of the first-order constructs as well as a reflective model for the second-order construct (η):

$$\xi_i = \gamma_i \eta + r_i \qquad \text{Equation 5.4}$$

where the construct η is conceptualised as a second-order latent variable upon which the first-order latent constructs $\xi_i$ are dependent, with measurement error $r_i$ for each of these first-order constructs and expected coefficients $\gamma_i$.

Figure 5.3. Higher-order reflective-reflective measurement model

Higher-order effects between constructs can also be incorporated as a formative model in which:

$$\eta = \sum_i \gamma_i \xi_i + \zeta$$

A higher-order formative-formative model is presented in Figure 5.4 as an example.

In this model each first-order construct is represented as a formative model, while the second-order construct (η) is also represented as a formative construct.


Figure 5.4. Higher-order formative-formative measurement model


5.2 Applications of Formative Models

The most common uses of formative models include:

- Creating an induced latent variable

- Creating a block variable

- Illustrating the influence of an experimental intervention on a construct (Edwards & Bagozzi, 2000).

Creating an induced latent variable is one of the common uses of formative models. Examples of induced latent variables are presented by Crossley, Bennett, Jex and Burnfield (2007) in their study concerning the creation of an index for job embeddedness. Job embeddedness represents “a broad array of influences on employee retention. The critical aspects of job embeddedness are (a) the extent to which the job and community are similar to, or fit with, the other aspects in a person's life space, (b) the extent to which this person has links to other people or activities and, (c) what the person would sacrifice if he or she left”. These aspects are important both on the job and off the job (Holtom, Mitchell, & Lee, 2006, p. 320). Composite job embeddedness in this study is operationalised by three main measures: organisation and community fit (“an employee's perceived compatibility or comfort with an organisation and with his or her environment”, p. 320); links (“formal or informal connections between an employee and institutions or people”, p. 320); and sacrifice (“the perceived cost of material or psychological benefits that are forfeited by organizational departure”, p. 320). Each measure represents a different aspect of job embeddedness. Together these constructs define job embeddedness, allowing the construction of a job-embeddedness index.


Some other examples of induced latent variables are social support indices, which include items that capture different aspects of social support (MacCallum & Browne, 1993), and a socioeconomic status (SES) index created as a function of education, income and job status (Bollen & Lennox, 1991). In this instance, the combination of three diverse variables (income, education, occupation) allows the construction of an SES index.
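A minimal sketch of how such a formative SES index might be composed. The data and the equal weighting below are purely illustrative assumptions; in practice the indicator weights would be estimated (e.g. by PLS-SEM) rather than fixed.

```python
import numpy as np

def formative_index(indicators, weights=None):
    """Build a formative composite (e.g. an SES index) as a weighted sum
    of standardised indicators. Equal weights are used by default purely
    for illustration; in practice the weights would be estimated."""
    X = np.asarray(indicators, dtype=float)
    # Standardise each indicator (column) so the differing scales are comparable.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    if weights is None:
        weights = np.full(X.shape[1], 1.0 / X.shape[1])
    return Z @ np.asarray(weights, dtype=float)

# Hypothetical respondents: columns = income, years of education, job status.
data = [[52000, 12, 3],
        [87000, 16, 5],
        [34000, 10, 2],
        [61000, 14, 4]]
ses = formative_index(data)  # one SES score per respondent
```

Because the indicators cause the composite rather than reflect it, nothing in this construction requires the three columns to be correlated, which is exactly the formative logic described above.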

First-order formative models can also be used for creating block variables. A block variable is a single construct which summarises the influence of a block of several variables on one or more outcome variables (Edwards & Bagozzi, 2000). In such cases, the variables which constitute the block variable usually represent distinct causes of the outcome. This type of formative model was well illustrated by Howell, Breivik, & Wilcox (2007), using the study of family socialisation by Heise (1972). A block variable called “family socialisation” was introduced by Heise (1972), which was a construct formed by the mother's and father's liberalism, and other unspecified (disturbance) variables (Edwards & Bagozzi, 2000).

Finally, another common application of formative modeling can be seen in studies which involve intervention and the assessment of intervention effects on a construct

(Bagozzi, 1977; Costner, 1971). For instance, in an experimental study (Costner, 1971), a fatigue construct was manipulated by depriving participants of sleep (indicator). In such experimental studies that involve intervention, the measures can be considered as formative constructs (Edwards & Bagozzi, 2000; Bagozzi, 1977; Costner, 1971).


5.3 Developing a Framework for Distinguishing Reflective-Formative Models

In this section an attempt is made to develop a clear and well-defined decision-making framework for assessing whether a reflective or a formative model is appropriate.

Unfortunately, there are few practical guidelines for distinguishing reflective and formative models. The major work in this area was introduced by Jarvis et al (2003) and Diamantopoulos and Winklhofer (2001), and then extended by Petter, Straub, and Rai (2007) and Coltman, Devinney, Midgley, and Venaik (2008).

What is presented here is a practical decision-making tree for evaluating reflective and formative models of measurement, based mainly on a review of the works of Jarvis et al (2003), Petter, Straub, and Rai (2007) and Diamantopoulos and Winklhofer (2001).

The background theory. The first step in identifying formative vs. reflective models is to refer to the relevant background theory, to determine whether a construct is typically viewed as a formative or reflective construct. This is usually considered to be the best way of distinguishing between formative and reflective models. If there is doubt in the literature or there are no solid theoretical frameworks available, then the following criteria might help researchers in distinguishing between formative and reflective models. These criteria are based mainly on the guidelines proposed by Jarvis, MacKenzie and Podsakoff (2003).

Direction of causality. The next step involves consideration of the direction of causality between each construct and its indicators. As suggested by Jarvis et al (2003), the researchers need to know, in the first instance:

1) Whether the items explain the latent factor, or the latent factor represents the indicators. In formative models, the indicators influence the latent factor or “composite” variable (MacKenzie et al., 2005). But if the indicator items simply manifest or represent the latent factor, a reflective model is suggested.

2) The nature of changes in the latent factor. In formative models, the measurement error is at the factor level; the latent factor is partially explained by random error and is not fully explainable by its items. Any change in an item would lead to a change in the latent factor, but not vice versa. The opposite is true in reflective models; the measurement errors are at the item level, therefore any change in an indicator does not necessarily result in a change in the latent factor. However, any change in the latent factor would result in a change in the items (Jarvis et al, 2003; Petter, Straub, & Rai, 2007).
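The two causal directions imply different correlation structures, which a small simulation can make concrete. The sketch below uses arbitrary, illustrative loadings and weights (not taken from any real scale): reflective indicators are generated from one common latent factor with item-level errors, while formative indicators are independent causes of a composite with a construct-level disturbance.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20000

# Reflective: the latent factor eta causes each indicator,
# with measurement error at the item level: x_j = lambda_j * eta + eps_j.
eta = rng.normal(size=n)
loadings = [0.8, 0.7, 0.9]          # illustrative values
X_ref = np.column_stack([lam * eta + rng.normal(scale=0.4, size=n)
                         for lam in loadings])

# Formative: independent indicators form the construct, with the
# disturbance at the construct level: eta = sum_j gamma_j * x_j + zeta.
X_form = rng.normal(size=(n, 3))
gammas = np.array([0.5, 0.3, 0.4])  # illustrative weights
eta_form = X_form @ gammas + rng.normal(scale=0.3, size=n)

r_ref = np.corrcoef(X_ref, rowvar=False)    # strong off-diagonal correlations
r_form = np.corrcoef(X_form, rowvar=False)  # near-zero off-diagonal correlations
```

With these values the reflective indicators correlate at roughly 0.8, while the formative indicators are essentially uncorrelated, matching the criteria above.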

The interchangeability of the measures. The third step involves examining the interchangeability of the measures (Jarvis et al, 2003):

1) The similarity of the contents of the indicators. In reflective models, measures are interchangeable and follow a common theme. Different themes across indicators suggest formative measures, which are not interchangeable.

2) Changes in the indicators. In formative measures, the latent factor is explained by its items; removing any item of a formative factor would influence the meaning of the latent or composite factor. In reflective models, however, removing an indicator would not affect the meaning of the latent factor, because the indicators are outcomes of the construct and not its cause (Jarvis et al, 2003; Petter, Straub, & Rai, 2007).


Co-variation among measures. The fourth step involves consideration of the correlations among indicators; in other words, whether variation in one indicator is correlated with variation in the other indicators (Jarvis et al, 2003). In formative models, because a construct is formed by different indicators, high correlations between the indicators are not expected; the indicators in such models might represent totally different content. With reflective models, however, because the indicators all reflect the latent factor, high correlations between indicators are required. This implies multicollinearity, which is desirable for reflective measures. That is why establishing an acceptable level of internal consistency is required for reflective models, while it is not really appropriate for formative models.
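Checking this co-variation criterion in practice often amounts to inspecting inter-item correlations or variance inflation factors (VIFs). The sketch below computes VIFs with plain numpy regressions; high VIFs are consistent with reflective measurement, whereas for formative indicators they signal estimation problems rather than good internal consistency. The data in the usage note are hypothetical.

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each indicator: VIF_j = 1 / (1 - R^2_j),
    where R^2_j comes from regressing indicator j on the remaining indicators."""
    X = np.asarray(X, dtype=float)
    out = []
    for j in range(X.shape[1]):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        # Ordinary least squares with an intercept column.
        A = np.column_stack([np.ones(len(y)), others])
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        r2 = 1.0 - ((y - A @ beta) ** 2).sum() / ((y - y.mean()) ** 2).sum()
        out.append(1.0 / (1.0 - r2))
    return np.array(out)
```

For three mutually independent indicators every VIF sits near 1 (no shared variance, a formative-style pattern), while for indicators driven by a common factor the VIFs rise well above 1.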

Nomological net of the latent factor indicators. The final decision rule is based on the following criterion for reflective models: the indicators should share the same antecedents and consequences.

With formative constructs, it is not expected that the observed variables have similar predictors or outcomes. This is because the composite factors are formed by indicators that are not necessarily correlated nor do they necessarily share the same content. Conversely, with reflective models, due to the interchangeability of reflective indicators, the same patterns of antecedents and consequences are expected for all indicators (Jarvis et al., 2003; Petter, Straub, & Rai, 2007). Depending on the extent to which this criterion is met, the researcher will be able to decide if it is a reflective or formative construct.

A summary of the above is presented in Figure 5.5 in the form of a decision tree.

These decision rules will be used in an examination of the organisational psychology literature in the next section. However, although using this guideline helps to identify a reflective or formative construct, in practice many constructs are mixed. In other words, a construct may have some items consistent with formative constructs and other items consistent with reflective constructs.


[Decision tree: four sequential criteria (direction of causality, the interchangeability of the indicators, co-variation among measures, and the nomological net of the factor), with the answer at each step pointing towards either a reflective model or a formative model.]

Figure 5.5. The developed framework for assessing the formative vs. reflective measurement models. Acknowledgement: The main contents of this framework are built based on the guidelines proposed by Jarvis et al. (2003), Journal of Consumer Research, 30.
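Purely as an illustration, the sequence of questions in this framework can be written as a small function. The argument names, the vote-counting logic and the thresholds below are simplifications invented for this sketch; the published guidelines call for theory-driven judgement at each step rather than mechanical tallying.

```python
def classify_measurement_model(items_cause_construct,
                               indicators_interchangeable,
                               indicators_highly_correlated,
                               same_antecedents_and_outcomes):
    """Tally the four yes/no judgements from the decision tree.
    Mostly formative answers -> formative; mostly reflective -> reflective;
    a split verdict -> a mixed model."""
    formative_votes = sum([items_cause_construct,
                           not indicators_interchangeable,
                           not indicators_highly_correlated,
                           not same_antecedents_and_outcomes])
    if formative_votes >= 3:
        return "formative"
    if formative_votes <= 1:
        return "reflective"
    return "mixed"

# An SES-style construct: items cause the construct, are not interchangeable,
# need not correlate, and have different antecedents and outcomes.
verdict = classify_measurement_model(True, False, False, False)  # "formative"
```

The "mixed" branch mirrors the observation above that, in practice, many constructs carry both formative and reflective items.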

5.4 Measurement Model Misspecification in Organisational Psychology Literature

5.4.1 Empirical Evidence on Measurement Model Misspecification

In recent times, some researchers have focussed on misspecification in measurement models. One of the earliest studies which highlighted misspecification in formative constructs is that of Jarvis et al. (2003). In this study, the extent of misspecification was assessed by reviewing four marketing journals (Journal of Marketing,

Journal of Marketing Research, Marketing Science and Journal of Consumer Research). A

29 per cent model misspecification rate was reported.

Fassott followed in 2006 (cited in Diamantopoulos, Riefler, & Roth, 2008), reviewing three German management journals (Zeitschrift für Betriebswirtschaft, Zeitschrift für betriebswirtschaftliche Forschung and Die Betriebswirtschaft). A 35 per cent misspecification rate was reported.

In a similar process, Podsakoff et al. (2006) reported a misspecification rate of 62 per cent after reviewing the three most important strategic management journals (Academy of Management Journal, Administrative Science Quarterly and Strategic Management

Journal). Similar results were also reported for leadership research (47 per cent misspecification) by Podsakoff, MacKenzie, Podsakoff and Lee (2003) based on articles published in The Leadership Quarterly, Journal of Applied Psychology, and Academy of

Management Journal.

Petter et al. (2007) examined complete volumes of MIS Quarterly and Information

Systems Research over three years. The study reported a 30 per cent misspecification for formative constructs.

In more recent times, Roy et al. (2012) reviewed four journals in the area of production, manufacturing and operations management (Journal of Management Science,

Journal of Operations Management, Decision Sciences Journal, and Journal of Production and Operations Management Society) published between 2002 and 2006. They reported a misspecification rate of 42.5 per cent.

In summary, the existing studies show a significant degree of misspecification in the disciplines of information systems (IS), leadership, management and marketing. The question is: “To what extent does misspecification exist in other disciplines such as psychology?” To the researcher’s knowledge, no such study has been conducted in the area of organisational psychology, hence the need for this study. The hypothesis of this study is that:

Hypothesis 5.1: There is some degree of misspecification in formative vs reflective

measurement models in the organisational psychology literature.

5.4.2 Literature review strategy

Initially, the methodology for identifying misspecification will be discussed based on the recent organisational psychology literature. To assess the prevalence of measurement model misspecification, articles published within a nine-year period between 2006 and

2014 in two high profile journals - Journal of Applied Psychology and Personnel

Psychology - were reviewed. While this is not a broad review of the literature, it is reasonable to assume that a problem exists if construct misspecifications are found in these


most cited journals in the discipline. As a result, there is a likelihood of Type I or II errors in reported results in misspecified models. If a problem of misspecification exists, then there is a need to take action and pay more attention to this neglected area of study.

The following inclusion criteria were followed in this review:

- Papers with measurement models
- Constructs measured by two or more items.

The exclusion criteria were as follows:

- Papers consisting only of single-item measures
- Papers that did not report their measurement items.

Based on these criteria, a total of 301 studies were considered in the analysis (See

Appendix H). The measurement items for each construct were examined by two researchers independently, using the decision making framework provided in Figure 5.5. If both researchers agreed that at least one construct was misspecified (e.g. modelled as reflective while it should be formative or vice versa), the study was coded as misspecified, on the grounds that misspecification for any construct could lead to error.

5.4.3 Inter-rater Reliability

IBM SPSS Statistics (SPSS) for MS Windows Release 21.0 (SPSS Inc., Chicago,

IL) was used to analyse the data. Cohen's Kappa was used to measure the inter-rater reliability of the decisions of the two researchers (the student and her principal supervisor), both experts in SEM and organisational psychology studies. Using only two researchers to rate the measures is considered to be one of the limitations of this review. All the papers were examined and the appropriateness of formative and reflective models was judged by


both raters in each case. The Cohen's Kappa statistic measures the level of agreement between raters, with a value higher than 0.70 indicating good agreement.
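Cohen's Kappa is computed from the two raters' classifications as κ = (p_o − p_e) / (1 − p_e), where p_o is the observed agreement and p_e the agreement expected by chance from the raters' marginal frequencies. A small sketch with hypothetical ratings (not the actual review data):

```python
from collections import Counter

def cohens_kappa(rater1, rater2):
    """Cohen's kappa for two raters classifying the same items:
    kappa = (p_o - p_e) / (1 - p_e), with p_e derived from each
    rater's marginal category frequencies."""
    n = len(rater1)
    p_o = sum(a == b for a, b in zip(rater1, rater2)) / n
    c1, c2 = Counter(rater1), Counter(rater2)
    p_e = sum(c1[cat] * c2.get(cat, 0) for cat in c1) / n ** 2
    return (p_o - p_e) / (1 - p_e)

# Hypothetical ratings (R = reflective, F = formative) for six papers.
rater_a = ["R", "R", "R", "R", "F", "F"]
rater_b = ["R", "R", "R", "F", "F", "F"]
kappa = cohens_kappa(rater_a, rater_b)  # 2/3, i.e. about 0.67
```

Note that kappa discounts chance agreement, which is why it is preferred over raw per cent agreement when most constructs fall into one category, as in this review.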

5.4.4 Results of the Review

A high level of agreement was obtained between the raters (Cohen’s Kappa=0.89), suggesting that the classification of the articles based on the guidelines provided in Figure

5.5 was reliable. The findings of this review are summarised in Table 5.1.

Table 5.1 Measurement Model Classification

                         Should be     Should be    Should be
                         Reflective    Formative    Mixed       Total
Modeled as Reflective    215           39           16          270 (90%)
Modeled as Formative     0             21           0           21 (7%)
Modeled as Mixed         0             0            10          10 (3%)
Total                    215 (71%)     60 (20%)     26 (9%)     301 (100%)

* A total of 301 studies from articles published in the Journal of Applied Psychology and Personnel Psychology between 2006 and 2014 were reviewed.

A misspecification level of 18 per cent (55/301) was found in this review. The misspecification involved misspecifying a formative model as reflective or a mixed model as a fully reflective model. Not surprisingly, the majority of the studies (90%) by default considered measurement models as reflective. Unfortunately, there is no similar misspecification study in this area to allow a comparison; however, higher percentages have been found in other disciplines, as explained previously. As mentioned previously, the results of such misspecification in measurement models can lead to Type I or II errors.
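The headline figures can be reproduced with a few lines of arithmetic. The sketch below hard-codes the cross-tabulation from Table 5.1 and tallies every off-diagonal cell as misspecified:

```python
# Cross-tabulation from Table 5.1: rows = how the construct was modelled,
# columns = how the two raters judged it should be modelled.
table = {
    "reflective": {"reflective": 215, "formative": 39, "mixed": 16},
    "formative":  {"reflective": 0,   "formative": 21, "mixed": 0},
    "mixed":      {"reflective": 0,   "formative": 0,  "mixed": 10},
}

total = sum(sum(row.values()) for row in table.values())
misspecified = sum(count
                   for modelled, row in table.items()
                   for should_be, count in row.items()
                   if modelled != should_be)
rate = misspecified / total  # 55/301, about 18 per cent
```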

5.4.5 Discussion

Measurement model misspecification is a critical issue. As mentioned before, the majority of scholars by default consider measurement models to be reflective, which leads to misspecification. As indicated in previous studies (Jarvis et al., 2003; Petter et al., 2007; Roy et al., 2012), model misspecification can bias the parameter estimation, leading to Type I and II errors and incorrect conclusions. Although a higher degree of misspecification has been reported in other disciplines (e.g. Jarvis et al., 2003; Podsakoff et al., 2006; Petter, Straub and Rai, 2007; Roy et al., 2012), the finding of an 18 per cent misspecification rate in two prestigious organisational psychology journals is nevertheless significant. If such a high percentage of misspecification is found in top-ranked journals, significantly higher misspecification rates can be expected in journals with less influence.

Given the reported problem of misspecification in the field, greater attention to measurement model specification is imperative. A lack of awareness about the nature of formative constructs could be one of the reasons for misspecification. As demonstrated by previous studies (e.g. Jarvis et al., 2003; Petter et al., 2007) and as shown in Table 5.1, in all the misspecified studies, researchers had miscategorised formative constructs as reflective rather than the reverse.

What is needed is a simple but comprehensive framework to distinguish formative and reflective measures. Also, it is important to ask why formative models are frequently


misspecified as reflective models. The problems that occur with the fitting of formative models are partly to blame. Overall it is easier to fit reflective models. This topic is discussed later using empirical examples in the context of a work ability measurement model.

The review, however, is not without limitations. One of the main limitations is the use of only two researchers to rate the measurement models. Reviewing only two journals is another limitation, which restricts the generalizability of the results.

5.5 Summary and Conclusion

In this chapter an informative introduction was provided to distinguish formative from reflective measurement models, and a simple, easy-to-understand framework for doing so was proposed. The misspecification of formative vs reflective models, along with the possible consequences of misspecification, was discussed. It was then demonstrated how large the problem of formative model misspecification is in the organisational psychology discipline. Using a comprehensive literature review of misspecification over a nine-year period in two highly ranked journals in the discipline, the misspecification rate was demonstrated for the first time in this discipline.

In Study 3, an example of misspecification involving the measurement of work ability using the WAS measure is presented. In this study it will be empirically demonstrated how different model specifications/misspecifications can yield different results for a measurement model. The initial second-order WAS model will be re-examined using


reflective-reflective, formative-formative and reflective-formative models. Based on the guidelines provided in Figure 5.5 and the theoretical background, the model should be fitted as reflective-formative (reflective for the first-order constructs and formative for the second-order construct). In Chapters 12 to 14, therefore, the validity and reliability assessments of the correctly specified reflective-formative model of WAS are conducted using Partial

Least Squares SEM. The results will be compared and discussed against those obtained from the misspecified reflective-reflective and formative-formative models of WAS, along with the implications for the discipline.


6

STUDY 1: MODEL-BASED RELIABILITY, VALIDITY AND CROSS

VALIDITY OF BIFACTOR MODEL FOR WOAQ

In chapters 6 to 8 the validity, cross validity and model-based reliability of the

Work Organisation Assessment Questionnaire (WOAQ) are assessed for a sample of nurses and a sample of paramedics using a bi-factor model. This chapter introduces the data and the relevant theory and hypotheses, Chapter 7 reports the results and Chapter 8 discusses and summarises the implications of the results.

The Work Organisation Assessment Questionnaire (WOAQ) was previously validated by Griffiths, et al. (2006) in a study involving manufacturing workers. In this study, WOAQ was viewed as a bi-factor measure, including a general measure of the Work

Organisation Assessment Questionnaire (WOAQ) and five nested subfactors, with each subfactor representing different dimensions of work organisation risk assessment. The five nested subfactors are: quality of relationships with management, reward and recognition, workload issues, quality of relationships with colleagues, and quality of the physical environment.

This study is the first of its kind to be conducted on two groups of employees in an

Australian health setting. As mentioned in Chapter 4, in recent years bifactor modeling has been gaining popularity among scholars in different disciplines. However, applications of the bifactor model in the field of organisational psychology have been very limited. A lack of knowledge or information about the advantages of a bifactor model over a higher-order model in specific contexts may be among the main reasons for this neglect (Reise, 2012).


The focus of study one is validation of the Work Organisation Assessment Questionnaire

(WOAQ), which was originally proposed by Griffiths et al. in 2006. Although the scale has been used in many studies since its development, there is little work evaluating its validity.

Among these studies, a poor fit has been reported for the WOAQ using a second-order model, and none employed bifactor modeling for validity assessment. In this research,

Study 1 consisted of three sub-studies which are discussed below.

a) Validation of the Work Organisation Assessment Questionnaire (WOAQ) for

nurses. This study presents a validity assessment of the WOAQ for nurses, in

which a conventional second-order model of WOAQ will be compared with a

bifactor model. No other such study has been undertaken in an Australian health

setting using such a broad and rigorous examination of model-based reliability and

validity procedures. In particular, the bifactor model of Work Organisation

Assessment Questionnaire (WOAQ), including its general factor and five sub-

factors, will be assessed in terms of construct validity.

b) Model-based reliability of WOAQ. The conventional coefficient alpha reliability

measures for WOAQ will be compared with the model-based reliability of Omega

total, Omega hierarchical and Omega subscales. Based on the literature on

multidimensional scales, the coefficient alpha overestimates reliability. Model-

based reliability coefficients are expected to provide more accurate reliability

measures for multidimensional models and/or when the assumptions for coefficient

alpha are not met.


c) Cross validation of the Work Organisation Assessment Questionnaire (WOAQ)

across gender among paramedics. The final section of study one considers the

cross-validity of the Work Organisation Assessment Questionnaire (WOAQ) across

gender considering only the paramedics sample. The best fitting model of Work

Organisation Assessment Questionnaire (WOAQ), obtained in the assessment of

cross-validity across the nurse and paramedic samples, will be tested for invariance

for males and females using the MACS procedure. This allows a statistical

comparison of factor structures and observed means. MACS was first introduced

by Sörbom (1974) for the cross validation of SEM models. However, practical use

of this procedure is often neglected in the literature.

This chapter along with Chapters 7 and 8 present the rationale and objectives, the methodology used, the results and discussion of the findings, the unique strengths of the research, and possible directions for future related studies.

6.1 Rationale and Objectives

6.1.1 Validity of Bifactor Model of WOAQ

One of the greatest challenges for society is sustaining an individual's health and quality of life in the workplace (Cox, 1997). There is a broad body of research revealing damage to health and wellbeing in workplaces. Increasing awareness of the possible deleterious effects of work-related factors on health has led to the enforcement of regulations and the introduction of legislation in many developed countries to ensure that organisations make the health of their employees a high priority (Faragher, Cooper &

Cartwright, 2004). As a result, management has also been encouraged to conduct risk assessments for psychosocial hazards with a view to ensuring employees’ health and safety in the workplace (Rick & Briner, 2000).

There are several driving forces which contribute to making the workplace a less convivial place for employees to work in. For instance, the growing competitiveness of the marketplace, the constant need to improve organisation efficiency and profitability and radical changes in employment conditions are amongst the major driving forces responsible for increasing stress in the workplace (Faragher, Cooper & Cartwright, 2004). But in particular, an inability to incorporate proper work design in the workplace leads to a negative effect on both employees and organisations (Griffiths, et al., 2006).

Much of the attention in the occupational health and safety (OH&S) literature has focused on linking this inability to incorporate a suitable work design with the right assessment tools, and on decreasing negative work-related outcomes for individuals and organisations (Griffiths, et al., 2006).

The efficacy of an OH&S tool in assessing the risk factors in the workplace environment depends on how well it is designed, implemented, and developed. A more practical approach is required in order to obtain information from relevant respondents, taking into consideration the nature of their work (LaMontagne, 2004).

Adapting such approaches to a specific work context provides a benchmark which can be used to identify the main organisational hazards and to progressively improve OHS by improving safe work design and practices. The main challenge is to use a suitable instrument to improve the capture of OH&S indicators.


Based on recommendations from previous studies, a good risk assessment process can only be achieved by using multiple methods of assessment. A well-designed assessment should recognize the risks in the workplace and also the employees at risk (The

Health and Safety Executive Guidelines, 2000). The organisational risk assessment is obtained using questionnaire/survey scales. In order to evaluate risk and stress effectively, this questionnaire must meet some important criteria such as being reliable and valid; easy to complete; measuring the possible risks, their predictability of outcomes related to the employees’ health, their size and impact on the target population; and applicable to both organisations as a whole and at different work levels. To be able to meet such criteria, the questionnaires are usually quite lengthy. As a result, the large amount of time it takes to complete a questionnaire leads to a low response rate (Faragher, Cooper & Cartwright,

2004).

A short yet comprehensive risk assessment questionnaire is desirable. One such instrument, the Work Organisation Assessment Questionnaire (WOAQ), developed by Griffiths, et al., (2006), may be able to overcome problems identified in previously validated measures due to its short length and yet comprehensive content. The methodology developed in WOAQ was based on identifying and collecting employees' opinions on their work, health, and their workplace design and management (Griffiths, et al., 2006). It was designed to measure risk factors pertaining to the work design and management which may influence employee health and health-related behaviours in a manufacturing setting

(Griffiths, et al., 2006; Wynne-Jones et al., 2009). The overall score on WOAQ indicates the extent to which the respondents believe that these dimensions of work are good and can


be used as predictors of wellbeing, subjective health and job satisfaction. A high score on

WOAQ indicates that the respondents perceive dimensions of work as good, and a low score on WOAQ indicates that the respondents perceive dimensions of work as problematic

(Griffiths, et al., 2006).

The WOAQ was initially developed for a manufacturing setting and implemented in the private sector; however, the comprehensive approach to the risk assessment means that this questionnaire may be used in other settings, including non-manufacturing or health settings.

It is therefore important to check if the WOAQ can be implemented effectively in other work settings or professions (Wynne-Jones et al., 2009). Only a few studies have evaluated the application of WOAQ in other workplaces. For example, Wynne-Jones et al.,

(2009), in their research of two large public sector organisations in South Wales, evaluated the validity and reliability of WOAQ in the public sector. Using a higher-order CFA, the researchers found only a marginal fit for the original five subfactors of WOAQ. In the end they identified a two-factor structure linked to four of the five scales of the WOAQ, assessing Management and Work Design, and Work Culture. One of the aims in this study is therefore to find out if the general and five subfactors of WOAQ can be implemented in a non-manufacturing, health setting in Australia. Also, in addition to the conventional higher-order CFA model used frequently by other scholars in the field (including the Wynne-Jones study), a more practical bifactor model will be used to assess the general factor of

WOAQ and the plausibility of its five subfactors in an Australian community nursing


setting. As fully discussed in Chapter 4, a bifactor model of WOAQ is deemed to deliver a better fit and more valuable information in such contexts. It is therefore hypothesised that:

Hypothesis 6.1. A bifactor model of WOAQ has acceptable construct validity in a non-manufacturing, health setting in Australia.

Hypothesis 6.2: A bifactor model of WOAQ has superior fit over the conventional higher order, five-factor model of the WOAQ.

In this study covariance-based SEM is used to fit reflective models to the WOAQ.

This allows the evaluation of model fit using conventional goodness of fit measures. In addition it allows the extraction of model-based measures of reliability. It also allows the use of invariance tests for comparing the cross-validity of models for different groups (e.g. nurses and paramedics, males and females) as described below.

6.1.2 Model-based Reliability

One of the commonly used measures of reliability is coefficient alpha, which was originally proposed by Cronbach in 1951. Coefficient alpha was developed for one-dimensional scales only and is therefore not appropriate for multidimensional constructs, as discussed previously (Sijtsma, 2009; Zinbarg, Revelle, Yovel, & Li, 2005). In the case of multidimensional scales, coefficient alpha may lead to overestimation of the reliability (Cortina, 1993).
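For reference, coefficient alpha is computed from the item-score covariances as α = k/(k−1) · (1 − Σσ²ᵢ / σ²ₜ), where σ²ᵢ are the item variances and σ²ₜ is the variance of the total score. A small numpy sketch (not tied to the WOAQ data):

```python
import numpy as np

def coefficient_alpha(items):
    """Cronbach's alpha from an (n_persons x k_items) score matrix:
    alpha = k/(k-1) * (1 - sum of item variances / variance of total score)."""
    X = np.asarray(items, dtype=float)
    k = X.shape[1]
    item_var_sum = X.var(axis=0, ddof=1).sum()
    total_var = X.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1.0 - item_var_sum / total_var)
```

On perfectly parallel items alpha reaches 1, and on uncorrelated items it falls to around 0; it is this exclusive dependence on the inter-item covariance structure, with no reference to a factor model, that makes alpha unsuitable for multidimensional scales.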

However, model-based reliability assessments for multi-dimensional scales were provided many years ago by Bentler (1968) and Heise and Bohrnstedt (1970) and, more recently, by Bentler (2007, 2009) for factor analytic types of models, and, in a generalised


form, for any structural equation model with additive errors. Although reliability for a general SEM model is rationalised based on the model's multidimensional structure, it should be noted that a uni-dimensional model-based coefficient, which we will call ρ or rho, still quantifies the proportion of variance due to the most reliable single dimension in multidimensional space (Bentler, 2007). However, there are a few empirical studies that have also reported reliability coefficients such as omega hierarchical, omega subscale and omega total that are suitable for bi-factor models with multiple subscales (e.g. Gignac & Watkins, 2013; Reise, Bonifay, & Haviland, 2012; Zinbarg et al., 2012).

For the purpose of this study, both the traditional estimate (the conventional coefficient alpha) and the more modern model-based reliability estimates of Omega (i.e. omega hierarchical, omega subscale and omega total) will be assessed and compared for the bifactor WOAQ model.

Omega hierarchical (ω_h) estimates the reliability of the general factor in a bifactor model (Revelle & Zinbarg, 2008; Zinbarg, Revelle, Yovel, & Li, 2005). It is a measure of the variance in total scores that arises from the general factor running across all the items (Reise, Bonifay, & Haviland, 2013). Omega subscale (ω_s) is used to determine the degree of reliability of the proposed subscale scores after controlling for the variance generated by the general factor (Reise et al., 2012). Omega total (ω_t) estimates the combined reliability of the general factor and the subscales (McDonald, 1978).
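For an orthogonal bifactor solution these coefficients can be computed directly from the estimated standardised loadings and uniquenesses: ω_h = (Σλ_g)² / σ²_total, ω_t = [(Σλ_g)² + Σ_s(Σλ_s)²] / σ²_total, and ω_s for each subscale from its own items after removing the general-factor variance from the numerator. A sketch with illustrative parameter values (not WOAQ estimates):

```python
import numpy as np

def bifactor_omegas(gen_loadings, grp_loadings, uniquenesses, groups):
    """Omega coefficients for an orthogonal bifactor model from standardised
    loadings. Each item loads on the general factor and on exactly one group
    factor; `uniquenesses` holds the item error variances."""
    g = np.asarray(gen_loadings, dtype=float)
    s = np.asarray(grp_loadings, dtype=float)
    u = np.asarray(uniquenesses, dtype=float)
    labels = np.asarray(groups)

    gen_var = g.sum() ** 2                                   # (sum of general loadings)^2
    grp_var = sum(s[labels == lab].sum() ** 2 for lab in set(labels))
    total_var = gen_var + grp_var + u.sum()

    omega_h = gen_var / total_var                  # general factor only
    omega_t = (gen_var + grp_var) / total_var      # general + group factors
    omega_s = {}                                   # per subscale, general removed
    for lab in set(labels):
        m = labels == lab
        omega_s[lab] = s[m].sum() ** 2 / (g[m].sum() ** 2
                                          + s[m].sum() ** 2 + u[m].sum())
    return omega_h, omega_s, omega_t

# Four items, two subscales, equal illustrative loadings (0.6 general, 0.4 group).
oh, os_, ot = bifactor_omegas([0.6] * 4, [0.4] * 4, [0.48] * 4,
                              ["a", "a", "b", "b"])
```

By construction ω_h ≤ ω_t, and the gap between them reflects how much reliable variance the group factors add beyond the general factor.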


Reporting both conventional item-based reliability estimates and all three types of omega model-based reliability measures for a bifactor model will provide a more detailed and comprehensive evaluation of the reliability of WOAQ. All the previous studies on WOAQ reported only item-based reliability measures, which are not appropriate given the multidimensional nature of the scale. These reliability measures will be calculated and compared for the sample of paramedics.

It is hypothesised that:

Hypothesis 6.3: The model-based Omega reliability coefficients will provide acceptable levels of internal consistency for the bifactor model of WOAQ for a sample of paramedics.

Hypothesis 6.4: The conventional internal consistency reliability of alpha coefficient overestimates the reliability of the WOAQ scale compared to the model-based reliability coefficients of Omega for a sample of paramedics.

6.1.3 Cross Validation of Bifactor Model of WOAQ

An important aspect of a tool's psychometric properties is its cross-validity: whether the tool also fits well in other groups of individuals or populations. Once the validity of a tool is established, its cross-validity can be assessed, that is, whether the validated tool fits well in more specific populations.

Measurement Model Invariance can be tested for two or more distinct samples using CFA.

If the model fits well in all the samples then it can be concluded that the model is acceptable and valid across the corresponding populations. However, it is also necessary to


test whether the population parameters can be considered equal for all the samples. There are several procedures for evaluating cross-validity in CFA models, each relating to a hypothesis for a different set of key population parameters (e.g. Meredith, 1993; Widaman & Reise, 1997; Byrne, 1995; Byrne & Watkins, 2003; Cheung & Rensvold, 2002). Little (1997) categorises invariance testing into two major categories. The first type of invariance procedure evaluates the psychometric characteristics of the model parameters (e.g. factor loadings, measured-variable loadings, and variances/covariances of errors or factor residuals) using the analysis of covariance structures (COVS). This type of invariance must be established before progressing to the second category of invariance analysis, relating to invariance in factor means (Cheung & Rensvold, 2002; Widaman & Reise, 1997). The invariance analysis of mean and covariance structures (MACS) for latent constructs was first introduced by Sörbom (1974) for the cross-validation of SEM models. Although category one invariance testing (COVS) of measured (observable) parameters, as described by Little (1997), is widely demonstrated by researchers, few scholars have paid attention to category two invariance testing using MACS (Cheung & Rensvold, 2002; Chen, Sousa, & West, 2005; Vandenberg & Lance, 2000). In this study the invariance of all parameters of the bifactor WOAQ model will be assessed between males and females in the paramedic sample.

The main technical aspects of invariance procedures, at both the measurement and construct levels, were introduced by Meredith (1993), Widaman and Reise (1997) and Meredith and Horn (2001). The main limitation of these techniques is that they were designed only for first-order models. For more complex bifactor or higher order models, the literature on practical assessment techniques is more limited. As noted by Chen, Sousa, and West (2005), previous scholars have not paid enough attention to the invariance testing of bifactor or higher order models (e.g. Byrne, 1995; Byrne & Campbell, 1999; Marsh & Hocevar, 1985). In particular, this is true for MACS invariance tests of bifactor models.

Thus, in this part of the study, invariance testing will be conducted on a bifactor model of the WOAQ at both the parameter and construct levels, building on the recommendations of previous scholars (Cheung & Rensvold, 2002; Byrne, 1995; Byrne & Watkins, 2003; Meredith, 1993; Meredith & Horn, 2001; Widaman & Reise, 1997; Chen, Sousa, & West, 2005; Yap et al., 2014).

In order to assess the validity and cross-validity of the WOAQ across the data in this study, three analyses will be carried out across gender for the bifactor WOAQ model with its five nested factors. In the first set of analyses, the validity of the bifactor model of the WOAQ will be tested independently for male and female employees from a paramedic organisation. Invariance testing of the measures will be carried out at the second step. If the model shows satisfactory invariance of the measures, then in the final analysis the construct means can be tested for invariance using the MACS approach. In this analysis only the paramedic sample is considered because there were too few males in the nursing sample to make a valid comparison.

It is evident that any occupational safety and health intervention should benefit both males and females. However, gender mainstreaming, or a gender-sensitive approach to occupational health and safety (OH&S), has been recognised as important to the OH&S agenda by the European Commission in its community safety and health strategy 2002-06 (EU-OSHA – European Agency for Safety and Health at Work, 2014). Both male and female employees benefit more from interventions aimed at improving their health when these are developed using a gender-sensitive approach. To achieve such equality in OH&S, it is critical to recognise gender differences, and consequently the differences in work organisation and in the way working conditions are perceived. This should not be limited to physical elements (such as designing safety gear specifically fitted for women) but should also take into account the psychosocial elements of the work setting. Therefore, in recognition of the importance of a gender-sensitive approach, the main goal of this study is to examine whether work characteristics are experienced in the same way by both genders and, in this way, validate the WOAQ across male and female paramedics.

Therefore it is hypothesised that:

Hypothesis 6.5: Baseline invariance. There is baseline invariance of the bifactor CFA model of the WOAQ in that the model describes both female and male paramedics well.

Hypothesis 6.6: Configural invariance. There is configural invariance of the bifactor CFA model of the WOAQ across gender, in that the model describes the combined data set well.

Hypothesis 6.7: Invariant factor loadings. The bifactor CFA model of the WOAQ exhibits invariance across gender, even after constraining the factor loadings on observed variables to be equal for males and females.


Hypothesis 6.8: Invariant factor means. The factor (construct) means of the bifactor CFA model of the WOAQ are invariant for male and female paramedics.

6.2 Method

The data collection for the studies of nurses and paramedics, along with the measures, ethical considerations and data analysis, is described below.

6.2.1 Nursing Participants

Data were collected from a sample of Australian nurses for the validation of the WOAQ. The study design was cross-sectional. A self-report questionnaire was used to capture demographic and work characteristics and the WOAQ described below.

A questionnaire package that included a cover letter, information sheet, consent form, questionnaires, and a reply-paid envelope was forwarded to all potential participants. Three weeks after the mail-out, a letter was sent to the employees to thank them for their participation, or to ask them to complete and return the questionnaire if they had not already done so. A total of 334 surveys were returned. Some of the returned surveys were incomplete, with a high percentage of missing data, so these surveys were removed. After data cleaning, 312 surveys were included in the final data analysis.


6.2.2 Paramedic Participants.

The paramedic data were collected from a large Australian health organisation employing paramedics3. The study design was cross-sectional. A self-report electronic questionnaire was used to capture the variables of interest anonymously. Nine hundred and seventy-nine responses were received from the paramedics. Of these, 33 were from volunteer paramedics and were excluded from the final database.

6.2.3 Measures

The measurement scale used was the comprehensive Work Organisation Assessment Questionnaire (WOAQ), consisting of 28 items pertinent to aspects of the respondents' work organisation (Griffiths et al., 2006). Respondents were asked to rate how problematic or good each item had been for them over the last six months, with higher scores representing a better quality work environment. It was assumed that the WOAQ consisted of a general 28-item summative factor with a five sub-factor structure. The five-factor structure of the scale included: workload issues, reward and recognition, quality of relationships with management, relationships with colleagues, and physical environment.

3 The data were collected as part of a study on the prevention of work-related musculoskeletal disorders and the development of a tool kit for workplace users. In that study, psychosocial workplace hazards were recognised as a significant predictor of discomfort/pain levels and absenteeism due to sickness (Jodi & Macdonald, 2012). Therefore, the WOAQ data collected to quantify psychosocial workplace hazards in that study were also used in this study for evaluating covariate-dependent reliability and cross-validity.


6.2.4 Ethics

Human Research Ethics Committee approval was obtained from both the lead university and the participating nursing and paramedic organisations.

6.2.5 Overview of Statistical Analysis

Normality of the data was assessed at both the item and group levels before conducting the CFA. In the first step of the validation process, the construct validity of a bifactor model of the WOAQ was compared with that of a higher order model. Although the responses are captured on a 5-point ordinal scale, they are treated as continuous, normally distributed variables. This is a limitation of the analysis, although ordinal variables with five categories are usually treated as continuous, and there is some evidence that this is unlikely to have any significant practical impact on the results (e.g. Babakus, Ferguson, & Jöreskog, 1987; Dolan, 1994; Johnson & Creech, 1983; Hutchinson & Olmos, 1998; Rhemtulla, Brosseau-Liard, & Savalei, 2012). As demonstrated by simulation studies (Rhemtulla, Brosseau-Liard, & Savalei, 2012), for five to seven categories robust continuous methods of estimation, such as Maximum Likelihood (ML), deliver similar outcomes to categorical methods of estimation such as categorical Least Squares (cat-LS). Also, as noted by Rhemtulla, Brosseau-Liard, and Savalei (2012), continuous methods of estimation are very familiar to researchers, while knowledge of estimation methods for categorical data is more limited.

An important factor considered in choosing suitable fit indices was the degree of penalty applied for model complexity. Based on suggestions in the literature (e.g. Gignac, 2013), for the evaluation of bifactor models it is better to choose close-fit indices that include relatively greater penalties for model complexity (i.e. RMSEA, NNFI/TLI, and AIC).

The fit indices reported in this study are summarised as follows:

- The root mean square error of approximation (RMSEA)

- The Tucker-Lewis Index (TLI) or Non-normed fit index (NNFI)

- The Akaike Information Criterion (AIC)

RMSEA values of less than .08 and .05 (MacCallum, Browne, & Sugawara, 1996) and NNFI values of greater than 0.90 and 0.95 (Hu & Bentler, 1999) were considered marginal and good fit levels respectively. Model comparisons will be performed based on a practical improvement in NNFI: an NNFI change of at least .010 indicates a meaningful difference in model fit (Vandenberg & Lance, 2000).

The Akaike Information Criterion (AIC) is a comparative measure of fit, which is meaningful only when two different models are compared. A smaller AIC value, with a reduction (ΔAIC) of more than 10, indicates a superior model fit (Akaike, 1973; Raftery, 1995; Schwarz, 1978).

The chi-square goodness-of-fit test was also reported as the conventional, commonly reported measure of fit in the literature. Traditionally, the chi-square statistic is used to assess whether the proposed model describes the data adequately. However, as acknowledged by Hu and Bentler (1999), the chi-square statistic is highly dependent on sample size and is not appropriate for complex or non-normal data. The relative chi-square (chi-square/df) is therefore preferred as a measure of model fit. For this statistic a value of 1 to 2 reflects good fit, less than 3 represents acceptable fit (Kline, 1998), and less than 5 represents adequate fit (Schumacker & Lomax, 2004).
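As a concrete illustration of these cut-offs, the following sketch (a hypothetical helper, not part of the analyses reported in this thesis) classifies model fit from the relative chi-square, RMSEA and NNFI thresholds cited above:

```python
def classify_fit(chisq_df, rmsea, nnfi):
    """Classify model fit using the cut-offs cited in this chapter:
    relative chi-square (Kline, 1998; Schumacker & Lomax, 2004),
    RMSEA (MacCallum et al., 1996) and NNFI (Hu & Bentler, 1999)."""
    if chisq_df <= 2:
        chi_label = "good"
    elif chisq_df < 3:
        chi_label = "acceptable"
    elif chisq_df < 5:
        chi_label = "adequate"
    else:
        chi_label = "poor"
    rmsea_label = "good" if rmsea < 0.05 else "marginal" if rmsea < 0.08 else "poor"
    nnfi_label = "good" if nnfi > 0.95 else "marginal" if nnfi > 0.90 else "poor"
    return chi_label, rmsea_label, nnfi_label
```

For example, a model with χ2/df = 1.71, RMSEA = 0.04 and NNFI = 0.93 would be labelled good on the first two criteria and marginal on the third.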

The other commonly reported fit indices are the Standardized Root Mean Square Residual (SRMSR) and the Comparative Fit Index (CFI); however, neither of these fit indices was considered in this study because they do not adequately penalise model complexity (Marsh, Hau, & Grayson, 2005; Gignac, 2013).

6.2.6 Model-based reliability

Model-based reliability coefficients of omega total, omega hierarchical and omega subscale, together with the conventional item-based coefficient alpha, will be used for testing the reliability of the WOAQ. Only the R psych package (Revelle, 2013) calculates these omega coefficients directly. In other SEM software, such as AMOS and EQS, omega coefficients can be calculated indirectly using what is known as a reliability index (Fan, 2003), which is in fact the implied correlation between a latent variable and its corresponding composite score (Gignac, 2007).

As recommended by Gignac (2014), a practical approach to the estimation of ωh and ωs is "to estimate the (squared) correlation between latent variables within a bifactor model and their corresponding equally weighted composites scores (known as phantom variables) within structural equation modeling programs" (p. 9). Figure 6.1 demonstrates an example of this procedure using EQS. The confidence intervals associated with the reliability coefficients in this procedure can also be evaluated using a combination of the phantom variable squared correlation approach and bootstrapping. Due to an identification problem (there are only two indicators for the 'relationships with colleagues' construct), this method could not be used in this study. Instead, the omega coefficients were calculated manually in an Excel spreadsheet, using the formulas below together with the factor loadings and error variances of the well-fitting WOAQ model.

The formulae for omega hierarchical (ωh) and omega subscale (ωs) are given below for items i = 1, 2, …, k = 28 contributing to a general factor with loadings λ_gi and to five subscales S_j (j = 1, 2, …, 5), with loadings λ_{S_j i} for the items belonging to subscale S_j.

\[
\omega_h = \frac{\left(\sum_{i=1}^{k}\lambda_{gi}\right)^{2}}
{\left(\sum_{i=1}^{k}\lambda_{gi}\right)^{2}
+ \left(\sum_{i=1}^{S_1}\lambda_{S_1 i}\right)^{2}
+ \left(\sum_{i=1}^{S_2}\lambda_{S_2 i}\right)^{2}
+ \cdots
+ \left(\sum_{i=1}^{S_5}\lambda_{S_5 i}\right)^{2}
+ \sum_{i=1}^{k}\operatorname{Var}(e_i)}
\tag{Equation 6.1}
\]

\[
\omega_{s_j} = \frac{\left(\sum_{i=1}^{S_j}\lambda_{S_j i}\right)^{2}}
{\left(\sum_{i=1}^{S_j}\lambda_{gi}\right)^{2}
+ \left(\sum_{i=1}^{S_j}\lambda_{S_j i}\right)^{2}
+ \sum_{i=1}^{S_j}\operatorname{Var}(e_i)}
\tag{Equation 6.2}
\]

where the items i = 1, 2, …, S_j all belong to subscale S_j for j = 1, 2, …, 5. Combining these reliabilities, the total reliability of the five-factor measurement model is measured using omega total (ωt):


\[
\omega_t = \frac{\left(\sum_{i=1}^{k}\lambda_{gi}\right)^{2}
+ \left(\sum_{i=1}^{S_1}\lambda_{S_1 i}\right)^{2}
+ \left(\sum_{i=1}^{S_2}\lambda_{S_2 i}\right)^{2}
+ \cdots
+ \left(\sum_{i=1}^{S_5}\lambda_{S_5 i}\right)^{2}}
{\left(\sum_{i=1}^{k}\lambda_{gi}\right)^{2}
+ \left(\sum_{i=1}^{S_1}\lambda_{S_1 i}\right)^{2}
+ \left(\sum_{i=1}^{S_2}\lambda_{S_2 i}\right)^{2}
+ \cdots
+ \left(\sum_{i=1}^{S_5}\lambda_{S_5 i}\right)^{2}
+ \sum_{i=1}^{k}\operatorname{Var}(e_i)}
\tag{Equation 6.3}
\]
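Equations 6.1-6.3 can be computed directly from standardized factor loadings and error variances. The following minimal sketch illustrates the arithmetic (function and variable names are hypothetical, and the numbers used in the usage note are illustrative, not WOAQ estimates):

```python
import numpy as np

def omega_coefficients(gen_load, group_loads, err_var):
    """Omega hierarchical, subscale and total for a bifactor model.

    gen_load    : general-factor loadings, one per item
    group_loads : dict mapping subscale name -> (item indices, group loadings)
    err_var     : error variances, one per item
    Illustrative sketch of Equations 6.1-6.3.
    """
    gen_load = np.asarray(gen_load, dtype=float)
    err_var = np.asarray(err_var, dtype=float)
    gen_sq = gen_load.sum() ** 2                       # (sum of general loadings)^2
    group_sq = {name: np.sum(lam) ** 2 for name, (idx, lam) in group_loads.items()}
    denom = gen_sq + sum(group_sq.values()) + err_var.sum()

    omega_h = gen_sq / denom                           # Equation 6.1
    omega_t = (gen_sq + sum(group_sq.values())) / denom  # Equation 6.3

    omega_s = {}                                       # Equation 6.2, per subscale
    for name, (idx, lam) in group_loads.items():
        idx = np.asarray(idx)
        lam = np.asarray(lam, dtype=float)
        sub_gen = gen_load[idx].sum() ** 2
        sub_grp = lam.sum() ** 2
        omega_s[name] = sub_grp / (sub_gen + sub_grp + err_var[idx].sum())
    return omega_h, omega_s, omega_t
```

For a toy four-item scale with two subscales, `omega_coefficients([.7, .7, .6, .6], {"A": ([0, 1], [.4, .4]), "B": ([2, 3], [.5, .5])}, [.35, .35, .39, .39])` returns ωh ≈ .68, ωt ≈ .85, and subscale reliabilities well below the general-factor reliability, the same pattern reported for the WOAQ below.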

6.2.7 Cross-validation of WOAQ

The WOAQ was initially validated using the nursing data and was then cross-validated on the paramedic data. Finally, invariance was tested for males and females in the paramedic sample. In the first step of this invariance analysis, the baseline bifactor model was tested separately for males and females, and in the second step the cross-validity of the WOAQ was assessed using invariance testing across gender.


Figure 6.1. A demonstration of the bifactor model of the WOAQ with phantom variables for calculating omega coefficients on the right-hand side (note that the phantom variable paths are constrained to 1, creating equally weighted composite scores).


6.3 Summary

This study has both theoretical and empirical implications. The WOAQ was originally developed and used in manufacturing settings. In addition, previous studies used a higher order model of the WOAQ and some reported poor fit for the scale. To the best of the researchers' knowledge, no study has been conducted in a non-manufacturing health setting in Australia using a bifactor modeling procedure. The present study used data collected from a group of Australian nurses and a group of paramedics to assess the validity, cross-validity and model-based reliability of the WOAQ, a well-designed instrument for assessing work and organisational factors as potential risks to employee health. The main aims of the study were:

1) To assess the validity of the WOAQ in an Australian health setting.

2) To compare a bifactor model (nested factor model) with a conventional higher order model of the WOAQ using Confirmatory Factor Analysis (CFA).

3) To assess and compare the model-based reliability coefficients of omega hierarchical, omega subscale and omega total with the conventional coefficient alpha.

4) To assess the cross-validity of the Work Organisation Assessment Questionnaire (WOAQ) on a group of paramedics.

5) To assess the cross-validity of the Work Organisation Assessment Questionnaire (WOAQ) on male and female paramedics.


Unlike previous studies, which used a higher-order Confirmatory Factor Analysis (CFA) model for the WOAQ, a bifactor modeling procedure was used in this study. There is very limited literature on the invariance testing of bifactor models, making this a novel research study.


7

STUDY 1: RESULTS

In this chapter, the study involving 312 nurses is used to validate the bifactor WOAQ model. This model is then fitted to a sample of 945 paramedics and a test of invariance is used to evaluate the cross-validity of the model for male and female paramedics.

7.1 Results: Study of Nurses-Validation of Bifactor Model of WOAQ

In this chapter, the bifactor model for the WOAQ is validated for the sample of nurses described in Chapter 6. Descriptive statistics are presented and then goodness-of-fit statistics and reliability measures are derived using a higher order model and a bifactor model. The results indicate that the bifactor model is a more valid representation of the WOAQ for this sample.

7.1.1 Descriptive Statistics for Demographics

Table 7.1 presents the frequencies, means and standard deviations for the demographic variables. The majority of the participants were female (94.5%), with an average age of 45.19 years. Most had more than four years' experience working in a nursing setting (97.1%). About 40% of the participants were working full-time, with the remaining 60% employed part-time.


Table 7.1 Descriptive Statistics of the Demographic Variables*

                                      Frequency (%)
Gender
  Male                                17 (5.5)
  Female                              290 (94.5)
Contact with clients (hrs/workday)
  < 2                                 22 (7.1)
  2-4                                 30 (9.7)
  4-6                                 122 (39.4)
  6-8                                 134 (43.2)
  > 8                                 2 (0.6)
Years of experience
  < 1                                 2 (0.7)
  1-3                                 7 (2.3)
  4-6                                 16 (5.2)
  > 6                                 282 (91.9)
Employment status
  Part-time                           183 (60)
  Full-time                           123 (40)
Mean Age (SD)                         45.19 (9.54)

* n varies between 306 and 312 due to some missing responses

7.1.1.1 Descriptive Statistics at Item Level.

The twenty-eight WOAQ items (Griffiths et al., 2006) are shown in Table 7.2, grouped according to the five subscales.


Table 7.2 Subscales and WOAQ Items

Quality of relationships with management
  3. Clear roles and responsibilities
  5. Support from supervisor
  7. Feedback on your performance
  11. Appreciation or recognition of your efforts by supervisors
  16. Senior management attitudes
  17. Clear reporting lines
  22. Communication with supervisor
  26. Status/recognition in the company
  27. Clear company objectives, values, procedures
Reward & recognition
  12. Consultation about changes in your job
  13. Sufficient training for this job
  14. Amount of variety in the work you do
  21. Opportunities for promotion
  23. Opportunities for learning new skills
  24. Flexibility of working hours
  25. Opportunities to use your skills
Workload issues
  6. Pace of work
  8. Your workload
  15. Impact of family/social life on work
  19. Impact of work on family/social life
Quality of relationships with colleagues
  10. How you get on with your co-workers (personally/socially)
  28. How well you work with your co-workers (as a team)
Quality of physical environment
  1. Facilities for taking breaks (places for breaks, meals)
  2. Work surroundings (noise, light, temperature, etc.)
  4. Exposure to physical danger
  9. Health and safety at work
  18. Equipment, tools, IT or software that you use
  20. Work stations and work space

As shown in Table 7.3, all the skewness and kurtosis coefficients were less than one in absolute value, demonstrating behaviour reasonably close to normality at the item level (West, Finch, & Curran, 1995).
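The item-level statistics screened in Table 7.3 can be computed as follows (a minimal sketch; the function name is hypothetical and the moment-based estimators shown here may differ slightly from the small-sample corrections used by some statistical packages):

```python
import numpy as np

def skew_kurtosis(x):
    """Sample skewness and excess kurtosis for one item, the statistics
    screened in Table 7.3 (coefficients under 1 in absolute value suggest
    behaviour reasonably close to normality; West, Finch, & Curran, 1995)."""
    x = np.asarray(x, dtype=float)
    z = (x - x.mean()) / x.std()       # standardise the item scores
    skew = float(np.mean(z ** 3))      # third standardised moment
    kurt = float(np.mean(z ** 4) - 3)  # fourth standardised moment, excess form
    return skew, kurt
```

A perfectly symmetric item yields a skewness of zero, and a normally distributed item yields excess kurtosis near zero.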

Table 7.3 Item Characteristics of WOAQ

Items                                      Mean   SD     Skew   Kurtosis
WOAQ - quality of relationships with
management                                 3.43   1.03   -.33   -.48
  3                                        3.60   1.02   -.38   -.49
  5                                        3.60   1.21   -.55   -.71
  7                                        3.15   1.04   -.11   -.55
  11                                       3.29   1.11   -.22   -.85
  16                                       3.12   1.15   -.09   -.78
  17                                       3.55    .91   -.40    .09
  22                                       3.49   1.05   -.41   -.45
  26                                       3.39    .96   -.38   -.13
  27                                       3.69    .88   -.48    .33
WOAQ - reward & recognition                3.37    .80   -.21   -.34
  12                                       2.99   1.01    .01   -.50
  13                                       3.52    .97   -.36   -.24
  14                                       3.63    .83   -.18   -.16
  21                                       3.06    .90   -.03    .11
  23                                       3.63    .92   -.47   -.29
  24                                       3.20   1.0    -.14   -.61
  25                                       3.61    .89   -.28   -.39
WOAQ - workload issues                     2.79    .98    .23    .64
  6                                        2.79   1.16    .17   -1.0
  8                                        2.68   1.0     .28   -.85
  15                                       2.94    .83    .15    .58
  19                                       2.75    .93    .34    .13
WOAQ - quality of relationships with
colleagues                                 3.94    .83    .58    .47
  10                                       3.83    .82   -.31   -.23
  28                                       4.06    .84   -.85    .72
WOAQ - quality of physical environment     2.97   1.07    .24   -.64
  1                                        2.80   1.27    .26   -1.0
  2                                        2.84   1.09    .27   -.61
  4                                        3.00    .90    .62    .18
  9                                        3.35    .99    .03   -.60
  18                                       2.88   1.14    .21   -.99
  20                                       3.00   1.04   -.05   -.49
Total                                      3.30    .94   -.31   -.51

7.1.1.2 Test of Model Assumptions

Although the normality assumptions were reasonably valid at the item level, the multivariate distribution of the items also needs to be checked. In this study CFA tests a multivariate statistical model using Maximum Likelihood (ML) estimation, assuming multivariate normality (Hoyle, 2000). Multivariate normality can be evaluated using Mardia's multivariate skewness and kurtosis tests (Bentler & Wu, 2002). Although the preliminary assessment at the item level showed a relatively normal distribution for the data, Mardia's multivariate kurtosis coefficient (Mardia's coefficient (G2, P) = 109.40; normalised estimate = 23.57) is a little high, indicating a violation of the multivariate normality assumption. Values below 20 are usually required for the normalised estimate of Mardia's multivariate kurtosis coefficient (Byrne, 2010).
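Mardia's multivariate kurtosis and its normalised estimate can be sketched as follows (this is an illustrative implementation of the standard formula, not EQS's own routine; the function name is hypothetical):

```python
import numpy as np

def mardia_kurtosis(X):
    """Mardia's multivariate kurtosis b2p and its normalised estimate.

    X : (n, p) data matrix. Under multivariate normality b2p is close to
    p(p + 2) and the normalised estimate is approximately standard normal;
    large values indicate excess multivariate kurtosis."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    centred = X - X.mean(axis=0)
    S = centred.T @ centred / n                       # ML covariance matrix
    Sinv = np.linalg.inv(S)
    # Squared Mahalanobis distance of each observation from the centroid
    d2 = np.einsum('ij,jk,ik->i', centred, Sinv, centred)
    b2p = float(np.mean(d2 ** 2))                     # Mardia's kurtosis
    z = (b2p - p * (p + 2)) / np.sqrt(8 * p * (p + 2) / n)  # normalised estimate
    return b2p, z
```

For multivariate normal data with p = 3 variables, b2p should fall near p(p + 2) = 15 with a normalised estimate near zero; values such as the 23.57 reported above signal non-normality.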

Robust test statistics were therefore used for evaluating the model. As described in the literature (Hu, Bentler, & Kano, 1992; Curran, West, & Finch, 1996), the Satorra-Bentler (1988, 1994) scaled chi-square test should be used when the assumption of normality is violated. The scaled chi-square (χ2/df) with robust standard errors under ML estimation, as suggested by Satorra and Bentler (1988, 1994), appears to be a good general approach for dealing with departures from normality. As noted previously, ideally the scaled χ2/df has a value of between 1 and 2.

7.1.2 Model fit evaluation

The dimensionality of the general WOAQ score and its five nested subfactors was assessed using confirmatory factor analysis (CFA). Both the second-order (higher-order) model (Figure 7.1, Model 1) and the bifactor model (Figure 7.1, Model 2) were assessed. The modification indices suggested correlating three pairs of measurement errors: one pair of physical environment items (health and safety at work with exposure to physical danger) and two pairs of workload items (impact of family/social life on work with impact of work on family/social life; pace of work with workload). As suggested by Kenny (2011), if some items have similar content and the correlation is theoretically meaningful, one may correlate the errors for these items.

Figure 7.1. The proposed bifactor model of WOAQ (Model 2) vs. the higher order model (Model 1).


The results indicated that the higher order model provides a marginally acceptable fit for the WOAQ and its five subfactors (SB scaled χ2/df = 2.14, RMSEA = 0.06, NNFI = 0.89). The factor loadings for the subfactors suggest well-defined subfactors. In addition, the loadings of the five subscales on the higher order WOAQ factor were strong and significant. The path coefficients were 0.71 for 'quality of physical environment', 0.57 for 'quality of relationships with colleagues', 0.93 for 'quality of relationships with management', 0.99 for 'reward and recognition' and 0.76 for 'workload issues'.

For a meaningful comparison of the higher order model with the bifactor model, the Schmid-Leiman transformation was conducted to obtain loadings for all items on the higher order factor. Table 7.4 provides the Schmid-Leiman transformed factor loadings for the higher order factor. As suggested by Gignac (2007), the Schmid-Leiman (S-L) transformations were calculated by multiplying the first-order factor loadings by their respective second-order factor loadings.
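The arithmetic of this transformation is a simple element-wise product, sketched below (the item loadings are hypothetical; 0.71 is the 'quality of physical environment' path coefficient reported above):

```python
import numpy as np

# Schmid-Leiman (S-L) transformation as described by Gignac (2007): each
# item's loading on the higher-order general factor is its first-order
# loading on its subscale multiplied by that subscale's loading on the
# higher-order factor.
first_order = np.array([0.55, 0.62, 0.34])   # hypothetical item loadings
second_order = 0.71                          # subscale -> higher-order path
sl_general = first_order * second_order      # S-L general-factor loadings
```

These S-L general-factor loadings can then be compared directly with the general-factor loadings estimated by the bifactor model, as is done in Table 7.4.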

The results for the bifactor model suggest an acceptable fit (SB scaled χ2/df = 1.71, RMSEA = 0.04, NNFI = 0.93). Table 7.5 presents the results of the CFA evaluation. Based on these results, the bifactor model of the WOAQ provides a superior fit, with a smaller AIC value (AIC = -89.66) than the conventional higher order model (AIC = 50.44). The ΔNNFI of 0.04 and the ΔAIC of -140.10 indicate significant superiority of the bifactor model over the higher order model.

Important differences were found between the factor loadings of the bifactor model and those of the higher order model. The most important difference concerned 'quality of relationships with management'. The S-L solution of the higher order model showed fairly uniform, positive factor loadings for this factor, while the bifactor model detected loadings in different directions. In addition, in the bifactor model the items for the two subscales 'quality of relationships with management' and 'reward and recognition' loaded highly on the general WOAQ factor but poorly on their nested group constructs. However, in the higher order model the items for the 'reward and recognition' subscale had low loadings in both cases.


Table 7.4 Completely Standardized Maximum Likelihood (ML) Solutions of Higher order Model and the Bifactor Model

Item   Group   S-L general   S-L group   Bifactor G   Bifactor group
 1     QPE        0.47         0.55         .32           .60
 2     QPE        0.54         0.62         .35           .78
 3     QRM        0.57         0.23         .65          -.18
 4     QPE        0.29         0.34         .24           .29
 5     QRM        0.74         0.30         .72           .42
 6     WI         0.52         0.46         .47           .37
 7     QRM        0.71         0.28         .73           .12
 8     WI         0.52         0.46         .48           .37
 9     QPE        0.42         0.49         .44           .33
 10    QRC        0.37         0.55         .35           .65
 11    QRM        0.79         0.31         .78           .33
 12    RR         0.00         0.02         .74          -.11
 13    RR         0.00         0.02         .64           .06
 14    RR         0.00         0.02         .48           .43
 15    WI         0.43         0.38         .37           .58
 16    QRM        0.73         0.29         .72           .29
 17    QRM        0.72         0.28         .72           .18
 18    QPE        0.34         0.39         .31           .28
 19    WI         0.48         0.42         .44           .52
 20    QPE        0.54         0.62         .49           .52
 21    RR         0.57         0.02         .56           .06
 22    QRM        0.78         0.31         .76           .40
 23    RR         0.74         0.02         .69           .30
 24    RR         0.60         0.02         .55           .09
 25    RR         0.69         0.02         .66           .44
 26    QRM        0.73         0.29         .80          -.05
 27    QRM        0.72         0.28         .79          -.11
 28    QRC        0.45         0.67         .45           .55
Note: S-L = Schmid-Leiman transformation of item loadings; G = general factor of WOAQ. Each item's group loading falls in its own subscale column in the original layout: QPE = quality of physical environment, QRC = quality of relationships with colleagues, QRM = quality of relationships with management, RR = reward and recognition, WI = workload issues.


Table 7.5 Summary of Model Fit Statistics of the CFA Models of WOAQ

Model                    SB χ2/df   RMSEA               NNFI   AIC       ΔNNFI†   ΔAIC†
0. Independent model     11.48
1. Higher order model     2.14      0.06 (0.05, 0.06)   0.89    50.44
2. Bifactor model         1.71      0.04 (0.04, 0.05)   0.93   -89.66     0.04    -140.10

Note: SB = Satorra-Bentler scaled χ2; RMSEA = Root Mean Square Error of Approximation; NNFI = Non-normed fit index; AIC = Akaike Information Criterion; † = difference, Model 2 - Model 1.

7.1.3 Model-based reliability

Further analysis was carried out to assess the reliability of the well-fitting bifactor model of the WOAQ. The model-based reliability evaluation of the multidimensional WOAQ using the ωt coefficient (combined true score variance across the general WOAQ factor and its five nested subfactors) indicated excellent total reliability for this scale (0.92): 92% of the WOAQ variance is true score variance, leaving 8% for error.

The omega hierarchical coefficient demonstrates that the general factor of the WOAQ explains 87% of the variance, while the total contribution of the subfactors is minimal in the presence of the general factor. In other words, a substantial proportion of the internal consistency belongs to the general WOAQ factor rather than to its five nested subfactors.


To better understand the individual reliability of each nested subfactor, the omega subscale coefficient was calculated for each one, controlling for the effects of the general WOAQ factor. The results show that among the five nested subfactors, 'physical environment' (ωs = .51), 'workload issues' (ωs = .39) and 'relationships with colleagues' (ωs = .35) demonstrated higher reliability than the other two nested subscales, independent of the general WOAQ factor. The lowest omega subscale coefficients belonged to 'quality of relationships with management' and 'reward and recognition', demonstrating a greater dependency on the general WOAQ factor for these two subscales.

As expected, the conventional coefficient alpha overestimated the reliability (α = .94), probably because of violations of the unidimensionality and independent residuals assumptions.
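For comparison with the omega coefficients, coefficient alpha can be computed directly from the item covariance matrix; the sketch below makes its restrictive assumptions explicit (the function name is hypothetical):

```python
import numpy as np

def coefficient_alpha(X):
    """Conventional coefficient alpha from an (n, k) matrix of item scores.

    Alpha assumes unidimensionality and uncorrelated errors, which is why
    it can overestimate reliability for a multidimensional scale such as
    the WOAQ."""
    C = np.cov(np.asarray(X, dtype=float), rowvar=False)  # item covariance matrix
    k = C.shape[0]
    # alpha = k/(k-1) * (1 - sum of item variances / variance of the total score)
    return float(k / (k - 1) * (1 - np.trace(C) / C.sum()))
```

When all items are perfectly correlated, alpha reaches 1; when items are mutually uncorrelated, it falls to near zero, regardless of how many factors actually underlie the scale.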

Table 7.6 The Reliability Coefficients of WOAQ among Nursing Sample (n=312)

Constructs                                   α     ωt    ωh    ωs
General WOAQ                                .94   .92   .87    -
Physical environment                                          .51
Relationships with colleagues                                 .35
Quality of relationships with management                      .16
Reward & Recognition                                          .15
Workload issues                                               .39

This bifactor model was then fitted to the paramedics data.


7.2 Results: Study of Paramedics-Cross Validation of Bifactor Model WOAQ

After establishing the validity of the bifactor model of the WOAQ in one health population, it is important to cross-validate the model in a different health population. Therefore, using a different sample (here, paramedics), the invariance of the bifactor WOAQ model was assessed, first for the combined sample and then across gender. The demographic characteristics of the total sample, as well as of the male and female subsamples, are presented in Table 7.7.

7.2.1 Descriptive Statistics for Demographics

As indicated in Table 7.7, this was a much larger sample with a much better representation of males than the nursing sample. Also, the employment status profile differed markedly from that of the nursing sample, and the average age was lower. These results suggest that this is a very different population, making it appropriate for this sample to be used for the cross-validation of the bifactor model of the WOAQ in a health setting.


Table 7.7 Characteristics of Paramedic Participants

                               Total (n=945)    Males (n=623)    Females (n=322)
Gender
  Male                         623 (65.9)       -                -
  Female                       322 (34.1)
Employment status Ϯ
  Part-time                    895 (94.7)       610 (97.91)      287 (89.13)
  Full-time                    48 (5.1)         13 (2.09)        35 (10.87)
Years of experience
  < 1 year                     92 (9.8)         38 (6.1)         54 (16.9)
  1-3 years                    127 (13.5)       55 (8.9)         72 (22.5)
  4-6 years                    133 (14.1)       62 (10.0)        71 (22.2)
  > 6 years                    588 (62.6)       465 (75.0)       123 (38.4)
Age (years), Mean (Range)      40.15 (21-65)    43.72 (22-65)    33.24 (21-56)

Note: Ϯ Due to their very low percentage (1.3 per cent in the paramedic organisation), casual employees have been allocated to the part-time category.

At the first step, the baseline bifactor model of WOAQ that was evaluated in the previous section was assessed separately for the male and female paramedic groups. The results in Table 7.8 show adequate model fit for the baseline bifactor model for males (RMSEA = 0.04, NNFI = 0.94) and for females (RMSEA = 0.05, NNFI = 0.92).

Table 7.8 Summary of Model Fit Statistics for the Baseline Bifactor Model of WOAQ across Gender

Model                 SB χ²/df    CFI     RMSEA    NNFI
Baseline: Male        2.47        0.94    0.04     0.94
Baseline: Female      1.87        0.93    0.05     0.92

Note: RMSEA = Root Mean Square Error of Approximation; NNFI = Non-normed fit index.


Model 1: Configural model with no constraints. At the next step, a configural bifactor model was fitted for the male and female groups simultaneously to determine whether the model is appropriate when there are no constraints. Based on the results, the configural model (RMSEA = 0.03, NNFI = 0.97) showed good model fit (Table 7.9). This model shows that the bifactor model is appropriate for paramedics as well as nursing staff, suggesting that it may have general applicability in health settings.

Model 2: Invariant loadings. After constraining the loadings to be equal for males and females, the results still showed good fit (RMSEA = 0.03, NNFI = 0.96). To test for evidence of invariance, the differences between the NNFI and AIC values of Model 2 and Model 1 were considered. The NNFI suggested no significant deterioration in model fit for the constrained loadings compared with the configural model (Table 7.9), but there was an increase of more than 10 in the AIC. In the circumstances it was unclear whether invariance could be claimed across gender.
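The decision logic applied here can be sketched as a small helper. The cutoffs below (ΔNNFI ≤ .01, ΔAIC ≤ 10) follow the conventions used in this chapter; the function name and structure are illustrative, not part of any SEM package.

```python
def invariance_verdict(delta_nnfi, delta_aic, nnfi_cutoff=0.01, aic_cutoff=10.0):
    """Judge loading invariance from the change in NNFI and AIC when moving
    from the configural model to the constrained (invariant-loadings) model."""
    nnfi_ok = delta_nnfi <= nnfi_cutoff   # fit deteriorates only trivially
    aic_ok = delta_aic <= aic_cutoff      # AIC does not rise substantially
    if nnfi_ok and aic_ok:
        return "invariance supported"
    if nnfi_ok or aic_ok:
        return "conflicting evidence"
    return "invariance rejected"

# Table 7.9: NNFI drops by 0.007 while AIC rises by 41.19
# (from -315.84 to -274.65), so the two criteria disagree.
verdict = invariance_verdict(delta_nnfi=0.007, delta_aic=41.19)
```

With these values the helper returns "conflicting evidence", matching the conclusion reached in the text.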

As previously explained, reaching full invariance for all the parameters, or even the most important ones, is very rare in most models (e.g. Byrne, Shavelson & Muthen, 1989). In view of the conflicting results obtained above, a decision was therefore made to proceed to the next stage of the invariance analysis considering the differences in construct means for males and females.


Table 7.9 Invariance Testing Across Gender for the Bifactor Model of WOAQ

Model                                        SB χ²/df   CFI    RMSEA   NNFI    AIC       ΔNNFI   ΔAIC
Model 1: Configural model (no constraints)   1.49       0.97   0.03    0.97    -315.84   -       -
Model 2: M1 + loading invariance             1.60       0.96   0.03    0.96    -274.65   0.007   -41.19

Note: RMSEA = Root Mean Square Error of Approximation; NNFI = Non-normed fit index; AIC = Akaike Information Criterion; Δ = change.


The Group Differences for Construct Means. Gender differences in the mean values for the general factor of WOAQ and for the five nested factors were considered, with the female group means selected as the reference level. The construct means for the female group were therefore set to zero while the construct means for the male group were estimated, providing an estimate of the mean difference between the groups on each construct.

After setting equality constraints on the loadings and intercepts for the measured variables, with factor intercepts of zero for female employees, the results showed a marginal fit for the model (RMSEA = 0.05, NNFI = 0.917). The mean differences between the male and female groups were significant for two of the nested constructs ('co-worker' and 'reward-recognition') and for the general factor of WOAQ. The z score results showed that the mean scores on these two nested constructs and on the general factor of WOAQ are significantly higher for male employees than for female employees. The implications of these results are discussed in the next chapter.
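The significance test for each latent mean difference is a simple z test: the estimated male-group mean (with the female mean fixed at zero) divided by its standard error. A minimal sketch with hypothetical numbers (the estimate 0.32 and standard error 0.11 are illustrative, not the fitted WOAQ values):

```python
def latent_mean_z(mean_diff, se, critical=1.96):
    """z test for a latent mean difference against a reference group whose
    mean is fixed at zero; significant at the 5% level when |z| > 1.96."""
    z = mean_diff / se
    return z, abs(z) > critical

# hypothetical male-female difference on one nested construct
z, significant = latent_mean_z(0.32, 0.11)
```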


8

STUDY 1: DISCUSSION

In this chapter we start by considering the previous WOAQ bifactor model results obtained for the nursing sample. We then consider this model in the case of the paramedics sample and, in particular, we probe the implications of the gender differences that were exposed.

8.1 Discussion: Study of Nurses - Validation of the Bifactor Model of WOAQ

The most common problems identified in the literature on full risk assessment relate to the questionnaires used: they are either very long and detailed, or they are unable to detect the hazardous nature of any identified problems in a work setting. In response to the evident need for a short, valid risk assessment, the WOAQ was developed.

The Work Organisation Assessment Questionnaire (WOAQ; Griffiths et al., 2006) seems to overcome these problems with its short length (28 items) and yet comprehensive content. The WOAQ seeks to identify and collect employees' opinions on their work and health (Griffiths, Cox, Karanika, Khan, & Toma, 2006). The WOAQ was originally developed for a manufacturing setting but it is widely used in non-manufacturing settings without having been properly validated in these new settings.

The present research examines the validity and model-based reliability of the WOAQ for a group of Australian employees, using the conventional higher order model and a bifactor model fitted with CFA. The WOAQ higher order model included a second-order factor and five first-order subfactors, each representing a different dimension of work organisation risk assessment. The five subfactors are: 'quality of relationships with management', 'reward and recognition', 'workload issues', 'quality of relationships with colleagues', and 'quality of physical environment'. The bifactor model of WOAQ included a general measure of WOAQ and the above five subfactors.

Previous studies, using higher order modeling of the WOAQ, failed to validate the model, reporting a poor fit (e.g. Wynne-Jones, 2009). The present study therefore considered a bifactor model of the WOAQ and compared it with the conventional higher order model. Based on previous studies, bifactor models generally demonstrate superior fit over higher order models (e.g. Gignac, 2007, 2014; Reise, Bonifay, & Haviland, 2012; Reise, 2012). In spite of their importance in the context of organisational studies, the evaluation of bifactor models such as that of the WOAQ is quite new to the organisational psychology discipline. While a conventional model of the WOAQ provides only an indirect relationship between the higher order construct and the items, a bifactor model provides a full first-order multidimensional model, in which both the general factor of WOAQ and its nested subfactors are evaluated with direct relationships to the WOAQ items. In addition, using the model-based omega reliability coefficients, more valuable information can be obtained about the internal consistency of each construct.

The results of this study revealed the superiority of the bifactor model over the conventional higher order model, and very important differences were found between the two models. The most important difference was that the conventional higher order model failed to recognise the low and differentially directed loadings of the 'quality of relationship with management' items. The results of the bifactor model showed that the subfactor for 'quality of relationship with management' was poorly defined independently of the general measure of WOAQ. The subfactor of 'reward and recognition' was found to be implausible in both models. Given the high correlations observed between these two subscales and their strong dependency on the general factor of WOAQ, the reliability of these two subscales clearly relies more on variation in the general factor of WOAQ than on any subfactor.

These results have important practical implications. They show that in the context of community nursing, although the general measure of WOAQ is a valid and reliable measure for organisational risk assessment, the most important plausible subscales are 'quality of physical environment', 'workload issues' and 'quality of relationships with colleagues', in that order. Based on the findings, focusing on the two subscales of 'quality of relationships with management' and 'reward and recognition' without considering the general WOAQ indicators is unlikely to lead to any significant improvements. In contrast, the other three sub-constructs, especially 'quality of physical environment', seem to have significant unique reliability, independent of the general factor of WOAQ. In practice, this means that an intervention to improve only the work environment would still have significant effects on the level of perceived risk in workplaces.

Unfortunately, the lack of previous studies makes it difficult to compare the findings with those from other health areas. The majority of previous studies on the WOAQ have been conducted in manufacturing settings using the conventional higher order model procedure (e.g. Griffiths et al., 2006; Wynne-Jones, 2009). However, close evaluation of the work setting of nurses indicates that these findings should not be much of a surprise; they fit with the nature of the community-nursing work environment. Although the nurses belong to a large organisation, they work in different, small branches with their own immediate managers/supervisors. In such an environment, there is a more informal relationship between the nurse and the manager/supervisor. Relationships in community nursing settings are more colleague-colleague than nurse-manager relationships, so it should be expected that 'the quality of relationship with management' would be unimportant. Also, 'the reward and recognition' factor is strongly tied to the management relationship, and only items representing a variety of tasks, the opportunity for learning and the use of new skills appeared as important indicators of this subscale. Thus, in practice, if an organisation wants to make risk management improvements, the main plausible subfactors to address are the physical work environment, relationships with colleagues, and the management of workloads.

The bifactor model of WOAQ could also have some critical cost and efficacy implications in the workplace. For example, consider the situation where there are limited budgets or resources to allocate to improving the overall quality of the work organisation, or where it is not feasible or realistic to change all the subconstructs of risk in the workplace simultaneously. Using a bifactor model, one can separate the specific effects of each subfactor from the general factor of WOAQ and determine the most plausible construct for an immediate, more feasible intervention. In more costly or complicated situations, practitioners or policy makers could take advantage of such bifactor modeling to determine the most plausible sub-constructs for achieving improvement in the short term.

Unsurprisingly, the results also indicated that the conventional reliability coefficient alpha was slightly overestimated compared to the omega total and omega hierarchical coefficients. This is consistent with the results of previous studies in other disciplines (Gignac, 2007, 2014; Reise, Bonifay, & Haviland, 2012). These scholars showed that in models violating the assumptions required for a reliable alpha, the coefficient will often be overestimated. It is therefore highly recommended that in future studies, especially those involving complicated multidimensional models, scholars should by default use model-based reliability coefficients. Although this is deemed most critical in clinical or health studies, it is important for scholars of all disciplines to use these more accurate reliability assessments in order to avoid serious errors.

8.2 Discussion: Study of Paramedics - Cross-Validation of the Bifactor Model of WOAQ

The cross-validation results obtained using the paramedic sample indicate that the bifactor WOAQ model is a valid tool in another, very different, health setting. It appears that a bifactor model not only fits better than a higher order model, but also highlights the importance of the subscales relative to a single general factor in a health setting.

Moreover, the results suggest that this model can be used with both male and female paramedics. However, although these results demonstrate good validity for the WOAQ across gender, the mean differences between males and females were found to be significant. The scores on two of the five nested constructs and on the general factor of WOAQ were significantly higher for male employees than for female employees, demonstrating that male employees in this paramedic organisation are happier than female employees with the 'quality of relationships with co-workers', the system of 'reward and recognition', and the general quality of WOAQ. These results have important implications in practice, specifically in relation to the occupational health and safety of paramedics, particularly female paramedics.

Overall, the WOAQ, especially the general WOAQ measure, appears to be a superior instrument for assessing risk factors associated with employees' health and health-related behaviour, due to its satisfactory psychometric properties and short length. More importantly, based on the results from the bifactor analysis, it was shown that some of the subscales are more important than others in a health setting. This indicates that concerns about the importance of an identified problem in the work setting can be resolved by fitting the bifactor WOAQ model to assess the importance of various risk factors in the workplace. Ultimately, this will assist management in identifying problem areas which may cause harm to their employees and the organisation, and thus allow proper action to be taken to improve the work environment.

The WOAQ tool can be especially useful when access to specialist occupational health support is limited, because the WOAQ is short and easy to use and, in practical terms, can directly inform workplace interventions to improve employee health and well-being. Furthermore, by directly informing the development of targeted workplace interventions to improve the psychosocial factors and work conditions for paramedics, the WOAQ offers the potential to help avoid the structural labour force shortages experienced in this area, especially in the developed world.

8.3 Strengths and Limitations

The strength of the study lies in the context of the research and the methodology used. This can be elaborated in five key points.

Firstly, to the best of the researchers' knowledge, this study is one of the first to compare a conventional higher order model with a bifactor model of the WOAQ in a health setting. The methodology also has theoretical and practical implications for other organisational studies. Conventional higher order modeling is based on full mediation of item effects by first-order sub-constructs. In practice and in real-life situations, especially in organisational studies such as those using the WOAQ, this assumption has limited applicability. Bifactor modeling assumes partial mediation, which is much closer to reality.

Depending on the nature of their work (e.g. manufacturing vs. non-manufacturing) and occupation types, organisations will differ significantly in regard to the WOAQ. A risk assessment tool like the WOAQ is very useful for assessing organisational risk factors. However, in practice, not all of the WOAQ subfactors are plausible or important, as was found in this study. Therefore, in the work setting, bifactor modelling of the WOAQ is deemed more appropriate, as its results relate well to real-life expectations.

The second key point is that this study considered only the most suitable fit indices, based on the degree of penalty included for model complexity. These indices (i.e. RMSEA, NNFI/TLI and AIC), and the differences between them, have been demonstrated in this empirical study for interpreting the complex model of the WOAQ.

The third key point is that the study used model-based reliability coefficients. Taking into account the multidimensional nature of the WOAQ, i.e. both the general and the five-factor model of WOAQ, omega reliability coefficients were used to assess measurement reliability. Using the omega model-based reliability measures rather than the conventional coefficient alpha is recommended for multidimensional models such as the WOAQ.

The fourth key point is that this is one of the first studies conducted for a group of Australian employees in a health setting, as opposed to the manufacturing setting for which the original WOAQ was developed. No previous studies have been completed in a health setting using a comprehensive, short risk assessment scale similar to the WOAQ. This study therefore opens a critical avenue for further research on the WOAQ.


As the final key point, the WOAQ is a useful tool in practice because of its ability to provide an organisational risk assessment using only 28 items, meeting workplace requirements in terms of cost, time and resources. Using bifactor modeling, the most plausible subfactors were identified for improving the organisational risk environment in a health setting.

One limitation of this study is that it focuses only on health professionals. Further studies are needed to extend the approach to other non-manufacturing or 'blue collar' occupations.

In spite of the importance of the omega reliability coefficient, there is still no detailed guideline on the cutoff points for interpreting omega for general scales and for subscales. Reise et al. (2012) suggested a minimum cutoff of .50, but this is not yet backed by significant evidence. Further studies are needed to shed more light on this issue.

The lack of background literature in an Australian context for the use of bifactor modeling of WOAQ makes it difficult to evaluate or compare the results with other studies.

Further studies are needed to fill this gap.

8.4 Summary and Conclusion

In this study, attempts were made to assess the validity and reliability of the WOAQ in an Australian health setting, using robust methodological procedures. Based on the literature, several robust procedures were adopted for assessing the validity of the WOAQ, including a comparison of the conventional higher order model of the WOAQ with a bifactor model and the testing of model-based reliability.


In general, the results showed that the WOAQ appears to be a superior instrument for assessing risk factors associated with employees' health and health-related behaviour, due to its satisfactory psychometric properties and short length. Although the general factor of WOAQ seems to be the dominant factor, some evidence of multidimensionality was found, and some subfactors appeared to play more critical roles in risk assessment in a nursing setting. The cross-validity of the scale was demonstrated when these results were replicated on a paramedic sample in another, very different, health setting. However, interesting differences in mean values for male and female paramedics indicated that this is a gender-sensitive assessment tool.

In conclusion, this study adds to the evidence supporting the feasibility of the WOAQ for both research and practice in a range of settings. However, future research should continue to validate the WOAQ with other occupational groups and sectors using a bifactor model.


9

STUDY 2: APPLICATIONS OF COVARIATE-DEPENDENT RELIABILITY

The purpose of Chapters 9 to 11 is to empirically demonstrate Bentler's (2014) approach to covariate-dependent reliability. There are two main proposed applications: the first considers the effects of potential covariates on scale reliability, while the second demonstrates the effects of Common Method Bias (CMB) on scale reliability. Different data sets were used for these two applications; the WOAQ data was used for the first application.

For the second application, a student study of social desirability, emotional intelligence, wellbeing and alcohol drinking behaviour was used. These applications are described in this chapter, but the actual analyses are left until Chapter 10, with the discussion following in Chapter 11.

9.1 Rationale and Objectives

9.1.1 Application of Covariate-dependent Reliability in Reliability Assessments

In 2012 (personal conversations), Bentler introduced the concepts of covariate-dependent and covariate-free reliability, which partition total reliability into two parts: the first part relates to external covariates, while the second part is unaffected by such covariates (covariate-free reliability). The approach was formally presented in 2014 (Bentler, 2014).

The following material on covariate-dependent reliability was adapted from either personal conversations with Bentler (2012, 2013) or Bentler (2014). Only the practical application of this concept was assessed in this study, using the data previously described in Chapter 7 and a second student data set relating to social desirability, emotional intelligence, wellbeing and alcohol drinking behaviour.

9.1.1.1 First Application of Covariate-dependent Reliability.

Based on the above development, covariate-dependent and covariate-free reliability can be evaluated for the bifactor model of WOAQ, using the nursing/paramedic group variable as a covariate. Although the model-based reliability of the WOAQ has been found to be acceptable within both the nursing and paramedic organisations (within-organisation assessment), an evaluation of reliability across organisations has yet to be established. It is hypothesised that, although both organisations are health related, due to differences in the nature of the work and the demographic characteristics of the paramedic and home-based nursing organisations, the type of organisation will affect the reliability of the WOAQ. Hence, the home-based nursing organisation and the paramedic organisation must be compared in a reliability assessment of the bifactor model, as illustrated in Figure 9.1.


Figure 9.1. Covariate-dependent reliability assessment with the bifactor model of WOAQ across the nursing and paramedic organisations.


Both nursing and paramedic occupations can be categorised as providing clinical care, however the nature and demands of these two occupations are very different. A clinician in the home-based nursing service provides services over a period of time that is defined by a client’s needs. While these clients may have acute clinical needs, they are generally medically stable, and often have been discharged from hospital as they no longer require the acute clinical care provided in the hospital setting. In contrast, paramedical practitioners are called to respond quickly to clients in need of urgent medical care. The paramedics have short, intensive interactions with their clients who are often acutely ill. Clearly, the demands and expectations are different for each of these professions, and consequently for the organisations in which they work. While both professional groups complete most of their work away from their formal organisational settings, the nature of their interactions with their clients is fundamentally different. Typically, the nurses interact with the clients in their own homes while paramedics work with patients in a wide variety of settings where urgent medical response is required. The nurses have the opportunity to ‘get to know’ the clients and interact with them over time, while the paramedics normally interact for only a single short term episode, during which the clients may not even be responsive.

In addition to the different nature of work and workplace demands, the two organisations have different demographic characteristics. For example, in this study the majority of home-based nurses are female while the majority of paramedics are male. Also, in comparison to the community nursing organisation, the paramedic organisation has more part-time workers and a significantly lower average age of workers.

When there is a group covariate, such as organisation type, that affects a latent factor (WOAQ in this case), the question is whether there are mean differences in the latent factor as a function of the group covariate. As mentioned previously, covariate-dependent reliability is a measure of the group differences in the trait being measured relative to total variation (Equation 3.12). Covariate-free reliability is a measure of the individual differences relative to total variation, freed from any mean differences due to the covariate(s) (Equation 3.11). None of the omega reliability coefficients based on the WOAQ bifactor model introduced previously have partitioned the variance into its covariate-dependent and covariate-free parts.
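Following the logic of Equations 3.11 and 3.12, the partition can be sketched with hypothetical variance components: the reliable variance splits into a part explained by the group covariate and a covariate-free remainder, each taken relative to total variance. All numbers below are illustrative, not the fitted WOAQ estimates.

```python
# Hypothetical variance components for a single construct
var_covariate = 0.6   # factor variance explained by organisation type
var_free = 2.4        # factor variance independent of the covariate
var_error = 1.0       # measurement-error variance

total = var_covariate + var_free + var_error

rho_cd = var_covariate / total   # covariate-dependent reliability (Eq. 3.12)
rho_cf = var_free / total        # covariate-free reliability (Eq. 3.11)
rho_total = rho_cd + rho_cf      # total model-based reliability
```

The two components sum to the total reliability, so a large covariate-dependent share signals that the scale's apparent reliability rests partly on group mean differences rather than on individual differences.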

Based on the above information, it is reasonable to argue that, due to differences in the nature of the work and the demographic characteristics of the paramedic and home-based nursing organisations, the type of organisation will influence the reliability of work organisation assessments, such as the Work Organisation Assessment Questionnaire (WOAQ), that measure the psychosocial and physical aspects of an organisation. As a result, it was hypothesised that:

Hypothesis 9.1: The type of organisation (home-based nursing vs. paramedic) will be one of the possible covariates affecting the model-based reliability coefficients of the WOAQ.

Method. The data used for Study 1 (home-based nursing and paramedics) were used to demonstrate the application of a covariate-dependent (here organisation-dependent) reliability assessment of WOAQ.

The procedure proposed by Bentler (2014), and fully discussed in Chapter 3, was used to calculate the covariate-dependent and covariate-free coefficients of the WOAQ in this study. This procedure is only available in EQS, and only for higher order models. The calculation for bifactor models is not yet implemented in EQS, so all the calculations for the bifactor model were conducted manually.


9.1.2 Second Application of Covariate-dependent Reliability for Demonstrating CMB

There is a general belief among scholars that measurement error is a source of many problems in research (Podsakoff, MacKenzie, Lee, & Podsakoff, 2003). Measurement error has the capacity to misrepresent and confound the empirical findings of research, causing erroneous conclusions to be drawn (Bagozzi & Yi, 1991). This issue becomes more salient when researchers rely on a single source of data collection and self-report measures (Glick, Jenkins, & Gupta, 1986; Meade, Watson, & Kroustalis, 2007). The widespread use of single-source data collection tools is a potential concern in relation to common method variance (CMV), which has been of interest to psychology since the 1950s.

One of the most popular and convenient procedures for collecting data in psychology is the self-rated questionnaire (Malhotra, Kim, & Patil, 2006). It is also common practice in psychology for data to be gathered from a single self-rated questionnaire (Avolio, Yammarino, & Bass, 1991). As a result, CMV appears to be a common problem in psychological studies (Malhotra et al., 2006). Yet, despite its long history in the field of psychology, there seems to be a gap in the literature on CMV: more attention has been paid to post-hoc statistical remedies, while the causes of the bias have been neglected.

Although a few researchers have investigated the consequences of CMV for measurement models, only a limited number of studies have tried to determine the effects of CMV on reliability (Williams et al., 2010). The goal of the present study was to address this gap in the literature, using the newly developed procedure termed 'covariate-dependent and covariate-free reliability' (Bentler, 2014). Using this procedure, CMV was introduced as a covariate for the study scales. If any covariate-dependence is identified among the study scales, then we can conclude that CMV exists and needs to be controlled for in the data analysis.

Acquiescence is a potential source of the common method bias (CMB) that results from self-report surveys (Spector, 2006). According to Winkler, Kanouse and Ware (1982), acquiescence response sets refer to the propensity of the respondent to indicate agreement with items on the questionnaire independent of content.

Results from self-report measures are also susceptible to social desirability bias, which describes the inclination of respondents to complete a questionnaire in a way that presents them in a positive light and conforms to the norms and standards defined by their culture (Donaldson & Grant-Vallone, 2002; Ganster, Hennessey, & Luthans, 1983; Podsakoff & Organ, 1986). Responses to questions are usually determined by the level of social desirability (Schriesheim, Kinicki, & Schriesheim, 1979) inherent in the items of a questionnaire. This form of bias usually serves to hide the true bivariate relationships between variables and interferes with the interpretation of average tendencies as well as individual differences (Ganster et al., 1983; Podsakoff et al., 2003).

A preventive technique for detecting and controlling CMB can be used when the assumed cause of the method bias is known to the researcher and can be identified and measured. For example, this is commonly the case for social desirability. This preventive technique involves the inclusion of the CMB measure as a covariate with the study variables.

This also allows the effects of the surrogate measure for CMB (e.g. a social desirability scale) on the reliability of the study measures to be assessed.


9.1.3 The Effects of CMB and CMV on the Reliability of Measures

CMB and CMV may affect the validity of a study (Doty & Glick, 1998). They have the ability to confound the true relationship between variables, resulting in a bias between the observed and the true relationships by either inflating or deflating the estimates (Doty & Glick, 1998). Although assessing the presence and quality of CMB provides important information about the effects on the parameter estimates, it can also be used to demonstrate the effects on scale reliability. Williams et al. (2010) proposed that the estimation of reliability should be achieved by evaluating the decomposition of the overall reliability, both with and without a "marker variable" reflecting CMB. The reliability decomposition (composite reliability) formula was originally proposed by Werts, Linn, and Joreskog (1974).

Overall reliability = F / (F + E)          (Equation 9.1)

where F = the squared sum of the factor loadings, and E = the sum of the error variances.
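Equation 9.1 can be computed directly from a fitted single-factor solution. A minimal sketch with hypothetical standardised loadings (the four values below are illustrative only):

```python
def composite_reliability(loadings, error_variances):
    """Werts, Linn and Joreskog (1974): rho = F / (F + E), with
    F = (sum of loadings)^2 and E = sum of error variances."""
    f = sum(loadings) ** 2
    e = sum(error_variances)
    return f / (f + e)

# hypothetical standardised loadings for a four-item scale
lam = [0.70, 0.60, 0.80, 0.75]
theta = [1 - l ** 2 for l in lam]   # unique variances under standardisation
rho = composite_reliability(lam, theta)
```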

The composite reliability in this instance is equal to coefficient rho for a single factor model and Omega total (Equation 3.9). For the present study, Bentler’s (2014) approach was used to compare the reliability of models with and without the presence of a CMB marker.

The procedure is simple and easy to calculate using EQS (see Chapter 3 for full details on this procedure). Note that, as explained earlier, when rho is used to represent the reliability of a multi-dimensional model it is quantifying the proportion of variance due to the most reliable single dimension in this multidimensional space. Using this procedure, one can obtain estimates of the CMB-dependent reliability, the CMB-free reliability, and the total reliability in one calculation.


This method was used to assess the influence of social desirability on the reliability of the constructs used in a student study of emotional intelligence, wellbeing and alcohol drinking behaviour. It was hypothesised that:

Hypothesis 9.2: A covariate-dependent reliability assessment, using social desirability as a potential source of bias, will demonstrate the effects of CMB on the reliability of the constructs in the study (Figure 9.2).

The main constructs in the data described below contained sensitive questions (emotional intelligence, wellbeing, and alcohol drinking habits), sometimes prompting participants to give socially acceptable responses rather than their true opinions. Socially desirable responding could therefore bias the estimated relationships between the constructs of the study (Ganster, Hennessey & Luthans, 1983).

Based on the literature, if factor or item responses are highly correlated with social desirability, then social desirability could be a potential source of bias that needs to be controlled (Podsakoff et al., 2003; Thomas & Kilmann, 1975).

However, the above model can be adapted to control for CMV due to a single survey source as well as CMB, using SEM procedures. By integrating both an unmeasured latent variable for CMV and a directly measured latent method factor representing CMB into the SEM model, CMV and CMB can be evaluated simultaneously. In the context of the above student survey, this method can be used to test for rater bias, including social desirability, as well as single-source survey bias. It was therefore hypothesised that:


Hypothesis 9.3: A SEM integrated approach, including an unmeasured latent common method factor and a directly measured method factor (social desirability), can be used to evaluate the presence of CMV in the above context.

9.2 Method

The data for this study were collected from a group of undergraduate students in one faculty of the participating university. Participant groups were randomly selected, using a list of all the active subjects in the faculty for Semester Two in 2011. Upon receiving the lecturer’s consent, a questionnaire package that included a cover letter, information sheet, consent form, and a questionnaire, was provided to each student during their lecture break.

The information sheet provided assurances that all participant information would remain confidential. Upon completion of the survey, students were asked to place their questionnaires in the locked box provided in the classroom. After discarding the incomplete surveys, the final number of surveys included in the analysis was 341.

9.2.1 Measures. The questionnaire contained questions relating to wellbeing, alcohol drinking behaviour, emotional intelligence, social desirability and demographics.

Emotional Intelligence. Emotional intelligence was measured using the 33-item Self-Report Emotional Intelligence Test (SREIT) (Schutte, Malouff, Hall, Haggerty, Cooper, Golden, & Dornheim, 1998). Respondents rated each item on a five-point Likert scale from 1 (strongly agree) to 5 (strongly disagree). The reliability and validity evidence for this scale has been positively assessed in previous studies (e.g., Schutte & Malouff, 1999; Abraham, 1999; Ciarrochi, Chan, & Caputi, 2000; Petrides & Furnham, 2000).


General Wellbeing. General wellbeing was tested using the General Wellbeing Questionnaire (GWBQ) (Cox, Thirlaway, Gotts, & Cox, 1983). The GWBQ is a 24-item instrument used to measure sub-optimal health, using self-reported symptoms of general malaise. It includes a set of general non-specific symptoms of ill-health, including reportable aspects of cognitive, emotional, behavioural, and physiological function, none of which are clinically significant in themselves. The GWBQ consists of two 12-item subscales of sub-optimum health: (a) worn-out/exhausted and (b) tense/nervous. Respondents were asked to indicate how often they had experienced the 24 listed symptoms within the previous six months on a scale from 0 (never) to 4 (all the time).

Social Desirability. The 16-item Social Desirability Scale (SDS-16) (Stöber, 2001) was used to measure the social desirability of the respondents. The scale includes six reverse-keyed items. The original scale has 17 items, but the item "I have tried illegal drugs (e.g., marijuana, cocaine, etc.)" was excluded because it is not suitable for the measurement of social desirability (Stöber, 2001). The items were parcelled into three scales in order to achieve SEM model identification.
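As an illustration, parcelling can be done by averaging subsets of items into a smaller number of indicators. The allocation of the 16 items to three parcels below is hypothetical (the thesis does not report the allocation), and the responses are simulated:

```python
# Item parcelling sketch: averaging 16 true/false SDS items into three
# parcels to reduce the number of indicators for SEM identification.
# The item-to-parcel allocation and the responses are illustrative only.
import numpy as np

rng = np.random.default_rng(0)
items = rng.integers(0, 2, size=(341, 16))   # hypothetical 0/1 responses

# Sequential allocation into three parcels (items 1-6, 7-11, 12-16):
parcels = np.column_stack([
    items[:, 0:6].mean(axis=1),
    items[:, 6:11].mean(axis=1),
    items[:, 11:16].mean(axis=1),
])
print(parcels.shape)   # (341, 3): three parcel scores per respondent
```

Each parcel score then enters the measurement model as a single observed indicator of the social desirability factor.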

Alcohol Drinking Behaviour Screening. The World Health Organisation's Alcohol Use Disorders Identification Test (AUDIT) is a tool for screening alcohol drinking behaviour. AUDIT was originally developed by Saunders, Aasland, Babor, de la Fuente and Grant in 1993 and has been validated extensively across different populations. It consists of three items on alcohol consumption, three on drinking behaviour and dependency, and four on the consequences or problems related to drinking. The items were parcelled into three parcels to achieve model identification.


9.2.2 Overview of analysis. Confirmatory factor analysis (CFA) was conducted to evaluate the proposed models. EQS 6.2 (build 100) and standard fit indices (CMIN/DF, CFI, NNFI, and RMSEA) were used to evaluate model fit. For reliability assessment and comparison, coefficient Omega and the covariate-dependent and covariate-free reliability coefficients were calculated as described in Chapter 3.
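For reference, these indices can be computed from the model and independence-model chi-square statistics. The sketch below uses textbook formulas for CFI, NNFI (TLI) and RMSEA with illustrative chi-square values, not the EQS output for this study:

```python
# Standard fit indices from model (m) and independence/baseline (b)
# chi-square statistics. The inputs below are illustrative values only.
import math

def fit_indices(chi2_m, df_m, chi2_b, df_b, n):
    cmin_df = chi2_m / df_m
    cfi = 1 - max(chi2_m - df_m, 0) / max(chi2_b - df_b, chi2_m - df_m, 1e-12)
    nnfi = (chi2_b / df_b - chi2_m / df_m) / (chi2_b / df_b - 1)   # a.k.a. TLI
    rmsea = math.sqrt(max(chi2_m - df_m, 0) / (df_m * (n - 1)))
    return cmin_df, cfi, nnfi, rmsea

cmin_df, cfi, nnfi, rmsea = fit_indices(chi2_m=150, df_m=100,
                                        chi2_b=1500, df_b=120, n=341)
print(f"CMIN/DF={cmin_df:.2f}, CFI={cfi:.3f}, NNFI={nnfi:.3f}, RMSEA={rmsea:.3f}")
```

Conventional cut-offs (CFI and NNFI above roughly .90-.95, RMSEA below roughly .06-.08) are then applied to these values, as in the tables that follow.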

To evaluate hypothesis 9.3, both constrained (equal method-factor loadings) and unconstrained (free method-factor loadings) models were assessed to determine whether CMV exists and whether it has equal effects on the constructs of the study. A partial correlation technique introduced by Lindell and Whitney (2001) can be used to test for CMV. In this procedure, a 'marker variable' representing CMV is included in the analysis, and the association between the marker variable and any construct in the model is used as an estimate of CMV. This allows all correlations among the constructs of the study to be corrected for CMV using a partial correlation adjustment (Williams, Hartman, & Cavazotte, 2010). This method is called the correlational marker technique. Building on the partial correlation procedure of Lindell and Whitney (2001), Richardson et al. (2009) and Williams, Hartman, and Cavazotte (2010) developed a structural equation modelling procedure for capturing and adjusting for CMV. This marker variable procedure using CFA was employed to evaluate hypothesis 9.3.
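The partial correlation adjustment itself is simple. The sketch below applies the Lindell and Whitney (2001) correction to made-up correlations, where r_m is the marker-based CMV estimate:

```python
# Lindell & Whitney (2001) correlational marker adjustment: the marker-based
# CMV estimate r_m is partialled out of each observed correlation.
# The correlations below are made-up values for illustration.

def cmv_adjusted(r_ij, r_m):
    """CMV-corrected correlation between constructs i and j."""
    return (r_ij - r_m) / (1 - r_m)

r_observed = 0.40   # observed correlation between two substantive constructs
r_marker = 0.10     # smallest marker-construct correlation, taken as CMV
print(round(cmv_adjusted(r_observed, r_marker), 3))   # 0.333
```

The correction shrinks every construct correlation by the same amount, which is exactly the equal-method-effects assumption that the SEM-based marker models below relax.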

In SEM, ML is used for parameter estimation by default. When the sample size is large and the data are normally distributed, ML provides the most accurate estimates with the smallest standard errors (Bentler, 2006). However, ML is sensitive to departures from normality, so an assessment of normality is an essential requirement when using this procedure. Although the preliminary assessment showed a relatively normal distribution for the data, Mardia's normalised coefficient was high (G2,P = 216.79), indicating a violation of the normality assumptions. Outliers were detected in a further analysis; however, deleting these observations did not significantly improve the fit indices. As a result, all cases were kept and a robust test statistic was used to evaluate the model. The Satorra-Bentler (1988, 1994) chi-square test delivers a more accurate assessment of model fit when the data do not have a normal distribution.
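Mardia's multivariate kurtosis and its normalised estimate can be computed directly. The sketch below uses simulated multivariate normal data; for such data the kurtosis should be near p(p+2) and the normalised estimate near zero:

```python
# Mardia's multivariate kurtosis and its normalised estimate, the statistic
# EQS reports for the multivariate normality check. Simulated data only.
import numpy as np

def mardia_kurtosis(X):
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    S_inv = np.linalg.inv(np.cov(X, rowvar=False, bias=True))
    d2 = np.einsum('ij,jk,ik->i', Xc, S_inv, Xc)   # squared Mahalanobis distances
    b2p = np.mean(d2 ** 2)                          # Mardia's kurtosis
    # Normalised estimate: asymptotically standard normal under normality.
    z = (b2p - p * (p + 2)) / np.sqrt(8 * p * (p + 2) / n)
    return b2p, z

rng = np.random.default_rng(1)
X = rng.normal(size=(341, 5))
b2p, z = mardia_kurtosis(X)
print(round(b2p, 2), round(z, 2))   # near p(p+2) = 35 and near 0 for normal data
```

A normalised estimate far above zero, like the 216.79 reported here, indicates heavy multivariate tails and motivates the Satorra-Bentler robust correction.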


10

STUDY 2: RESULTS

In this chapter the two applications described in the previous chapter are demonstrated. The application relating to reliability assessment is illustrated using the WOAQ data, and the application relating to CMB (in the form of social desirability) and CMV (in the form of a single survey source) is illustrated using the student data for emotional intelligence, wellbeing, alcohol drinking behaviour and social desirability.

10.1 Results of Application for Reliability Assessments – The study of WOAQ

Because hypothesis 9.1 states that the type of organisation will affect the reliability of the WOAQ, organisation was added to the validated bifactor model of the WOAQ as a covariate, allowing the evaluation of the effect of organisation on the reliability of the WOAQ assessment tool (see Figure 9.1).

10.1.1 Descriptive statistics at item level

At the first stage, the validity of the model was assessed before proceeding with the reliability assessment. As shown in Table 10.1, the data at item level are relatively normal. All skewness coefficients were less than two and all kurtosis coefficients less than seven, demonstrating reasonable normality at item level (West, Finch, & Curran, 1995).
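This item-level screen is straightforward to reproduce. The sketch below applies the |skewness| < 2 and |kurtosis| < 7 rule to simulated item responses, not the WOAQ data:

```python
# Item-level normality screen: skewness < 2 and (excess) kurtosis < 7 are
# treated as acceptable (West, Finch, & Curran, 1995). Simulated data only.
import numpy as np

rng = np.random.default_rng(2)
items = rng.normal(size=(1255, 4))   # hypothetical responses for four items

def skew_kurt(x):
    m, s = x.mean(), x.std()
    skew = ((x - m) ** 3).mean() / s ** 3
    kurt = ((x - m) ** 4).mean() / s ** 4 - 3   # excess kurtosis, normal = 0
    return skew, kurt

for j in range(items.shape[1]):
    sk, ku = skew_kurt(items[:, j])
    print(f"item {j + 1}: skew={sk:.2f}, kurtosis={ku:.2f}, "
          f"acceptable={abs(sk) < 2 and abs(ku) < 7}")
```

Note that the excess-kurtosis convention (normal = 0) is used here; if a package reports raw kurtosis (normal = 3), the threshold shifts accordingly.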

10.1.2 Descriptive statistics at group level

As discussed before, multivariate normality in EQS can be evaluated using Mardia's multivariate skewness and kurtosis tests (Bentler & Wu, 2002). Because Mardia's coefficient (G2,P = 116.64; normalised estimate = 50.44) indicated violation of the normality assumptions, robust test statistics were used to evaluate the model.

Table 10.1 The Descriptive Characteristics of the Main Study Constructs and Parameters (n=1255)

Constructs                                        Mean   SD     Skewness   Kurtosis
WOAQ - quality of relationships with management   2.72   .90    .27        -.52
  Item 3                                          3.18   1.04   -.09       -.57
  Item 5                                          3.10   1.27   -.06       -1.09
  Item 7                                          2.63   1.13   .20        -.76
  Item 11                                         2.42   1.21   .44        -.86
  Item 16                                         2.15   1.19   .79        -.41
  Item 17                                         2.85   1.06   .08        -.60
  Item 22                                         3.06   1.13   -.03       -.79
  Item 26                                         2.52   1.12   .18        -.89
  Item 27                                         2.65   1.13   .19        -.75
WOAQ - reward & recognition                       2.60   .82    .41        -.41
  Item 12                                         2.23   1.09   .54        -.58
  Item 13                                         2.62   1.18   .29        -.90
  Item 14                                         3.32   .97    -.17       -.26
  Item 21                                         2.35   1.02   .29        -.60
  Item 23                                         2.63   1.18   .26        -.93
  Item 24                                         2.08   1.18   .82        -.38
  Item 25                                         2.96   1.07   -.07       -.62
WOAQ - workload issues                            2.65   .90    .09        -.49
  Item 6                                          2.47   1.14   .38        -.82
  Item 8                                          2.30   1.08   .46        -.70
  Item 15                                         2.56   .79    .48        1.69
  Item 19                                         2.90   1.12   .00        -1.2
WOAQ - quality of relationships with colleagues   3.92   .85    -.46       -.49
  Item 10                                         3.93   .88    -.42       -.54
  Item 28                                         3.92   .94    -.60       -.25
WOAQ - quality of physical environment            2.54   .81    .45        -.16
  Item 1                                          2.60   1.26   .39        -.94
  Item 2                                          2.73   1.12   .31        -.69
  Item 4                                          2.46   .97    .75        .40
  Item 9                                          2.55   1.10   .47        -.54
  Item 18                                         2.42   1.05   .56        -.37
  Item 20                                         2.49   1.08   .38        -.53
Total                                             2.73   .71    .32        -.29


Table 10.2 summarises the characteristics of the nursing and paramedic organisations, showing some demographic differences. In terms of gender, there was a greater percentage of females in the nursing organisation (94.5%), while there were more males in the paramedic group (65.9%). The majority of the paramedics were part-time employees (94.7%), had more than six years of experience (62.6%), and their average age was lower than that of the nurses (40 vs. 45 years).

Table 10.2 Nursing and Paramedic Demographic Characteristics

                                     Nursing (n=312*)   Paramedics (n=945)
                                     Frequency (%)      Frequency (%)
Gender: Male                         17 (5.5)           623 (65.9)
        Female                       290 (94.5)         322 (34.1)
Employment status†: Part-time        183 (60.0)         895 (94.7)
                    Full-time        123 (40.0)         48 (5.1)
Years of experience: < 1 year        2 (.7)             92 (9.8)
                     1-3 years       7 (2.3)            127 (13.5)
                     4-6 years       16 (5.2)           133 (14.1)
                     > 6 years       282 (91.9)         588 (62.6)
Age: Mean (Range)                    45 (22-77)         40 (21-65)

Note. * Due to some missing data, n varies between 306 and 312. † Due to their very low percentage (7% in the nursing organisation and 1.3% in the paramedic organisation), casual employees have been allocated to the part-time categories.

Further analysis was conducted to see if the demographic differences between the organisations were statistically significant. Chi-squared tests of association were carried out to compare gender ratios, employment status ratios (part-time vs. full-time), and years of experience between the two organisations. The results showed significant differences between the paramedic and nursing organisations in terms of the gender of employees, level of experience and employment status (p < 0.05).
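For example, the gender-by-organisation comparison can be reproduced from the Table 10.2 frequencies; the sketch below uses scipy's chi2_contingency (which applies Yates' continuity correction for 2x2 tables):

```python
# Chi-squared test of association for gender by organisation, using the
# observed frequencies reported in Table 10.2.
from scipy.stats import chi2_contingency

#               Male  Female
table = [[17,   290],    # Nursing
         [623,  322]]    # Paramedics
chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2={chi2:.1f}, df={dof}, p={p:.3g}")   # p < 0.05, as reported
```

The same call, fed the employment-status and experience frequencies, reproduces the other two tests.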

Table 10.3 Mean Age Differences between Nursing and Paramedic Organisations

Organisation   N     Mean   SD      t       p
Nursing        308   45     9.54    7.77*   0.001
Paramedic      942   40     10.88

Note. * Equal variances were not assumed.

The results of the t-test showed that the mean age difference was significant, with the paramedics being on average younger than the nurses (Table 10.3). It is therefore evident that the two organisations have significantly different demographic characteristics.
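The reported t-test can be reproduced from the Table 10.3 summary statistics, assuming a Welch (unequal-variances) test as the footnote indicates:

```python
# Welch t-test for the age difference, reproduced from the summary
# statistics in Table 10.3 (equal variances not assumed).
from scipy.stats import ttest_ind_from_stats

t, p = ttest_ind_from_stats(mean1=45, std1=9.54, nobs1=308,
                            mean2=40, std2=10.88, nobs2=942,
                            equal_var=False)
print(f"t={t:.2f}, p={p:.3g}")   # t close to the reported 7.77
```

The small discrepancy from the reported 7.77 reflects the rounding of the means and standard deviations in the table.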

10.2 Model Fit Evaluation

In the next step, the model fit was evaluated for the whole population (combined nursing and paramedics) and separately for each organisation. Only if the fitted models described the data well could the reliability assessment proceed.

The bifactor model of WOAQ was assessed separately for each organisation. Table 10.4 shows adequate model fit for the bifactor model for the nursing organisation (RMSEA = 0.04, NNFI = 0.93, as reported in Chapter 7) and the paramedic organisation (RMSEA = 0.05, NNFI = 0.93).


Table 10.4 Summary of Model Fit Statistics for the Baseline Bifactor Model of WOAQ across Organisations

Model        SB χ2/df   AIC      CFI    RMSEA              NNFI
Nurses       1.71       89.66    0.94   0.04 (0.04-0.05)   0.93
Paramedics   3.77       565.32   0.93   0.05 (0.05-0.06)   0.93

Note. RMSEA = Root Mean Square Error of Approximation; NNFI = Non-normed Fit Index.

The model fit for both organisations combined (Table 10.5) was also good (RMSEA = 0.03, NNFI = 0.96, CFI = 0.97), so it was appropriate to proceed with the reliability assessment of the model.

Table 10.5 Summary of Model Fit Statistics of the Bifactor Models of WOAQ (n=1257)

Model                   SB χ2/df   RMSEA            NNFI   CFI
0. Independence model   52.91      -                -      -
1. Bifactor model       2.83       0.03 (.03-.04)   .96    .97

Note. RMSEA = Root Mean Square Error of Approximation; NNFI = Non-normed Fit Index.

The model-based reliability coefficients Omega total, Omega hierarchical, and Omega subscale were calculated for each organisation separately. As shown in Table 10.6, both organisations demonstrated high reliability for both Omega total and Omega hierarchical, with Omega hierarchical representing the reliability of the general WOAQ factor. The reliabilities for the Omega subscales 'quality of relationships with management' and 'reward and recognition' were also similar for the two samples. However, the reliabilities for 'quality of physical environment', 'workload issues' and 'relationships with colleagues' were quite different in these two samples. The 'quality of physical environment' and 'workload issues' reliabilities were higher for the nurses, while the reliability of the 'relationships with colleagues' construct for the paramedics was almost double that for the nurses.

Table 10.6 WOAQ Reliability Statistics for Nursing (as reported in Chapter 7) and Paramedic Organisations

                                           Nursing             Paramedics
Constructs                                 ωt    ωh    ωs       ωt    ωh    ωs
General WOAQ                               .92   .87   -        .94   .89   -
Physical environment                       -     -     .51      -     -     .42
Relationships with colleagues              -     -     .35      -     -     .70
Quality of relationships with management   -     -     .16      -     -     .22
Reward & recognition                       -     -     .15      -     -     .13
Workload issues                            -     -     .39      -     -     .29
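A sketch of how these omega coefficients follow from bifactor loadings is given below. The loadings are illustrative, not the WOAQ estimates; the subscale coefficient implemented here is the specific-factor share of subscale variance (often written ω_hs), which may differ in detail from the Chapter 3 formulas:

```python
# Omega coefficients for a bifactor model: omega total, omega hierarchical
# (general factor), and a subscale coefficient based on the specific factor
# alone. The loadings below are illustrative values, not WOAQ estimates.
import numpy as np

def bifactor_omegas(gen, spec, groups, errors):
    """gen: general-factor loadings; spec: specific-factor loadings;
    groups: group index per item; errors: error variances."""
    gen, spec, errors = map(np.asarray, (gen, spec, errors))
    groups = np.asarray(groups)
    total_var = (gen.sum() ** 2
                 + sum(spec[groups == g].sum() ** 2 for g in np.unique(groups))
                 + errors.sum())
    omega_t = (total_var - errors.sum()) / total_var
    omega_h = gen.sum() ** 2 / total_var
    omega_s = {}
    for g in np.unique(groups):
        m = groups == g
        sub_var = gen[m].sum() ** 2 + spec[m].sum() ** 2 + errors[m].sum()
        omega_s[int(g)] = spec[m].sum() ** 2 / sub_var   # specific-factor share
    return omega_t, omega_h, omega_s

# Six standardised items, two specific factors of three items each:
gen = [0.6] * 6
spec = [0.4] * 6
groups = [0, 0, 0, 1, 1, 1]
errors = [1 - 0.6 ** 2 - 0.4 ** 2] * 6
ot, oh, os = bifactor_omegas(gen, spec, groups, errors)
print(round(ot, 3), round(oh, 3), round(os[0], 3))   # 0.846 0.692 0.235
```

The gap between omega total and omega hierarchical is the reliable variance carried by the specific factors, which is why low ωs values alongside a high ωh, as in Table 10.6, point to a strong general factor.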

The covariate-free and covariate-dependent reliability coefficients are given in Table 10.7. The model-based reliability coefficient rho shows that although the WOAQ is very reliable for the whole sample (coefficient rho = 0.95), part of this reliability is dependent on organisational type (covariate-dependent coefficient rho = 0.32). Based on these results, the type of organisation accounts for around 33% of the reliability. This indicates that once organisation type is controlled, there is less consistency left in the WOAQ (covariate-free coefficient rho = 0.63). This result suggests that different parameter estimates might be required for the nursing and paramedic samples. This is tested below.

Table 10.7 WOAQ Reliability Statistics across Organisations (n=1257)

Bifactor WOAQ Combined organisations

Coefficient rho .95

Covariate-dependent rho .32

Covariate-free rho .63
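The additive structure of this decomposition can be checked directly from the Table 10.7 values:

```python
# Checking the additive decomposition in Table 10.7: the total model-based
# reliability splits into a covariate-dependent and a covariate-free part
# (Bentler, 2014). Values are the reported coefficients, rounded.
rho_total = 0.95
rho_dependent = 0.32
rho_free = 0.63

assert abs(rho_total - (rho_dependent + rho_free)) < 0.005
share = rho_dependent / rho_total
print(f"organisation type accounts for {share:.1%} of the reliability")
```

The covariate-dependent share (0.32/0.95, about 33%) is the "around one third" figure discussed in the text and in Chapter 11.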

Model 1: Configural model with no constraints. The next step included fitting a configural bifactor model for the nursing and paramedic organisations simultaneously, to determine if the model was appropriate across organisations when no constraints were imposed. Based on the results, the configural model (Table 10.8) showed marginal model fit across organisations (RMSEA = 0.06, NNFI = 0.88), suggesting that there are indeed significant differences in the parameter estimates for these two samples.

Model 2: Invariant loadings. However, after constraining the loadings to be equal for both nursing and paramedics, the results showed good fit (RMSEA = 0.04, NNFI = 0.92).


Table 10.8 Invariance Testing Across Organisations for the Bifactor Model of WOAQ

Models                                        SB χ2/df   AIC       CFI   RMSEA           NNFI
Model 1 - Configural model (no constraints)   3.73       1086.76   .90   .06 (.06-.07)   .88
Model 2 - M1 + loadings invariance            2.68       478.16    .93   .04 (.04-.05)   .92
Construct mean differences                    2.99       698.89    .95   .05 (.05-.06)   .94

Note. RMSEA = Root Mean Square Error of Approximation; NNFI = Non-normed Fit Index; CFI = Comparative Fit Index.

Group Differences for Construct Mean. The mean differences for the nursing and paramedic organisations were therefore considered for the general factor of WOAQ and the nested five sub-factors, with the paramedic organisation means selected as the reference category. The construct means for the paramedic organisation were therefore set to zero, while the construct means associated with the nursing were tested, providing an estimate for the mean differences between the groups for all the constructs.

After setting equality constraints on the loadings and intercepts of the measured variables, with factor intercepts of zero for the paramedics only, the results showed a good fit for the model (RMSEA = 0.05, NNFI = 0.94). The mean differences between nurses and paramedics lay primarily on two of the nested constructs ('relationships with colleagues' and 'workload issues'). The mean scores for 'relationships with colleagues' were higher for the paramedics than for the nurses, while the mean scores for 'workload issues' were lower for the paramedics, confirming that the parameter estimates do differ for these two samples and explaining why the covariate-dependent reliability is so high. The reasons for these differences will be explored in the discussion in Chapter 11.

We now return to the second application, in which the effects of CMB, measured using a social desirability scale, and CMV, due to a single survey source, are evaluated for a student study of emotional intelligence, wellbeing and alcohol drinking behaviour.


10.3 Application in Demonstrating CMB using Social Desirability

As explained in the previous chapter, covariate-dependent reliability can be used for common method bias evaluation. This section reports the results for a different sample (students) with the purpose of demonstrating possible CMB. The demographic characteristics of the participants are presented in Table 10.9.

Table 10.9 Summary of the Demographic Characteristics of the Participants (n=341)

                            %
Gender: Male                18.18
        Female              81.82
Study status: Part-time     2.64
              Full-time     97.36
Age: Mean (SD)              20 (3.98)

The overall reliability of the model, with CMB (social desirability) as a covariate, is illustrated in Figure 10.1.


Figure 10.1. The effects of latent factor bias (social desirability) on the reliability of the constructs. Note. Due to the identification issue, item parcelling was used for the emotional intelligence (EI) construct, allowing four EI subfactors to load as observed variables for this construct.

The effect of social desirability as a source of CMB was assessed by conducting Bentler's covariate-dependent and covariate-free reliability assessment procedures. If there is a difference in the reliability coefficients of the constructs after including CMB as a covariate in the model, then some degree of covariate-dependent reliability exists, and we can conclude that CMB, in the form of social desirability, has biased the reliability of the constructs.


Table 10.10 Comparison of the Covariate-dependent and Covariate-free Reliability Coefficients of the Scales after Including CMB

Reliability coefficients   Overall reliability rho (ωt)   Covariate-dependent reliability   Covariate-free reliability
All scales                 .84                            .18                               .66

As presented in Table 10.10, the CMB variable (social desirability) inflated the reliability of the scales by around 27%. Removing the effect of CMB reduced the reliability of the scales to only 0.66. This result suggests that CMB had a substantial effect on reliability, supporting its existence in this study. Further analysis was conducted in order to also test for common method variance (CMV) using the model shown in Figure 10.2.


Figure 10.2. The proposed model for evaluating CMB/CMV. Note. CMV = common method variance; CMB = common method bias (social desirability).


Model 1. This baseline model, illustrated in Figure 10.3, correlates the three study constructs (wellbeing, alcohol drinking behaviour, and emotional intelligence) with each other, but the CMV and CMB weights are constrained to zero (i.e., are not controlled for). It serves as the comparison model in which there is no control for method bias or variance.


Figure 10.3. Model 1: Baseline model in which all the study constructs are correlated without controlling for CMV and CMB.


Model 2. The second model, illustrated in Figure 10.4, was compared with the baseline model. This is a constrained model in which CMV and CMB were included but the loadings from CMV to the study indicators were constrained to have equal effects. CMB (social desirability) was included in the model as a predictor of CMV, since social desirability was expected to be the main source of bias for the study's self-rated questionnaire when asking about alcohol drinking behaviour and emotional intelligence skills. This model therefore controls specifically for CMB caused by social desirability, as well as for other sources of CMV (e.g., a single survey source).


Figure 10.4. Model 2: Constrained equal loadings from CMV to the study indicators.


Model 3. The third model, illustrated in Figure 10.5, is the same as the previous model except that the loadings from CMV to the indicators were allowed to differ. A comparison of the constrained Model 2 and the unconstrained Model 3 with the baseline model tests the amount of CMV for each of the study constructs individually. A comparison of Model 2 (constrained) with Model 3 examines whether the effects of CMV are equal for all three constructs (wellbeing, alcohol drinking behaviour, and emotional intelligence).

As shown in Table 10.11, both the constrained CMV model (Model 2: SBχ2/df (48) = 2.44, RMSEA = .07, CFI = .86) and the unconstrained CMV model (Model 3: SBχ2/df (39) = 1.45, RMSEA = .04, CFI = .96) describe the data significantly better than the baseline model, which does not account for any common method variance or bias (SBχ2/df (52) = 3.61, RMSEA = .09, CFI = .73). These results suggest that social desirability accounts for part of the method bias.

A comparison of Model 2 and Model 3 also showed that the latter, with varying weights from CMV to the indicators, describes the data significantly better than Model 2 with equal indicator loadings for CMV. The results suggest that CMV has different effects on the indicator loadings for the three constructs (wellbeing, alcohol drinking behaviour, and emotional intelligence), perhaps suggesting that social desirability may not be the only source of CMV.


Figure 10.5. Model 3: Free loadings from CMV to the study indicators.


Table 10.11 Summary of Fit Indices of Comparison Models

Model                  SB χ2 (df)*    χ2/df   CFI   RMSEA (CI)
1. Baseline            188.16 (52)    3.61    .73   .09 (.08, .11)
2. Constrained CMV†    117.19 (48)    2.44    .86   .07 (.05, .08)
3. CMB-CMV             56.62 (39)     1.45    .96   .04 (.01, .06)

Comparison models                         Δχ2      Δdf   p
Baseline vs. Constrained CMV (1 vs. 2)    70.97    4     <0.001
Baseline vs. CMB-CMV (1 vs. 3)            131.54   13    <0.001
Constrained CMV vs. CMB-CMV (2 vs. 3)     60.57    9     <0.001

Note. * Satorra-Bentler scaled chi-square; † loadings from CMV set to be equal in this model.
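The difference tests in Table 10.11 can be reproduced as follows. Note this sketch uses the simple chi-square differences reported in the table; strictly, comparing Satorra-Bentler scaled statistics calls for the scaled difference test, which requires the scaling constants from each run:

```python
# Chi-square difference tests behind Table 10.11, using the reported
# (SB-scaled) statistics and degrees of freedom as given in the table.
from scipy.stats import chi2

models = {"baseline":        (188.16, 52),
          "constrained_cmv": (117.19, 48),
          "cmb_cmv":         (56.62, 39)}

def diff_test(m1, m2):
    (c1, d1), (c2, d2) = models[m1], models[m2]
    dchi, ddf = c1 - c2, d1 - d2
    return dchi, ddf, chi2.sf(dchi, ddf)   # upper-tail p-value

for pair in [("baseline", "constrained_cmv"),
             ("baseline", "cmb_cmv"),
             ("constrained_cmv", "cmb_cmv")]:
    dchi, ddf, p = diff_test(*pair)
    print(f"{pair[0]} vs {pair[1]}: dchi2={dchi:.2f}, ddf={ddf}, p={p:.2g}")
```

All three differences are significant at p < 0.001, matching the table.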


Table 10.12 presents the differences between the standardised loadings of the three constructs (wellbeing, emotional intelligence and alcohol drinking behaviour) when CMV/CMB are controlled. As can be seen, in Model 3 CMV does not have equal effects on the indicator loadings, and, comparing Model 1 and Model 3, emotional intelligence and alcohol drinking behaviour have the most inflated weights when CMV and CMB are not controlled.

Table 10.12 Standardised Factor Loadings for Different Models Compared to Baseline

Indicators                       Baseline    Constrained CMV-CMB   CMB-CMV
                                 Model 1     Model 2               Model 3
Wellbeing: worn-out/exhausted    0.71        0.68                  0.70
Wellbeing: nervous/tense         0.85        0.62                  0.74
EI-Facilitation                  0.40        0.36                  0.19
EI-Regulation                    0.92        0.71                  0.69
EI-Understanding                 0.43        0.56                  0.26
EI-Perceiving                    0.46        0.62                  0.29
Alcohol - 1                      0.55        0.39                  0.11
Alcohol - 2                      0.93        0.66                  0.25
Alcohol - 3                      0.43        0.37                  0.09
Social Desirability - 1          -           0.50                  0.61
Social Desirability - 2          -           0.44                  0.37
Social Desirability - 3          -           0.55                  0.48
CMB -> CMV path                  -           -0.003                0.45


This example supports the hypothesis that the SEM-integrated approach allows control for the effects of common method variance and bias due to social desirability as well as other possible sources of CMV.


11

STUDY 2: DISCUSSION

An overview of the history of reliability assessment was presented in Chapter 3, starting with the single general coefficient for assessing internal consistency reliability published by Cronbach in 1951. The critique of this coefficient and recommendations for improvements were discussed. Although these recommendations may be useful, there are other methods that could be considered. In particular, the newly developed covariate-free and covariate-dependent coefficients (Bentler, 2014) provide insight into the internal consistency of scales when covariates are controlled.

The influence of covariates on rho (a model-based reliability coefficient) and the development of covariate-free coefficients of reliability were described in Chapter 9. An empirical study in Chapter 10 demonstrated the role of organisational type in the reliability of the WOAQ. The WOAQ is a widely used measure for risk assessment in organisations, based on the identification and collection of employee opinion regarding their work and health (Griffiths, Cox, Karanika, Khan, & Toma, 2006). The scale is relatively short, with 28 items. Another student data set (Chapter 10) was used to demonstrate how the effect of CMB on reliability could be evaluated using the covariate-dependent and covariate-free reliability measures. In addition, the effects of CMV and CMB on each of the constructs emotional intelligence, wellbeing and alcohol drinking behaviour were compared in Chapter 10. This chapter provides a discussion of these two applications.


11.1 Discussion: Application in Reliability Assessment of WOAQ

In this section, the reliability and covariate-dependent reliability of the WOAQ are discussed for Australian employees in two separate organisations – a community nursing organisation and a paramedic organisation. The WOAQ was validated as a bifactor measure in Chapter 7, including a general measure of WOAQ and five nested subfactors, each representing a different dimension of work organisation risk assessment. The five nested subfactors were: 'quality of relationships with management', 'reward and recognition', 'workload issues', 'quality of relationships with colleagues', and 'quality of physical environment'. Although the employees in both organisations provide clinical services, the nature and demands of the occupations are very different. In Chapter 10 the results of Bentler's covariate-dependent approach to reliability assessment showed that almost one third of the reliability was accounted for by the type of organisation.

The invariance testing supported this conclusion. It was found that ‘relationship with colleagues’ was much more important to paramedics than to nurses. This can be explained by the nature of work done by the paramedics. The service delivery for paramedics is based on teamwork; without teamwork the quality of work would be affected. Therefore, it makes complete sense that for paramedics their relationship with their colleagues had higher loadings than was the case for home-based nurses.

In contrast, workload issues were less important to paramedics than to nurses. The items captured in the workload construct covered pace of work, workload, and the impact of work on family and of family on work. When comparing the demographic characteristics of the two organisations, it is not surprising that workload issues had less importance for paramedics than for nurses. One main reason is working hours: the majority of the paramedics worked part-time, while the majority of the nurses worked full-time. Also, the majority of the nurses were female and the majority of the paramedics were male. The literature on work-family conflict demonstrates that women, both as employees and caregivers, tend to be more exposed to conflict in juggling their work and family demands. It is therefore likely that the impact of work-family conflict and workload is less overwhelming for the paramedics, who are mainly male and work part-time, than for the nurses, who are mainly female and work full-time.

'Pace of work' was another item included in the workload construct. Because paramedics work in a fast-paced environment, it is possible that individuals attracted to such working conditions tend to become paramedics, so the pace of work does not bother them as much as it does home-based nurses. Both the home-based nursing staff and the paramedics provide their services outside their formal organisational settings. However, the nurses' work sites tend to be the homes of their clients. The atmosphere is one of trust, as the nurses and their clients interact over a number of clinical treatments during an extended period of time. In contrast, the paramedics have a variety of work sites, ranging from road, school and workplace accidents through to nursing homes and clients' homes. Their interactions are by definition urgent and filled with emotion, and often their clients are unable to interact with the paramedics. For these two seemingly similar organisational types, there are very significant differences in the work environment that influence the responses to both the 'Your Work' and 'Your Well-being' components of the WOAQ.


The above comparisons show how the type of organisation and the nature of the work affect the way work organisation is assessed. As the WOAQ example has demonstrated, Bentler's covariate-dependent and covariate-free reliability assessment could have many benefits in practice, allowing scholars and researchers to extract meaningful information from these measures. The results of Study 1 and Study 2 clearly show that the WOAQ is a useful tool for assessing different aspects of work organisations in health settings. However, different types of organisations put different weightings on the parameters assessed by the WOAQ. When administering the WOAQ in a paramedic organisation, more attention should be paid to teamwork and fostering a spirit of teamwork in the organisation. If WOAQ results need to be improved in a home-based nursing setting, the focus should be on workload issues and managing work-life balance.

Therefore, it is very important to consider the possible covariates of reliability to get more precise and meaningful outcomes in assessments. Study 2 presented an example relating to the application of WOAQ, but this procedure has many other potential uses in educational, clinical and/or health settings that need further investigation.

11.2 Discussion: Application in Demonstrating CMB

The other application of Bentler’s covariate-dependent and covariate-free reliability procedure, covered in Chapter 10, offers new and comprehensive techniques for controlling CMB and CMV, which are common problems in psychological studies. The issue of CMV becomes more noticeable when researchers rely on a single source of data collection, and CMB becomes more noticeable when self-report measures are used. In Chapter 10 the covariate-dependent reliability procedure was used to demonstrate the effect of CMB and CMV on scale reliability. The effect of social desirability was evaluated as a covariate in the student study of emotional intelligence, wellbeing and alcohol drinking behaviour. The results showed that around 27% of the scale reliability was influenced by CMB as measured by social desirability. A SEM approach introduced by Williams, Hartman, and Cavazotte (2003) and Podsakoff et al., (2003), and further developed by other researchers (e.g., Richardson et al., 2009; Williams et al., 2010), was then used to assess the effect of CMV and CMB on each of the study constructs: emotional intelligence, wellbeing and alcohol drinking behaviour.
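The logic of quantifying a covariate-dependent share of reliability can be illustrated with a rough numerical sketch. The code below is not Bentler's EQS estimator; it simply simulates a hypothetical five-item scale in which every item loads on a substantive factor and on a social-desirability covariate, then compares coefficient alpha before and after partialling the covariate out of the items. All loadings, sample sizes and the resulting percentage are invented for illustration only.

```python
import random
from statistics import variance

random.seed(1)
n, k = 4000, 5

# Hypothetical data: each of k items loads on a substantive factor (0.6)
# and on a social-desirability covariate (0.35) -- the method-bias source.
factor = [random.gauss(0, 1) for _ in range(n)]
sd = [random.gauss(0, 1) for _ in range(n)]
items = [[0.6 * factor[i] + 0.35 * sd[i] + random.gauss(0, 0.7)
          for _ in range(k)] for i in range(n)]

def alpha(rows):
    """Cronbach's alpha for a list of k-item response rows."""
    cols = list(zip(*rows))
    item_var = sum(variance(c) for c in cols)
    total_var = variance([sum(r) for r in rows])
    return len(cols) / (len(cols) - 1) * (1 - item_var / total_var)

a_raw = alpha(items)

# Partial the covariate out of each item (simple OLS slope per item), then
# recompute alpha: the drop is a rough analogue of the covariate-dependent
# share of scale reliability, not Bentler's covariate-dependent coefficient.
sd_ss = sum(c * c for c in sd)
cols = list(zip(*items))
betas = [sum(c * x for c, x in zip(sd, col)) / sd_ss for col in cols]
resid = [[row[j] - betas[j] * sd[i] for j in range(k)]
         for i, row in enumerate(items)]
a_free = alpha(resid)

share = (a_raw - a_free) / a_raw
print(f"alpha raw={a_raw:.3f}, covariate-free={a_free:.3f}, share={share:.1%}")
```

With these invented loadings the covariate accounts for only a few per cent of the reliability; the 27% figure reported above came from the actual study data and the EQS procedure.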

The results produced two main findings:

a) CMB (due to social desirability) appears to inflate the reliability measures of the scales.

b) The measures of emotional intelligence and alcohol drinking habits were more influenced by CMB and CMV than wellbeing.

Consistent with previous findings (e.g., Podsakoff et al., 2003; Richardson et al., 2009; Williams et al., 2010), it seems that the CFA approach provides a practical method for controlling for method variances and biases. The findings also demonstrate that forcing equal CMV effects for all measures is not appropriate because it adversely affects the model fit: CMV effects differ depending on the nature of each measure, so CMV weights should be allowed to vary. This result is similar to the finding of Williams et al., (2010) but contradicts the equivalent method effects technique proposed by Lindell and Whitney (2001).


11.3 Strengths

The research conducted in Study 2 and reported in Chapters 9 to 11 was underpinned by several strengths. First of all, to the best of the researcher’s knowledge, this study was the first of its kind to demonstrate covariate-dependent reliability empirically. While methods of reliability generalization (Vacha-Haase, 1998; Vacha-Haase & Thompson, 2011) have been proposed to study variation in reliability, reliability generalization is a meta-analysis methodology requiring data from a large number of studies. In contrast, Bentler’s (2014) new procedure for covariate-dependent and covariate-free reliability coefficients requires only a single study and can provide more accurate coefficients of reliability with estimates of how these are affected by group characteristics. This approach can be adopted for the conventional internal consistency measure (coefficient α), as well as model-based reliability for multi-dimensional studies. In Study 2 this method was initially adopted in the context of a bifactor model. Using Omega hierarchical and Omega subscale coefficients, the reliability of a general factor and its subfactors was assessed. Then, using a covariate-dependent reliability assessment, the between-group variation in reliability was assessed. This is a novel approach and has application potential whenever the reliability of a given scale might be affected by grouping variations.
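The Omega hierarchical and Omega subscale calculations for a bifactor model follow directly from the standardized loadings. The sketch below uses entirely hypothetical loadings for a six-item scale with a general factor and two group factors; it computes omega total, omega hierarchical (general-factor variance over total variance) and an omega hierarchical subscale coefficient for the first group factor. The numbers are illustrative only and do not come from Study 2.

```python
# Hypothetical standardized bifactor loadings for a 6-item scale with two
# group factors (items 1-3 and 4-6); all values are invented for illustration.
g  = [0.60, 0.55, 0.65, 0.50, 0.70, 0.58]   # general-factor loadings
s1 = [0.40, 0.35, 0.45, 0.00, 0.00, 0.00]   # group factor 1 loadings
s2 = [0.00, 0.00, 0.00, 0.30, 0.45, 0.38]   # group factor 2 loadings
err = [1 - (gi**2 + ai**2 + bi**2) for gi, ai, bi in zip(g, s1, s2)]

# Total-scale variance implied by the model (orthogonal bifactor structure).
total_var = sum(g)**2 + sum(s1)**2 + sum(s2)**2 + sum(err)

# Omega total: all common variance; Omega hierarchical: general factor only.
omega_total = (sum(g)**2 + sum(s1)**2 + sum(s2)**2) / total_var
omega_h = sum(g)**2 / total_var

# Omega hierarchical subscale for group factor 1 (items 1-3): group-factor
# variance relative to the variance of that subscale's item composite.
sub = slice(0, 3)
sub_var = sum(g[sub])**2 + sum(s1[sub])**2 + sum(err[sub])
omega_hs1 = sum(s1[sub])**2 / sub_var

print(round(omega_total, 3), round(omega_h, 3), round(omega_hs1, 3))
```

A low omega hierarchical subscale relative to omega hierarchical, as here, indicates that most reliable variance in the subscale is carried by the general factor rather than the group factor.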

Secondly, this study has shown that the type of organisation influences the reliability of assessments such as the WOAQ that measure the psychosocial and physical aspects of an organisation. This finding introduces a novel area of research that needs further exploration.


Thirdly, this study is one of the first to demonstrate the application of covariate-dependent and covariate-free reliability in assessing CMB effects on reliability. The procedure appears to provide a very comprehensive and simple quantification of the method effects in self-reported studies of this kind. More studies of this type are needed to provide a comprehensive understanding of common method effects on the reliability of measures. The proposed covariate-dependent and covariate-free reliability procedure can be easily calculated using EQS.

Fourthly, this is one of the first studies that integrates a measured and unmeasured latent variable procedure (Podsakoff et al., 2003), controlling for CMV and CMB. The procedure appears to provide a very comprehensive way of controlling method effects in self-reported studies. However, there is a need for further studies in order to shed more light on this procedure.

11.4 Limitations and Directions for Future Research

Despite the above strengths, this study is not without weaknesses and limitations.

One of the limitations in this study is the lack of background literature on covariate-dependent reliability. This makes it difficult to evaluate and compare the results with other studies. Further studies are required to expand this new, practical area of research.

In this study, only two occupations in the health field were compared in order to assess the covariate-dependent reliability of the WOAQ. Further studies are needed to expand the concept to white, blue, and pink collar workers, as well as other health professional occupations.


Covariate-dependent reliability may have practical implications for cultural comparisons using the WOAQ or other similar scales. Therefore, future studies could consider the role of culture as a covariate in the assessment of reliability of such scales.

The marker variable choice for CMB (in this study, social desirability) is controversial. Some scholars believe that the marker variable should not have any relationship with other substantive constructs, while others believe the opposite (e.g., Lindell and Whitney, 2001; Richardson et al., 2009; Williams et al., 2010). In their review of previous studies, Williams et al., (2010) demonstrated that researchers use a broad range of variables as marker variables for CMB. They concluded that “... no consideration has been given to the role of theory associated with method processes to guide the selection of marker variables and the understanding of their effects” (p. 505). Further studies are needed to determine the best criteria for choosing an appropriate marker variable when controlling for CMB.


12 STUDY 3: MODEL-BASED RELIABILITY AND VALIDITY OF REFLECTIVE-FORMATIVE MODEL OF WAS USING PLS-SEM

All studies discussed in the previous chapters had constructs measured using reflective models. The literature rarely reports formative-formative or reflective-formative measurement models. The purpose of Study 3 is to demonstrate the validity and model-based reliability of a reflective-formative model of the Work Ability Scale (WAS). The results are compared with the misspecified reflective forms of the WAS model to highlight the possible errors that occur because of misspecification of measurement models. The results of Covariance-Based Structural Equation Modelling (CB-SEM) and Partial Least Squares SEM procedures are presented in Chapter 13. The Discussion in Chapter 14 presents the practical implications of these findings. This chapter introduces this study.

12.1 Rationale and Objectives

In Australia, the Work Ability Survey (WAS) and the Work Ability Survey Revised (WAS-R) were developed by the Business, Work and Ageing Research Centre at Swinburne University of Technology, Melbourne (Taylor & McLoughlin, Dec 2011). As described in Chapter 5, a decision-making tree has been developed for distinguishing formative from reflective models (Figure 5.5, Chapter 5). As explained below, this decision-making tree suggests that the WAS should be specified as reflective in the first order and formative at higher orders (i.e. a reflective-formative model).


12.1.1 Empirical Example: The Study of Work Ability

This section presents background on an empirical example (Study 3) comparing the results of a reflective-reflective higher order measurement model for the Work Ability Scale, fitted using CB-SEM, and corresponding reflective-reflective, reflective-formative and formative-formative work ability models fitted using the PLS-SEM procedure. Before discussing the methodology used, there is a need to fully explain the theoretical background on which this model is based. As demonstrated in the decision-making framework in Chapter 5 (Figure 5.5), the first step is to evaluate the background theory of the measure to find out if the work ability model has previously been considered as reflective or formative. Therefore, a review of the empirical and theoretical background of work ability will be provided next.

12.1.2 History of Work Ability Research

More than thirty years of international research on work ability and age management have provided evidence that working life can be improved and extended. Work ability is predominantly about work-life balance. In the early 1980s, research on work ability started in Finland to determine the length of people’s working life and how this is affected by work contentment and job demands (Ilmarinen, 2009). Through the years, the conceptualisation of work ability has progressed to become more holistic. The history of work ability research can be divided into three phases: (1) Evolution (1980 – 1989), (2) Conceptualisation and Implementation (1990 – 1999), and (3) Internationalisation (2000 – present). A brief description of each phase will be presented next.


Evolution (1980 – 1989). Work ability in the 1980s was defined as “How good is the worker at present and in the near future, and how able is he or she to do his or her work with respect to work demands, health and mental resources?” (Ilmarinen, 2003, p. 3).

This was a period of longitudinal research, driven primarily by the question as to what would happen employment-wise to the post-war baby boomers in the 1990s as they started approaching retirement. The research also examined the extent of people’s working life span and retirement (Ilmarinen, 2010).

Based on this positive approach, a multidisciplinary team of scientists constructed and validated the Work Ability Index (WAI) and, using a stress/strain concept, they applied and evaluated the work ability index on a large number of participants between 1981 and 2009 (Ilmarinen, 1991).

Conceptualisation and Implementation (1990 – 1999). The main characteristic of the research during this period was the large number of longitudinal studies of men and women who worked in the same occupation throughout the entire study period. The aim of these longitudinal studies was to find a way to prevent disease and disability among workers who were approaching retirement. Concurrently, the researchers were seeking a way to maintain workers’ health and work ability (Tuomi, Ilmarinen, Klockars, Nygård, Seitsamo, & Huuhtanen, 1997). Emphasis was on changes in work, lifestyle, health, stress symptoms and work ability, as well as on the causes of any change. Changes were analysed based on age, gender, work contentment and work profile. The occupations of the participants were divided into physical work, mental work, and both physically and mentally demanding work (Tuomi, Ilmarinen, Seitsamo, Huuhtanen, Martikainen & Klockars, 1997).

The results highlighted that the different interactions between biological ageing, health, lifestyle and work strongly affect work ability. But it appeared that, in general, work ability decreases with age (Ilmarinen, Tuomi, & Klockars, 1997).

Even though a decline was observed in the work ability of the participants with age, the initial age did not explain observed differences in the magnitude of these changes in the participants’ work ability. The authors suggested that, in order to improve work ability, there is a need for better supervisor attitudes, increasing variety at work, leisure and physical activity (Tuomi, Ilmarinen, Martikainen, Aalto, & Klockars, 1997). It seemed that while the work ability of senior employees usually declined with age, the work ability of employees could be improved regardless of their age.

It was also found that the mean WAI improved among 10 per cent of the participants and declined dramatically among 30 per cent. For 60 per cent of participants, the index was steady at a good or excellent level (Tuomi, 1997). Based on a logistic regression analysis, it was found that factors relating to lifestyle, management, and ergonomics explained both positive and negative changes in work ability (Ilmarinen et al., 1997).

The outcomes of the research had a profound impact in Finland. The Finnish social partners made an agreement to promote and maintain work ability in workplaces. A work ability measure was created and validated, and health professionals including physicians and nurses were trained in the application of the WAI (Ilmarinen, 2009).


The study showed that the behaviours of managers and supervisors are among the most critical factors influencing work ability. Also, improved work ability of ageing employees and workers was directly related to age awareness (Ilmarinen, 2010). Based on the study results, a focus on age management became popular in the early 1990s, and training in age management started shortly afterwards. This developed into an international course on age management which is still running (Ilmarinen, 2010).

Internationalisation (2000 – present). The original WAI was translated into many languages in the early 1990s. The international validation of the index showed good results. The psychometric properties of the scale and its predictive ability and cultural appropriateness have been acknowledged to be consistent across Europe (Gould, Ilmarinen, Järvisalo, & Koskinen, 2008).

The global use of the original WAI provides excellent possibilities for international networks and databases related to the index. This allows new possibilities for research, which will strengthen WAI networks worldwide.

However, the work ability concept has changed over time. Current multidimensional work ability theory focuses on the promotion of longer and healthier careers with employment growth and improved wellbeing of the population until retirement and beyond. Today, work ability is related to nearly all factors of work and life including work-related, individual and social factors (Gould et al., 2008). These connections to most aspects of daily living make the definition of work ability challenging and its promotion demanding.


Since the 1980s, a large amount of research which focused on work ability and its related factors has helped in the understanding of work ability and its complex relationship with these factors. The growing importance of work ability research and applications is also due to changes in the organisation of work and wider societal and population trends across the world. In order to preserve work ability, it is essential to strive for a healthy work-life balance (Gould et al., 2008).

There are several other indicators of work ability available; however, the original Work Ability Index is by far the most widely used measure. In a three-level assessment of work ability, participants evaluate their current work ability regardless of whether they work. They may be completely fit for work, partially disabled for work, or completely disabled for work. The score is usually referred to as the ‘work ability estimate’ and ranges from 0, indicating full work disability, to 10, indicating the best work ability (Gould et al., 2008). In the next section the current WAI, which incorporates the original work ability estimate, is explained briefly.

The current Work Ability Index (WAI). The Finnish Institute of Occupational Health originally developed the current index as a tool to predict retirement age and to record the work ability of employees. It was designed to identify the health risks of employees at an early stage and to highlight the risks of early retirement so as to avoid these risks (Morschhäuser & Sochert, 2006). The WAI was validated in clinical studies over many years. It has since been used in occupational health and safety research and practice to investigate the association between human resources and other work-life factors, as well as to compare work ability in different age groups (Ilmarinen, Tuomi, & Seitsamo, June 2005).

The index involves a self-assessment questionnaire and has a strong focus on health status, resources and the subjective estimation of work ability (Gould et al., 2008). It is based on questions that incorporate both the physical and mental demands of an employee’s work (Tuomi, Ilmarinen, Jahkola, Katajarinne, & Tulkki, 2006). In the original study, after completing the questionnaire, each employee was interviewed by an occupational health professional. Based on the assessment, an evaluation was made as to whether there could be any restriction or improvement on the employee’s current work ability in the future (Tuomi et al., 2006).

The WAI has seven items (see Table 12.1), with a total score ranging from 7 to 49. There are four categories derived from the WAI score, reflecting poor work ability (7 – 27 points), moderate work ability (28 – 36 points), good work ability (37 – 43 points) and excellent work ability (44 – 49 points) (Martus, Jakob, Rose, Seibt, & Freude, 2010). The score refers not only to the employee’s current status of work ability but also provides some information on health-related risk factors. The results give an indication as to whether the appropriate strategy is to maintain the current work ability, improve and support it, or re-establish it. According to Ilmarinen, the WAI is capable of reliably predicting work disability, retirement and mortality (Ilmarinen, 2007).


Table 12.1 Items of the Work Ability Index

Item                                                       Range
1. Current work ability compared with the lifetime best    0 – 10
2. Work ability in relation to the demands of the job      2 – 10
3. Number of current diseases diagnosed by a physician     1 – 7
4. Estimated work impairment due to diseases               1 – 6
5. Sick leave during the past year (12 months)             1 – 5
6. Own prognosis of work ability two years from now        1 – 7
7. Mental resources                                        1 – 4

Note: Reprinted from Ilmarinen, J. (2007). The Work Ability Index (WAI). Occupational Medicine, 57, p. 160.
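The four score categories can be sketched as a small scoring helper. The function name and the sample score below are invented for illustration; the cut-offs are those reported by Martus et al. (2010) in the text above.

```python
def wai_category(score: int) -> str:
    """Map a total WAI score (7-49) to its work ability category,
    using the cut-offs reported by Martus et al. (2010)."""
    if not 7 <= score <= 49:
        raise ValueError("WAI total must lie between 7 and 49")
    if score <= 27:
        return "poor"
    if score <= 36:
        return "moderate"
    if score <= 43:
        return "good"
    return "excellent"

print(wai_category(30))  # -> moderate
```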

The index is analysed based on two factors. The first factor reflects a subjective assessment of current and future work ability. The second factor reflects objective data regarding health status and sick leave (Radkiewicz, Widerszal-Bazyl, & the NEXT-Study group, 2005). Items one, two, six and seven measure the subjective component of the index. The third, fourth and fifth items measure the objective component of the index, based on the occurrence or absence of different illnesses listed in the questionnaire.
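The item-to-component split described above can be sketched as follows. The response values here are invented for illustration; the item numbering follows Table 12.1.

```python
# Hypothetical responses to the seven WAI items, keyed by item number.
responses = {1: 8, 2: 9, 3: 2, 4: 5, 5: 4, 6: 6, 7: 3}

SUBJECTIVE_ITEMS = (1, 2, 6, 7)   # self-rated current/future work ability
OBJECTIVE_ITEMS = (3, 4, 5)       # diagnosed diseases, impairment, sick leave

subjective = sum(responses[i] for i in SUBJECTIVE_ITEMS)
objective = sum(responses[i] for i in OBJECTIVE_ITEMS)
total = subjective + objective    # overall WAI total (possible range 7-49)

print(subjective, objective, total)  # -> 26 11 37
```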

The WAI is easy to use. It takes about ten to fifteen minutes to administer the questionnaire and a further three to five minutes for evaluation (Ilmarinen & Tuomi, 2004). It is highly recommended that participation is voluntary because the WAI surveys provide confidential data on an employee’s illnesses and work ability. This means that data protection must be strictly observed.


There are several other instruments designed to assess work ability or health-related risk factors. While most instruments focus on labour and human resources policies, the advantage of the WAI is that it concentrates directly on the employee's self-assessed work ability (Morschhäuser & Sochert, 2006).

The reliability of the index was further analysed in the Netherlands using a test-retest evaluation within a four-week interval. The results indicated that 25 per cent of participants achieved the same WAI score on both measurements. The average test and retest results were also similar, indicating scale reliability (de Zwart, Frings-Dresen & van Duivenbooden, 2002).

Validity and reliability for this index have been assessed using correlation analyses. Other psychometric properties of the WAI have also been tested, including internal and predictive validity, and the results have been published in peer-reviewed literature and reports (European Network for Workplace Health Promotion (ENWHP) & National Work Ability Index Network, 2012).

For example, de Zwart, Frings-Dresen, & van Duivenbooden (2002) explored work ability among well-educated professionals, while another study (Pensola, Järvikoski, & Järvisalo, 2008) looked at unemployment and work ability. Long unemployment, poverty, and a lack of education are well known risks for marginalisation, and the results showed clearly that unemployed individuals had more limited work ability than those who were employed. Work ability scores were directly related to the extent of unemployment, such as its length and frequency. It was noted that part of the relationship between unemployment and limited work ability was linked to economic difficulties, especially among the long-term unemployed. In a 1991 study, subjective assessments reported via the WAI questionnaire were compared with clinical examinations including cardiovascular, musculoskeletal, and psychological measurements. The clinical examinations for both male and female workers were selected according to health and subjective work ability as reported by the questionnaire. The researchers found that the results suggested a relationship between the level of work ability and other clinically assessed factors. There were some individual differences observed, but they were explained on the basis of the available data (Eskelinen, Kohvakka, Merisalo, Hurri, & Wagar, 1991).

The WAI can be used for individuals and groups, or even an entire company (Morschhäuser & Sochert, 2006). A selected review panel from European countries stated that the index is a useful, valid, and reliable tool that addresses a very relevant issue in the workplace (European Network for Workplace Health Promotion (ENWHP) & National WAI Network, 2012). They also viewed the index as a powerful predictive tool for premature retirement. Organisations can implement strategies to moderate the risk of early retirement based on the item responses for this index. However, the panel highlighted a few challenges in terms of practical applicability and this has led to the development of new multidimensional work ability models.

Multidimensional work ability model. With a large amount of research undertaken internationally, there have been substantial changes to the work ability concept and the conceptual models used to describe work ability. At the beginning of this development, the aim was to predict retirement age and to try to find out how long people are able to continue working after retirement, and what role work satisfaction and job demands play in determining these factors. Health status was viewed as the most important component of an individual’s functional capacity. With the development of the concepts of work ability in a more holistic direction, consensus grew that work ability could not be analysed individually and that there was a need for a conceptual shift to more of a life-work balance model of work ability (Gould et al., 2008).

In early 2000, the Finnish Institute of Occupational Health in Helsinki introduced a more advanced model of work ability. It is based on studies and development projects conducted in the 1990s on occupational wellbeing in different industrial sectors and among different age groups. The multidimensional image of work ability includes both individual resources as well as work-related and personal factors (Finnish Institute of Occupational Health, 2011). The dimensions of work ability are presented in the form of a ‘work ability house’.

The factors influencing work ability represent four floors in a house (Figure 12.1). The first floor includes human resources such as health: physical, mental, and social functioning. If the first floor is strong, the chances are that a person will have stronger work ability throughout his or her working life.

The second floor of the house contains knowledge and skills and their constant updating, including education and relevant training. The third floor refers to the inner values and attitudes and also to circumstances that motivate people at work. Work environment is located on the fourth floor, right above attitudes, because it directly affects attitudes. When a person is exposed to good experiences, his or her positive values and attitudes towards work are strengthened. On the other hand, bad experiences weaken both attitudes and values (Finnish Institute of Occupational Health, 2011; Ilmarinen, 2010). As clearly presented in the work ability house, work ability is formed by the work environment as well as personal health and abilities.

Figure 12.1. Multidimensional work ability model. Reprinted from Finnish Institute of Occupational Health. (2011). Multidimensional work ability model. Helsinki, Finland, p. 1.

Outside the work ability house are additional influences on work ability. Community organisations that support work, occupational health care and safety, as well as the immediate social environment (family, friends, relatives, etc.) are also important. Finally, the operational environment of work is added, including society, culture, social and health policies and legislation. Government policies contribute to creating significant prerequisites for work ability, but they also create challenges for work ability, such as demanding a higher employment rate. Evidence shows that the core structure of work ability is very dynamic and can change greatly during a person’s career. Any conflict between family life and work life will have an impact on work ability. Also, support or lack of support from the community will affect one’s work ability. Likewise, the introduction of new technologies, the impact of globalisation, or changes in retirement/health/welfare systems and legislation status will make a difference to work ability (Gould et al., 2008).

The multidimensional work ability model is very versatile and can be applied to planning research and developmental projects, as well as training and education programs (Ilmarinen, 2010).

Work ability in Australia. The work ability index has been used in Australia for more than ten years. The major reason researchers are interested in its application is the ageing population, and the need to enhance health and labour systems (Taylor, 2010). Such considerations have caused policymakers to rethink the length of working lives. As Australia faces a skills shortage and an ageing workforce, the focus is on finding answers to the following question: “How can we tap into the available talent in the workforce and remove the barriers to a life in work?” (Australian Government, Compare, 2011).

Researchers in Australia have used the WAI for different purposes such as predicting employees’ retirement intentions (Oakman & Wells, 2009), predicting work ability of employees (Palermo, Webber, Smith, & Khor, 2009), and examining the relationship between age and work ability (Webber, Smith, & Scott, 2006).

However, while individual factors remain significant predictors of work ability, Palermo (2010) has found that other organisational factors such as occupational stress, job satisfaction, leadership effectiveness and the nurturing of workers are significant positive predictors of work ability. The outcomes of this study strongly support the Finnish findings that managers and supervisors play key roles in influencing work ability (Ilmarinen, 2010). Organisations that advocate and endorse caring values for others are more likely to return a better work ability score.

Figure 12.2. WAI scores: Australia and Finland. Reprinted from Taylor, P., & McLoughlin, C. C. (Dec 2011). Pilot Study on Workability. Monash University. Unpublished presentation. Melbourne, p. 10.

Australian studies have compared the predictive power of retirement intentions, investigating the connection between age, injury proneness and work ability, and assessing the influence of organisational values on work ability. Three predictors of the WAI accounted for 42 per cent of its variation. These variables were “management respects you”, “working beyond physical capacity” and “unevenly distributed work” (Brooke, Goodall, & Mawren, 2010). A surprising finding in these studies was an extremely high mean work ability score (Figure 12.2) compared to the Finnish population (Taylor & McLoughlin, Dec 2011; Palermo et al., 2009). These scores had a more negatively skewed distribution compared to the Finnish distribution, even though the population studied varied from private to public organisations, across locations and across different industries.

In Australia, a Work Ability Survey (WAS) and the Work Ability Survey Revised (WAS-R) were instruments developed in four companies by the Business, Work and Ageing Research Centre, Melbourne (Taylor & McLoughlin, Dec 2011). These authors have kindly provided their data for use in this study. WAS is an organisational survey that is aligned with the four levels of the multidimensional work ability model, as well as the WAI described above. It consists of physical and psychosocial work demand measures. The original model (McLoughlin, 2009; Taylor & McLoughlin, Dec 2011) was specified as a higher order reflective-reflective model as illustrated in Figure 12.4.

The personal and organisational capacities are two independent constructs that jointly form the WAS. However, the six factors contributing to the organisational capacity construct and the five factors contributing to the personal capacity construct are expected to be correlated, suggesting that a reflective-formative specification of this model would have been preferable to the originally specified reflective-reflective format. Considering the theoretical background and other criteria demonstrated in Figure 5.5, it is confirmed that the WAS should be modeled as a reflective-formative model.

The major aim of this study was to demonstrate the validity and model-based reliability of a correctly specified reflective-formative model of WAS. It was hypothesised that:

Hypothesis 12.1: A reflective-formative higher order model of WAS has acceptable validity and model-based reliability.

The review of the misspecification literature in Chapter 5 showed that misspecification of measurement models is common. As mentioned in that chapter, misspecification can lead to Type I and II errors. A Type I error is a false positive error that occurs when a path is declared significant when it is not (incorrect rejection of a true null hypothesis). A Type II error is a false negative error that occurs when declaring a path to be nonsignificant when it is significant (failure to reject a false null hypothesis). In SEM, Type I errors may result from the erroneous application of a reflective model instead of a formative model, while a Type II error can occur with the erroneous application of formative models in place of reflective models (Jarvis et al., 2003; MacKenzie, Podsakoff, & Jarvis, 2005).

A study by Petter et al., (2007) considered a series of simulations for structural models that contained no significant paths. They found that when the formative construct was misspecified as reflective, upward bias in the parameter estimates often produced a Type I error. Roy, Tarafdar, Ragu-Nathan & Marsillac (2012) presented similar results. They reported that misspecifying a reflective model as formative leads to a deflation of path coefficients and R square values (Type II error). Conversely, Petter et al., (2007) found that misspecifying a formative model as reflective results in the inflation of path coefficients and R square values (Type I error). Petter et al., (2007, p. 631) stated that “The danger of Type I error is that we, as researchers, may build new theories and models based on prior research that finds support for a given relationship that does not actually exist. This may affect the implications of our research for both academia and practice. The danger of Type II error is that some interesting, valuable research may not be published if many of the relationships within the model are found to be nonsignificant”. In a misspecified model such as the reflective-reflective WAS (Figure 12.4), the variance of the constructs will increase due to shared error. As a result, the path coefficients to the higher order constructs will be increased, creating an upward bias in the result. The opposite will happen in a misspecified formative-formative model of WAS, resulting in a downward bias. To the best of this researcher’s knowledge, no study has investigated the consequences of misspecifying a mixed model (reflective-formative and formative-reflective).

The secondary aim of this study was to increase awareness of the misspecification problem by demonstrating the possible consequences of model misspecification in an empirical study. The correctly specified reflective-formative model for WAS was therefore compared with the misspecified reflective-reflective and formative-formative models, fitted using Partial Least Squares SEM, in order to quantify any Type I or Type II errors.

Partial Least Squares SEM was used to evaluate the WAS models for several reasons. First, evaluating the WAS reflective-formative model using the conventional Covariance-Based SEM procedure was very problematic due to identification problems. As asserted by Bollen and Lennox (1991), a reflective measure can be easily identified and evaluated using Covariance-Based SEM, while a formative measure cannot be easily identified, except by placing the measure in a larger path structure with other variables so that it can be evaluated (e.g. using a MIMIC model). The Partial Least Squares SEM procedure is a better alternative for evaluating models with formative measures; these can simply be evaluated in isolation using this procedure. There is also less restriction in terms of normality or sample size compared to Covariance-Based SEM (Roy et al., 2012).

The PLS-SEM results for the correctly specified reflective-formative model were compared with those for the misspecified reflective-reflective WAS model evaluated using CB-SEM, to evaluate the consequences of the misspecification, allowing the testing of the following hypothesis:

Hypothesis 12.2: The results for a misspecified reflective-reflective model (fitted using Covariance-Based SEM) will demonstrate inflated loadings compared to a correctly specified reflective-formative WAS model (fitted using Partial Least Squares SEM).

The methodological procedure for fitting a reflective-formative model in Partial Least Squares SEM is demonstrated in detail below.


Figure 12.3. The correctly specified reflective-formative model of WAS. [Path diagram: items Q1-Q34 load reflectively on eleven first-order constructs (CONTROL, TRUST, RESPECT, SUPPORT, HARASSMENT, TRAINING; MENTAL HEALTH, PHYSICAL HEALTH, WORK-HOME, HOME-WORK, LEISURE), which form the second-order ORGANISATIONAL and PERSONAL capacity constructs, which in turn form WAS.]

Figure 12.4. The misspecified reflective-reflective model of WAS. [Path diagram: the same eleven first-order constructs and items Q1-Q34 as in Figure 12.3, but with all first- and higher-order relationships specified as reflective.]

Figure 12.5. The misspecified formative-formative model of WAS. [Path diagram: the same eleven first-order constructs and items Q1-Q34 as in Figure 12.3, but with both the first-order and higher-order relationships specified as formative.]

12.1.3 Composite reliability using PLS

All the model-based reliability assessments mentioned in Studies 1 and 2 require the use of a reflective model and Covariance-Based SEM (CB-SEM), which uses maximum likelihood (ML) estimation. Partial Least Squares SEM (PLS-SEM) instead uses an alternative model estimation approach called partial least squares estimation.

In the absence of normality or when the sample size is small, PLS-SEM seems to be an appropriate alternative to CB-SEM for computing model-based reliability coefficients.

PLS-SEM is considered to be a correct and feasible method for estimating formative or reflective-formative models. Models involving formative constructs usually present identification problems and are difficult to evaluate using CB-SEM, while PLS-SEM is commonly regarded as a good tool for evaluating such models. An additional advantage of PLS-SEM is acknowledged when developing measures with new theoretical or empirical backgrounds (Rigdon, 2012); PLS-SEM seems to provide a more appropriate procedure for reliability assessments in this case. Pro-PLS scholars believe that research data can be used to help build an empirical background and unobservable conceptual variables (Rigdon, 2012). On the other hand, CB-SEM followers believe that one should specify a conceptual structure and then seek evidence regarding whether these structures are consistent with empirical evidence, so that results can challenge, support, or modify those conceptualisations (see the previous chapter for more details on PLS-SEM vs. CB-SEM).

Despite the less restrictive nature of PLS-based SEM, it is still not as popular as Covariance-Based SEM in model-based reliability assessments. The main reason for this was previously a lack of software for model estimation, but this problem is now being addressed. Since 1984, and especially from the early 2000s, more user-friendly software has been introduced for the estimation of PLS-based SEM, adding to the popularity of the method.

Built on classical test theory and using PLS-SEM, composite reliability can be estimated for constructs (Werts, Linn & Jöreskog, 1974). Composite reliability (CR) is the reliability of a composite formed from multiple items measuring the same construct. In other words, CR is the total true score variance as a proportion of the total scale variance.

CR will be equal to coefficient alpha when essential tau-equivalence is met for all items; otherwise CR is usually higher than coefficient alpha.

The reliability of reflective measures using PLS can be tested using composite reliability (ρc) (Werts, Linn & Jöreskog, 1974). Composite reliability takes into account the different outer loadings of the indicator variables in a model and therefore seems to better reflect the model-based reliability compared to internal consistency coefficients such as coefficient alpha (Hair et al., 2014). Values of 0.60-0.70 or higher are acceptable for CR in the early stages of scale development, and values of 0.80 and higher are satisfactory for more developed (established) measures (Nunnally & Bernstein, 1994).
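As a concrete illustration, composite reliability can be computed directly from standardised loadings. The sketch below is an illustrative helper, not SmartPLS code; it assumes standardised indicators, so each error variance is 1 − λ², and it uses the loadings reported for the TRUST construct in Table 13.1.

```python
import numpy as np

def composite_reliability(loadings):
    """rho_c = (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances).
    Assumes standardised loadings, so each error variance is 1 - loading**2."""
    lam = np.asarray(loadings, dtype=float)
    true_var = lam.sum() ** 2               # squared sum of the loadings
    error_var = (1.0 - lam ** 2).sum()      # sum of the indicator error variances
    return true_var / (true_var + error_var)

# Loadings reported for the TRUST construct in Table 13.1
print(round(composite_reliability([0.85, 0.89, 0.90]), 2))  # 0.91
```

The result differs slightly from the 0.92 reported in Table 13.1 only because the published loadings are rounded to two decimals.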

To test the reliability of constructs, some scholars (Chin, 1998; Hair et al., 2014; Fornell & Larcker, 1981) suggest reporting not only the composite reliability of the scale but also the reliability of each indicator (since the reliability of each indicator may differ), as well as the Average Variance Extracted (AVE), which measures how much indicator variance is explained by the common factors.

As before, the convergent validity of the indicators of a construct is defined as "the extent to which a measure correlates positively with alternative measures of the same construct" (Hair et al., 2014, p. 102), and the Average Variance Extracted (AVE) can be used to test for convergent validity (Fornell & Larcker, 1981), with a cut-off point of greater than 0.50 required. In addition, if the square root of the AVE exceeds the estimates of the intercorrelations of the construct with the other constructs, discriminant validity is supported (Chin, 1998; Fornell & Larcker, 1981). Reporting the composite reliability of a summated scale is needed as much as the average variance extracted (Fornell & Larcker, 1981).

LISREL does not output the CR directly, and some manual calculation is needed in order to obtain CR. However, SmartPLS reports not only the reliability at item level and the CR at scale level but also the AVEs, all in a single analysis. In addition, using SmartPLS, a confidence interval for the composite reliability can be estimated via a bootstrapping procedure. This allows the testing of the hypothesis that the reliability coefficient exceeds a specified value in the population.
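A percentile bootstrap for a reliability coefficient can be sketched as follows. This is a generic illustration on simulated data, not SmartPLS output; it uses coefficient alpha for simplicity, because alpha can be computed directly from raw item scores, but the same resampling logic applies to composite reliability once the loadings are re-estimated within each resample.

```python
import numpy as np

rng = np.random.default_rng(0)

def coefficient_alpha(items):
    """Cronbach's alpha for an (n_cases, k_items) score matrix."""
    k = items.shape[1]
    item_var = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_var / total_var)

def bootstrap_ci(data, stat_fn, n_boot=1000, alpha=0.05):
    """Percentile bootstrap CI: resample cases with replacement and take
    the empirical quantiles of the recomputed statistic."""
    n = data.shape[0]
    stats = np.array([stat_fn(data[rng.integers(0, n, size=n)])
                      for _ in range(n_boot)])
    return np.quantile(stats, [alpha / 2, 1 - alpha / 2])

# Simulated data: 300 cases, 3 items sharing one common factor
common = rng.normal(size=(300, 1))
items = common + rng.normal(scale=0.7, size=(300, 3))
lo, hi = bootstrap_ci(items, coefficient_alpha)
print(lo < coefficient_alpha(items) < hi)  # True
```

If the lower bound of the interval exceeds a target value (say 0.70), the hypothesis that the population reliability exceeds that value is supported.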


12.2 Method

12.2.1 Participants

The data for the present study was obtained from the Redesigning Work for an Ageing Society research program conducted by the Business, Work & Ageing Centre for Research (BWA) at Swinburne University of Technology in Melbourne. The data was collected from four case study organisations during 2007-2008, with an overall response rate of around 40% (a total sample of 1687 respondents). The final data used in the present study contained 1344 respondents, after the removal of 343 incomplete survey responses.

12.2.2 Measure

The Redesigning Work for an Ageing Society research program developed the Work Ability Survey (WAS) through the work of McLoughlin (2009) and Taylor and McLoughlin (2011). The WAS has two main sub-constructs, entitled personal and organisational capacities. The organisational capacities scale consists of six subconstructs: control, respect, trust, support, harassment, and training. The personal capacities scale has five subconstructs: leisure, work-home balance, home-work balance, mental health, and physical health. A version of the questionnaire is presented in Appendix E with the permission of the researchers involved in the original study.

12.2.3 Ethics

The original study obtained ethics clearance from the participating organisations and permission to reuse the database in similar studies.


12.2.4 Overview of analysis

Covariance-Based SEM analysis using AMOS software and Partial Least Squares SEM using SmartPLS (v2.0) were used to assess the reflective-reflective model of WAS. An overview of the similarities and differences between Covariance-Based SEM and Partial Least Squares SEM follows.

Debates regarding the superiority of Covariance-Based SEM over Partial Least Squares SEM have existed since the early years of development of these procedures (see Chapter 2 for details on the origins of each procedure). In particular, some scholars have questioned the practicality and generalisability of the PLS method for factor estimation.

In spite of the wide criticism of Partial Least Squares in the literature, PLS has specific strengths in specific situations that some Covariance-Based SEM scholars have misunderstood or ignored. A comparison of some of the main features of both approaches, along with some of the criticisms, follows.

Predictive validity. The literature shows that PLS has capability as a prediction tool, a fact that has not been fully appreciated. As such, PLS provides a correct method for evaluating formative constructs and for developing measures with new theoretical or empirical backgrounds (Rigdon, 2012). Scholars supporting PLS believe that using research data allows the building of empirically-based theory and constructs (Rigdon, 2012). On the other hand, Covariance-Based SEM followers believe that theory is needed to specify a conceptual structure, while research data is needed to test whether these structures are consistent with empirical evidence. The argument is that the results can challenge, support, or modify those conceptualisations.

Fit assessment test. Covariance-Based SEM assesses the overall fit of the model using the covariance among the items, assuming that all measures are reflective, with less interest in the individual effects of construct or path coefficients. In contrast, PLS does not rely on item covariance and overall goodness-of-fit; instead, the focus is on the variances of predicted variables or construct variances (Chin, 2010). In practice, in the presence of formative constructs, PLS might be a better choice than Covariance-Based SEM. Indeed, as explained below, Covariance-Based SEM cannot be used for third-order models, such as the WAS model considered here.

Theoretical background. Due to the holistic and confirmatory approach of Covariance-Based SEM, it is more appropriate when there is solid theoretical and background knowledge of the model. In contrast, a Partial Least Squares approach, with its exploratory nature and focus on the significance and strengths of individual paths and constructs, seems to be an appropriate procedure for new models. It is particularly useful in the social and behavioural sciences when background knowledge of the expected model is limited (Chin & Newsted, 1999; Chin, 2010; Roldán & Sánchez-Franco, 2012).

Normality assumption. Covariance-Based SEM commonly uses ML estimation assuming a normal distribution for the data, while for PLS there is no underlying assumption for the data distribution. This indicates that for non-normal data, the use of variance-based PLS is justified when sample sizes are too small to allow asymptotically distribution-free Covariance-Based SEM or bootstrap analyses.


Sample size. One of the requirements for using Covariance-Based SEM is a relatively large sample size, while PLS can be conducted with small sample sizes. However, in PLS the estimators are inconsistent and biased, in that standard errors do not decline with increasing sample size and expected parameter estimates do not converge to their true values. This lack of consistency means that increasing the sample size does not provide a more reliable analysis in the case of Partial Least Squares SEM. However, in Covariance-Based SEM models, if the underlying assumptions are met, consistency is ensured and larger sample sizes do provide a more reliable analysis.

Reflective and reflective-formative models. Partial Least Squares SEM and Covariance-Based SEM are two different approaches for estimating a SEM model, and both can be used to fit reflective models. PLS can also be used to fit reflective-formative and formative-formative models. However, Covariance-Based SEM can only be used to fit reflective-formative models when there is a reliable measure for the higher-order latent constructs (using MIMIC models). Each approach is suitable for a specific context. Researchers need to appreciate the characteristics of each method to be able to choose the most suitable approach (Hair et al., 2010; Hair, Ringle, & Sarstedt, 2011; Hair, Hult, Ringle, & Sarstedt, 2014). As acknowledged by Hair et al. (2011), neither method is superior to the other. They further state that "depending on the specific empirical context and objectives of a SEM study, PLS-SEM's distinctive methodological features make it a valuable and potentially better-suited alternative to the more popular Covariance-Based SEM approach" (p. 149).


The attempt to use the conventional Covariance-Based SEM procedure in this study, using the MIMIC model to evaluate a formative-formative WAS model, failed to identify the model. When Partial Least Squares SEM was used instead of conventional CB-SEM to evaluate the formative-formative WAS model, model identification was achieved. Partial Least Squares SEM was also needed in order to evaluate the correctly specified reflective-formative WAS model.

12.2.4.1 Building a higher-order reflective-formative model of WAS using PLS-SEM.

To the researcher's best knowledge, there are only a handful of studies (Becker, Klein, & Wetzels, 2012; Wetzels, Odekerken-Schröder, & van Oppen, 2009) that recommend guidelines for fitting a higher-order model in Partial Least Squares SEM. Wetzels et al. (2009) developed guidelines for building such a higher-order 'reflective' model: "PLS path modeling can also be used for higher-order models with formative constructs or a mix of formative and reflective constructs" (Wetzels et al., 2009, p. 189). In this study, a mixture of the approaches suggested by Wetzels et al. (2009) and Becker et al. (2012) was used to fit the proposed reflective-formative WAS model, with some amendments to the guidelines proposed by Wetzels et al. (2009) for reflective models. To clarify the approaches used in this study, a brief description of each approach, including its advantages and disadvantages, is provided below.

In the reflective-formative WAS model, the first-order constructs are reflective but the higher-order constructs are formative (Figure 12.3). In the formative-formative WAS model (Figure 12.5), the first-order and second-order constructs are both formative. The repeated indicators approach and the two-stage approach are recommended for testing a higher-order reflective-formative model in Partial Least Squares SEM (Becker et al., 2012; Hair, Hult, Ringle, & Sarstedt, 2014; Wold, 1982). In the repeated indicators approach, all the indicators of the first-order constructs are allocated to the second-order construct. This is called the repeated indicators approach (Wold, 1982) because the indicator variables appear twice in the model (i.e., for the first- and second-order constructs). The two-stage approach requires two steps in the model analysis. The first-order constructs are evaluated at the first stage, and the predicted values for the first-order constructs are then used in the second stage as indicators for the second-order constructs (Becker et al., 2012; Hair et al., 2014; Wetzels et al., 2009). According to the simulation study by Becker et al. (2012) and recommendations by other researchers in this area (Ringle, Sarstedt, & Straub, 2012; Hair et al., 2014), these approaches are only appropriate in specific circumstances.

The benefit of the repeated indicators procedure is the estimation of all constructs in a single analysis. However, there are some weaknesses with this approach. First, misspecifying the repeated loadings of higher-order constructs (reflective vs. formative) could lead to incorrect results. Becker et al. (2012) advise that for reflective higher-order models (reflective-reflective and formative-reflective models), the inner indicators of the higher-order constructs should be reflective, while for any type of higher-order formative model the repeated indicators of the higher-order constructs should be specified as formative. Another weakness of this procedure is that unequally important indicators of the first-order constructs could lead to biased results (Chin et al., 2003; Ringle et al., 2012), although simulation studies indicate this is a concern for reflective models only, not for formative models (Becker et al., 2012). A further weakness is the production of incorrectly correlated residuals due to the repeated use of the same indicators for the first- and second-order constructs (Becker et al., 2012). A final weakness of this procedure is that most of the variance is explained by the lower-order constructs; as a result, the path coefficients of the higher orders are usually zero or nonsignificant (Ringle et al., 2012).

The two-stage approach also has advantages and disadvantages. In this approach, the higher-order model is estimated separately from the first-order model, so there is no risk of misspecifying the repeated indicators of the higher-order constructs. For reflective models with unequally important indicators, this approach delivers a more reliable result than the repeated indicators approach (Becker et al., 2012). Most importantly, applying the two-stage approach and estimating the first-order constructs in a separate analysis from the higher-order constructs allows other variables to be included to explain some of the variance contributing to the higher-order formative constructs (Ringle et al., 2012). The disadvantage is that the first-order and higher-order constructs are not estimated simultaneously; therefore, the model estimates might not be as precise as those obtained with the repeated indicators approach.

For this study, the reflective-formative model was fitted using a mixture of the 'repeated indicators' and 'two-stage' approaches. Based on the above recommendations, the repeated indicators approach was used at the first stage, with the construct scores of the first-order constructs used at the second stage as the manifest indicators of the higher-order constructs. Applying the repeated indicators approach in two stages creates less bias, more reliable parameter estimates/scores, and a more precise estimation of the path coefficients of the constructs (Becker et al., 2012).
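The two-stage logic can be sketched in miniature. This is a simplified illustration on hypothetical data: the first-order construct scores are approximated here as standardised unit-weighted composites of their indicators, whereas PLS-SEM estimates the outer weights iteratively. The point is only the flow of scores from stage one into stage two.

```python
import numpy as np

rng = np.random.default_rng(1)

def zscore(x):
    return (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)

def construct_score(indicators):
    """Stage 1 (simplified): score a first-order reflective construct as the
    standardised mean of its standardised indicators. PLS-SEM would instead
    iterate the outer weights to convergence."""
    return zscore(zscore(indicators).mean(axis=1, keepdims=True)).ravel()

# Hypothetical indicator blocks for three first-order constructs (100 cases)
blocks = {name: rng.normal(size=(100, 3))
          for name in ("control", "trust", "respect")}

# Stage 2: the stage-1 scores become the (formative) indicators of the
# higher-order construct, replacing the repeated raw items
stage2 = np.column_stack([construct_score(b) for b in blocks.values()])
print(stage2.shape)  # (100, 3)
```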


Figures 12.6 to 12.8 present the PLS model building process used in this study.

Figure 12.6. Step one: constructing the first-order sub-constructs of both personal and organisational capacities of the reflective-formative model of WAS using PLS path modeling.

At step one of the PLS path modeling of WAS, the first-order sub-constructs of both personal capacity (mental and physical health, leisure, work-home and home-work factors) and organisational capacity (control, trust, training, respect, support, harassment) were constructed individually (Figure 12.6). Although a small number of the constructs (e.g., physical health) had a formative structure, for consistency purposes all were considered reflective.


Figure 12.7. Step two: building the second-order formative constructs (organisational and personal capacities) for the reflective-formative model of WAS. Note. Due to limited space, only some of the indicators of organisational capacity are shown.

At step two, the second-order formative constructs (organisational and personal capacities) were built by relating them to their first-order reflective sub-constructs and the first-order indicators. Both the personal and organisational constructs were estimated separately to obtain the scores for the first-order latent factors (Figure 12.7).


Figure 12.8. Step three: the scores of the first-order latent factors are used as the manifest indicators of the second-order factors (i.e., organisational and personal capacities), forming the higher-order construct (WAS).

At step three (Figure 12.8), the scores of the first-order latent factors were used as the manifest indicators of the second-order factors (i.e., organisational and personal capacities), and the higher-order construct (WAS) was built by relating it to the second-order constructs (organisational and personal capacities). The inner and outer loadings were built on the repeated predictors of the first-order observed scores obtained at step one.

The third-order model was assessed in the final step (Figure 13.1). The inner and outer models of first-, second-, and third-order loadings were estimated using SmartPLS.

The inner model in SEM refers to the relationships between the independent and dependent latent variables, while the outer model describes the relationships between the latent variables and their observed indicators. Because Partial Least Squares SEM does not make any normality assumptions about the data, parametric test results could not be used for inferential decision-making. Instead, to evaluate the significance of the coefficients, a nonparametric bootstrapping procedure was applied (Chin, 1998; Efron & Tibshirani, 1993; Tenenhaus, Vinzi, Chatelin, & Lauro, 2005) in order to draw inferential conclusions. The number of bootstrap subsamples needs to be higher than the number of valid observations in the original dataset (in this study, higher than 1344). As a rule, 5000 bootstrap samples are recommended in Partial Least Squares SEM (Hair et al., 2014). The number of cases used for each randomly chosen bootstrap sample is the same as the number of cases used in the analysis (1344 cases in this study).
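In outline, the bootstrap significance test divides the full-sample estimate by the standard deviation of the estimates across resamples. The sketch below applies this to a single standardised path on simulated data; it is a miniature of the procedure, not the SmartPLS implementation, and it uses fewer resamples than the 5000 recommended above to keep it quick.

```python
import numpy as np

rng = np.random.default_rng(2)

def std_path(x, y):
    """Standardised path coefficient for a single predictor (the correlation)."""
    xs = (x - x.mean()) / x.std(ddof=1)
    ys = (y - y.mean()) / y.std(ddof=1)
    return (xs * ys).sum() / (len(x) - 1)

def bootstrap_t(x, y, n_boot=1000):
    """Bootstrap t-value: full-sample estimate divided by the standard
    deviation of the estimates across bootstrap resamples of the cases."""
    n = len(x)
    boots = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)       # resample cases with replacement
        boots[b] = std_path(x[idx], y[idx])
    est = std_path(x, y)
    return est, est / boots.std(ddof=1)

# Simulated data with a true standardised path of about 0.6
x = rng.normal(size=400)
y = 0.6 * x + rng.normal(scale=0.8, size=400)
est, t = bootstrap_t(x, y)
print(t > 1.96)  # True: the path is significant at the 5% level
```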

12.2.4.2 Measurement Model Evaluation Criteria for WAS Model

Reliability, convergent and discriminant validity of the WAS. The reliability of the reflective measures at the first order was tested using model-based reliability coefficients, known in Partial Least Squares SEM as composite reliability (Chin, 1998; Fornell & Larcker, 1981). The conventional coefficient alpha was compared with these values. Composite reliability takes into account the different outer loadings of the indicator variables and therefore reflects the reliability better than internal consistency coefficients such as coefficient alpha (Hair et al., 2014). Similarly to omega, the composite reliability coefficient is defined as the ratio F/(F+E), where F is the squared sum of the factor loadings and E is the sum of the error variances. As was the case for omega, composite reliability is a model-based reliability coefficient, and values between 0.70 and 0.90 are satisfactory (Nunnally & Bernstein, 1994).


Internal consistency reliability coefficients such as coefficient alpha are not appropriate for the second- and third-order formative constructs. For the higher-order model of WAS, where the first-order constructs are reflective, model-based reliability was calculated only for the first-order constructs; as explained by Edwards (2001), reliability is not an issue for the higher-order formative constructs. Instead, validity is critical for formative constructs. According to Bollen and Lennox (1991), if the path from each subconstruct, considered as an indicator of its corresponding formative construct, is significant, then the validity of the formative construct is confirmed. The significance of all these path coefficients demonstrates the validity of the formative model for this construct.

Another part of the validation process was to find out how distinguishable the constructs were (discriminant validity). As emphasised by Campbell and Fiske (1959, p. 84), "One cannot define without implying distinctions, and the verification of these distinctions is an important part of the validation process." One procedure for evaluating discriminant validity requires assessing the intercorrelations of the constructs. If the square root of the average variance extracted (AVE) exceeds the estimates of the intercorrelations of a construct with the other constructs, discriminant validity is supported (Chin, 1998; Fornell & Larcker, 1981).

In a previous chapter, the convergent validity of a construct was defined as "the extent to which a measure correlates positively with alternative measures of the same construct" (Hair et al., 2014, p. 102). The average variance extracted (AVE) can be used to test for convergent validity (Fornell & Larcker, 1981), with a cut-off point of greater than 0.50 required to demonstrate acceptable convergent validity.


13

STUDY 3: RESULTS

In this chapter the correctly specified reflective-formative model of WAS was fitted and evaluated using the Partial Least Squares SEM procedure described in Chapter 12. The results were then compared with those for the misspecified models of WAS to demonstrate the consequences of misspecification.

13.1 Results of Model Fit Evaluation

SmartPLS was employed to estimate the inner and outer first-, second-, and third-order loadings. Tables 13.1 and 13.2 show the reliability results and the convergent and discriminant validity of the first-order constructs of WAS. Table 13.1 presents the standardised coefficients, the Average Variance Extracted (AVE) for the first-order constructs, the model-based reliability at construct level, and the conventional coefficient alpha reliability. The model-based reliability measures for the constructs are higher than the conventional coefficient alpha. The model-based reliability of all constructs greatly exceeds the minimum acceptable level of .70, demonstrating very good reliability. The AVE of all constructs, with the exception of training and harassment, exceeds the cut-off point of .50, suggesting convergent validity for all but these two constructs. Most importantly, all the lower-order loadings were significant.

Table 13.2 presents the intercorrelations of the first-order constructs along with the square roots of their Average Variance Extracted (AVE) for assessing the discriminant validity of the constructs. The results confirm that discriminant validity of the first-order constructs exists, because the square root of the AVE for each construct is higher than any of its intercorrelations with the other constructs.


Table 13.1
Quality Criteria of the Reflective-formative WAS First-order Constructs using PLS-SEM

Latent Variable (Indicator: Loading) | AVE | Model-based Reliability | Coefficient Alpha | Convergent Validity
LEISURE (Q56d: 0.68; Q56h: 0.68; Q56i: 0.78) | 0.52 | 0.77 | 0.55 | Yes
HOME-WORK (Q51c: 0.92; Q51d: 0.89) | 0.82 | 0.90 | 0.78 | Yes
WORK-HOME BALANCE (Q51a: 0.94; Q51b: 0.93) | 0.88 | 0.94 | 0.86 | Yes
PHYSICAL HEALTH (Diagnosis: 0.64; Q37: 0.89) | 0.61 | 0.75 | 0.39 | Yes
MENTAL HEALTH (Q47a: 0.89; Q47b: 0.89; Q47c: 0.79) | 0.74 | 0.90 | 0.83 | Yes
CONTROL (Q15a: 0.67; Q15b: 0.68; Q15c: 0.65; Q19a: 0.68; Q19b: 0.72; Q19c: 0.76) | 0.50 | 0.85 | 0.79 | Yes
TRAINING (Q30a: 0.53; Q30c: 0.77; Q30d: 0.63; Q30e: 0.68) | 0.44 | 0.76 | 0.59 | No
HARASSMENT (Q42a: 0.74; Q42c: 0.63; Q42d: 0.73; Q42h: 0.68) | 0.49 | 0.79 | 0.65 | No
SUPPORT (Q27a: 0.89; Q27e: 0.90; Q27f: 0.89) | 0.80 | 0.92 | 0.87 | Yes
RESPECT (Q22c: 0.89; Q22d: 0.93; Q22e: 0.87) | 0.81 | 0.93 | 0.89 | Yes
TRUST (Q24e: 0.85; Q24f: 0.89; Q24g: 0.90) | 0.78 | 0.92 | 0.86 | Yes

Table 13.2
Intercorrelation Analysis and the Square Roots of AVE of First-order Constructs of the Reflective-formative PLS-SEM Model †

(Discriminant validity: YES for every construct)

                  LEIS   H-W    W-H    PHYS   MENT   CONT   TRAIN  HARAS  SUPP   RESP   TRUST
LEISURE           0.72
HOME-WORK         0.11   0.91
WORK-HOME         0.21   0.23   0.94
PHYSICAL HEALTH   0.20   0.11   0.21   0.78
MENTAL HEALTH     0.29   0.20   0.31   0.34   0.95
CONTROL           0.10  -0.01   0.19   0.10   0.19   0.71
TRAINING          0.05  -0.07   0.01   0.01   0.09   0.21   0.67
HARASSMENT        0.04   0.05   0.24   0.15   0.20   0.16   0.07   0.70
SUPPORT           0.10   0.00   0.20   0.09   0.15   0.30   0.37   0.24   0.89
RESPECT           0.10   0.09   0.30   0.20   0.26   0.37   0.20   0.41   0.53   0.90
TRUST             0.11   0.10   0.30   0.18   0.24   0.32   0.19   0.35   0.52   0.78   0.89

† The square roots of the average variance extracted (AVE) are on the diagonal (in bold in the original).


Upon confirming the validity and reliability of the first-order constructs, the next step involved the assessment of the validity of the second- and third-order formative constructs. As mentioned previously, reliability is not meaningful for formative constructs; instead, the significance of the predictor paths (the paths from the subconstructs to their corresponding formative construct) is important. Tables 13.3 and 13.4 present the path coefficients of the subconstructs for the higher-order constructs, confirming that all these paths are significant, and hence that all these formative constructs are valid, supporting Hypothesis 12.1.

Table 13.3 The Standardised Mean Coefficients of the Second-order Formative Constructs of the Reflective-formative PLS-SEM Model (n=5000 bootstrap samples)

Path                              Standardised Path       T Value   Support
                                  Coefficients (Mean)
LEISURE -> PERSONAL               0.31                    56.82     YES
HOME-WORK -> PERSONAL             0.34                    55.95     YES
WORK-HOME -> PERSONAL             0.23                    53.81     YES
PHYSICAL HEALTH -> PERSONAL       0.20                    50.33     YES
MENTAL HEALTH -> PERSONAL         0.63                    59.63     YES
CONTROL -> ORGANISATIONAL         0.74                    64.46     YES
TRAINING -> ORGANISATIONAL        0.19                    54.31     YES
HARASSMENT -> ORGANISATIONAL      0.33                    58.62     YES
SUPPORT -> ORGANISATIONAL         0.34                    67.59     YES
RESPECT -> ORGANISATIONAL         0.37                    61.33     YES
TRUST -> ORGANISATIONAL           0.39                    62.67     YES

Note: p<0.05


Table 13.4 Results for Third-order Formative Constructs of the Reflective-formative WAS (n=5000 bootstrap samples)

Path                    Standardised Path       T Value   Support
                        Coefficients (Mean)
ORGANISATIONAL -> WAS   0.67                    74.79     YES
PERSONAL -> WAS         0.50                    52.48     YES

Note: p<0.05

Both Table 13.4 and Figure 13.1 demonstrate the significant path coefficients for the reflective-formative WAS model, estimated using bootstrapping (n=5000). The path coefficients for organisational capacity (β=0.67) and personal capacity (β=0.50) suggest that organisational capacity is a slightly stronger component of WAS than personal capacity.


Figure 13.1. The final model of reflective-formative WAS development using PLS path modeling.

To evaluate the next hypothesis (12.2) and to demonstrate the possible Type I and II errors resulting from measurement model misspecification, the misspecified models of WAS (i.e., reflective-reflective and formative-formative) were evaluated using PLS-SEM. In addition, Covariance-based SEM was applied to evaluate the reflective-reflective model of WAS, to establish whether the difference in model or the difference in estimation method was responsible for the differences in the results. Unfortunately, due to an identification problem, the formative-formative model of WAS could not be evaluated using the MIMIC method with the Covariance-based SEM procedure. The full details of the results and a step-by-step guide to evaluating the misspecified models are presented in Appendix E. The next section presents a comparison of the path coefficients and reliability coefficients of the misspecified models with those of the correctly specified model of WAS, fitted using PLS-SEM.

13.2 Comparison of the Misspecified Models with the Correctly Specified WAS Model

In this analysis, four sets of coefficients were compared: the misspecified reflective-reflective model (fitted using both CB-SEM and PLS-SEM), the misspecified formative-formative model, and the correctly specified reflective-formative WAS model. The full analysis and results are presented in Appendix B. Table 13.5 presents a comparison of the path coefficients from all four analyses. The results showed that, in comparison with the correctly specified reflective-formative model, the paths of the misspecified reflective-reflective models were highly inflated, regardless of the estimation procedure used (i.e., CB-SEM or Partial Least Squares SEM). Conversely, in the misspecified formative-formative model, the path coefficients were highly deflated, especially for the lower-order constructs. These inflated (in the reflective misspecified models) and deflated (in the formative misspecified model) path coefficients lead to Type I and Type II errors respectively.


Table 13.5 Comparing the Standardised Path Coefficients of Misspecified and Correctly Specified WAS Models

                                  Misspecified models                              Correctly specified
Constructs                        1) Reflective-   2) Reflective-   3) Formative-   4) Reflective-
                                     Reflective       Reflective       Formative       Formative
                                     CB-SEM           PLS-SEM          PLS-SEM         PLS-SEM
LEISURE -> PERSONAL                  0.44             0.55            -0.02†           0.31
HOME-WORK -> PERSONAL                0.34             0.46            -0.01†           0.34
WORK-HOME -> PERSONAL                0.57             0.65             0.70            0.23
PHYSICAL HEALTH -> PERSONAL          0.68             0.53             0.20            0.20
MENTAL HEALTH -> PERSONAL            0.71             0.80             0.40            0.63
CONTROL -> ORGANISATIONAL            0.53             0.58             0.24            0.74
TRAINING -> ORGANISATIONAL           0.32             0.39            -0.15            0.19
HARASSMENT -> ORGANISATIONAL         0.52             0.50             0.36            0.33
SUPPORT -> ORGANISATIONAL            0.65             0.74             0.04†           0.34
RESPECT -> ORGANISATIONAL            0.95             0.87             0.33            0.37
TRUST -> ORGANISATIONAL              0.92             0.84             0.36            0.39
ORGANISATIONAL -> WAS                0.72             0.90             0.68            0.67
PERSONAL -> WAS                      0.62             0.71             0.51            0.50

Note: † non-significant paths.

In contrast, the model-based reliability coefficients of the misspecified reflective-reflective CB-SEM model show a downward (deflating) bias compared to the correctly specified reflective-formative WAS model fitted using PLS-SEM (Table 13.6). This is primarily due to the shared measurement errors in reflective second- and third-order models. In the reflective-formative model, the first-order constructs predict the second-order construct, preventing measurement error from being shared beyond the first-order loadings. As expected, the reflective-reflective WAS model evaluated with Partial Least Squares SEM showed the same reliability coefficients as the correctly specified reflective-formative WAS model. This occurred because in Partial Least Squares SEM the reliability coefficients of the first-order constructs are evaluated in isolation; misspecification of the second- or third-order constructs therefore does not affect the reliability of the first-order constructs. The reliability coefficients for the misspecified formative-formative model of WAS were not calculated because, as stated previously, reliability is meaningless for formative constructs, where the indicators are predictors of the construct.


Table 13.6 Comparing the Model-based Reliability Coefficients of a Misspecified Reflective-reflective WAS (CB-SEM) with the Correctly Specified Reflective-formative Model of WAS

Latent Variable      Indicators                             Reflective-reflective    Reflective-formative
                                                            WAS (CB-SEM)             WAS (PLS-SEM)
                                                            Model-based reliability  Model-based reliability
LEISURE              Q56d, Q56h, Q56i                       0.59                     0.77
HOME-WORK            Q51c, Q51d                             0.80                     0.90
WORK-HOME BALANCE    Q51a, Q51b                             0.86                     0.94
PHYSICAL HEALTH      No of conditions, Q37                  0.42                     0.75
MENTAL HEALTH        Q47a, Q47b, Q47c                       0.83                     0.90
CONTROL              Q15a, Q15b, Q15c, Q19a, Q19b, Q19c     0.89                     0.85
TRAINING             Q30a, Q30c, Q30d, Q30e                 0.59                     0.76
HARASSMENT           Q42a, Q42c, Q42d, Q42h                 0.66                     0.79
SUPPORT              Q27a, Q27e, Q27f                       0.87                     0.92
RESPECT              Q22c, Q22d, Q22e                       0.89                     0.93
TRUST                Q24e, Q24f, Q24g                       0.86                     0.92


14

STUDY 3: DISCUSSION

This chapter provides a discussion of the results obtained in Chapter 13. While validation of reflective models is common in the literature, there is little work on the validation and assessment of model-based reliability for measurement models containing formative constructs. The purpose of Study 3 was to illustrate empirically the fitting, validation, and model-based reliability assessment of a reflective-formative model for the Work Ability Scale (WAS), using Partial Least Squares SEM. The Work Ability Scale is misspecified in the literature as a fully reflective model. The proposed reflective-formative model of WAS is a correctly specified model based on the theory described in Chapter 5 and the related decision-making tree developed in Chapter 5 (Figure 5.5). The secondary aim of this study was to demonstrate the likelihood of Type I and II errors occurring. This was achieved by comparing the correctly specified model with a misspecified reflective-reflective model (fitted using Partial Least Squares SEM and Covariance-Based SEM) and a misspecified fully formative-formative model (fitted using Partial Least Squares SEM). Unfortunately, due to identification issues, evaluation of the formative-formative model using Covariance-Based SEM was not possible, allowing only an evaluation of the formative-formative model fitted using Partial Least Squares SEM.

The proposed revised reflective-formative WAS model was based on the work of the Redesigning Work for an Ageing Society research program at Swinburne University of Technology (2009) and was evaluated using AMOS and SmartPLS. The three different models of WAS (reflective-reflective, formative-formative and reflective-formative) were built using the PLS path modelling approach, employing the repeated indicators approach and the two-stage approach (Becker et al., 2012; Hair, Hult, Ringle, & Sarstedt, 2014; Wold, 1982). Based on the literature, applying the repeated indicator approach in two stages creates less bias, more reliable parameter estimates/scores, and a more precise estimation of the path coefficients of constructs (Becker et al., 2012).

The results of fitting the reflective-formative model for WAS using Partial Least Squares SEM showed that the proposed second-order model in this empirical illustration contained relatively valid indicators and predictors. The t-statistics generated by bootstrapping also showed significant paths for both organisational and personal capacities. The correctly specified model demonstrated acceptable discriminant and convergent validity (with exceptions in the case of the training and harassment subconstructs).

The model-based reliability measures were acceptable for the first-order reflective constructs. The internal consistency coefficients alpha were clearly lower than the model-based reliability coefficients of the first-order constructs. The main reason for this underestimation is thought to be the assumption of essential tau-equivalence made in calculating coefficient alpha. Coefficient alpha is also only a lower bound for reliability and therefore underestimates the true reliability (Graham, 2006).
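The gap between coefficient alpha and a model-based (omega-type) coefficient can be illustrated directly from a set of standardised loadings. The sketch below uses hypothetical loadings (not WAS estimates) for a congeneric scale, where the tau-equivalence assumption fails; alpha is computed from the model-implied covariance matrix and compared with the model-based coefficient.

```python
import numpy as np

# Hypothetical standardised loadings for a congeneric (non-tau-equivalent)
# 4-item scale; unique variances follow from standardisation.
lam = np.array([0.9, 0.8, 0.6, 0.5])
theta = 1 - lam ** 2          # unique (error) variances
k = len(lam)

# Model-based (omega) reliability: squared sum of loadings over total variance.
omega = lam.sum() ** 2 / (lam.sum() ** 2 + theta.sum())

# Coefficient alpha computed from the model-implied covariance matrix.
sigma = np.outer(lam, lam)    # off-diagonal: model-implied item covariances
np.fill_diagonal(sigma, 1.0)  # standardised item variances
alpha = (k / (k - 1)) * (1 - np.trace(sigma) / sigma.sum())

print(f"omega = {omega:.3f}, alpha = {alpha:.3f}")
```

With unequal loadings, alpha falls below omega, consistent with the underestimation described above; the two coincide only when all loadings are equal (essential tau-equivalence).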

Comparisons of the model-based reliability coefficients of the correctly specified model of WAS with those of its misspecified reflective-reflective model produced thought-provoking results. The model-based reliability for the misspecified model was lower than for the correct model. These results are important for several reasons. A comparison of the correlations between the first-order constructs shows that the misspecified model overestimated the correlations among the constructs relative to the correct model. The results support the claim in the literature that underestimated reliability coefficients inflate the correlations between constructs (e.g., Fan, 2003; Revelle & Zinbarg, 2008). As determined by these scholars, underestimating the reliability results in an overestimation of the correlations among constructs, and vice versa. Revelle and Zinbarg (2008) stress that selecting the proper reliability coefficient is important in multidimensional studies. The findings of this study add to their recommendation: researchers should also pay attention to the specification of the measurement models. Even if proper model-based reliability coefficients are used, a model that is misspecified (in terms of its reflective vs formative nature) leads to biased reliability estimates.

The comparison of the correct and misspecified models showed inflated and deflated path coefficients when the model was misspecified as reflective or formative respectively. When the reflective-reflective misspecified models (fitted using Covariance-Based SEM and Partial Least Squares SEM) were compared with the correctly specified reflective-formative model, the inflated path coefficients reported for the majority of the paths presented a higher likelihood of Type I error. Based on the above comparison of reliabilities, this inflation of the inter-relationships is related to the lower reliabilities found in the misspecified reflective-reflective model. Nonsignificant and deflated coefficients were reported for some paths when the fully formative-formative model was compared with the reflective-formative model, demonstrating the presence of Type II error (failing to detect a true effect). Based on the simulation study of Jarvis et al. (2003), when the structural paths originate from a construct misspecified as reflective, there is a high possibility of inflated path estimates, resulting in Type I error.

As mentioned in the reliability discussion, since reliability is underestimated in reflective constructs, the path coefficients are inflated compared to those of the formative constructs (Fan, 2003). In both misspecified reflective-reflective models, fitted using Covariance-Based SEM and Partial Least Squares SEM, the path coefficients were therefore inflated compared to the correctly specified reflective-formative model. This is consistent with previous empirical and simulation studies (Aguirre-Urreta & Marakas, 2008, 2012; Jarvis et al., 2003; Law & Wong, 1999; MacKenzie et al., 2005; Petter et al., 2007). As in previous studies, when a formative model was misspecified as reflective, the misspecified constructs upwardly biased the coefficients of the model (Petter et al., 2011).

Any bias in a study leads to misleading conclusions, and it is therefore critical to pay attention to model specification (Petter et al., 2011). The substantial level of misspecification in the area of psychology identified in Chapter 5 (18%) demonstrates the need to pay greater attention to model specification in order to achieve reliable results.

These findings are important not only for this specific example but also for future studies, opening a new area of study necessitating further research and development.


14.1 Implications for Work Ability Assessments

A validated scale of work ability would have many practical benefits. The work ability concept and the Work Ability Index have far-reaching and strategic benefits for work organisations, resulting in better productivity. Specific benefits of the concept include early prediction of work disability, initiation of preventive procedures, recognition of work ability status and the need for promotion (Daws, 2012; Ilmarinen, 2010).

The concept of work ability has advanced significantly from the original research on the Work Ability Index due to the multidimensional holistic view provided by the work ability model. According to scholars, work ability research in the future will include some of the following (Daws, 2012; Ilmarinen, 2010):

• utilisation of a multidimensional work ability model with a link between research and practice;
• development of new work ability measures with better capacities for the identification of problems;
• comprehensive evaluation of effects of interventions;
• development of national and international work ability networks;
• development of national surveys and the creation of datasets;
• international studies of long-term effects on the Third Age (silent and boom generations) using a generational framework of analysis;
• improvement of tools for training; and
• development of curricula for occupational gerontology at universities.

The concept of work ability provides an all-inclusive and evidence-based framework for quality of work life as well as positive ageing. However, major attitudinal, managerial and occupational health and safety (OH&S) reforms are needed in the modern work-life environment (Ilmarinen, 2010; Taylor, Sep 2008).


The importance of the workplace as a component of quality of life is well known. Effective evaluation of work ability, appropriate management and supervision of workers, and the improvement of work ability and occupational well-being to achieve a win-win situation are the key ingredients. While the work ability concept is primarily concerned with the working population, it is equally important to maintain the work ability of the unemployed.

Population ageing in many countries has led to concerns about labour supply, thus giving rise to an increasing emphasis on prolonging working life (Taylor, Sep 2008). The creation of a ‘golden age’ for older workers requires overcoming an early retirement mentality, changing business behaviour and attitudes among the social actors, and instituting new public policies.

In the meantime, we need reliable information based on follow-up studies or data from workplaces. We also need international comparisons of the work ability of populations and, more particularly, we need to identify the factors that maintain and promote work ability (Gould et al., 2008). Estimates of the work ability of different populations are required to support decision-making on health, work, and pension policies.

One of the critical challenges is for studies to focus on the future – “How can we find the best predictors for the development of the population’s (future) work ability?”

14.2 Limitations and Directions for Future Research

Part of the focus of this study was to clarify the difference between reflective and formative measurement models. The literature review revealed that there are some serious misspecification problems in the field of organisational psychology, and lack of knowledge could be one of the main reasons for misspecified models. Based on the literature, a framework for identifying formative vs. reflective models was presented in Chapter 5 to help researchers better identify the most appropriate type of measurement model for their constructs. The proposed decision-making framework is easy to understand and at the same time comprehensive, and should therefore be of benefit to researchers.

However, some important issues regarding the identification of formative vs. reflective models still need to be resolved. In some cases, the relevance of reflective/formative constructs may differ according to the group (e.g., gender, occupation level, etc.) or situation. Further studies are required to shed more light on such specific group/situation complexities.

The difficulty of fitting models for formative constructs using Covariance-Based SEM is another hurdle in choosing formative constructs. Some of the well-known solutions include using Monte Carlo simulations and MIMIC models, in which the reflective-formative models are expanded by adding reflective indicators for the higher-order latent constructs. Despite MIMIC being a suitable procedure for the identification of formative measures in most cases, this solution is criticised in the literature. With this procedure the formative construct ƒ is replaced with F (a standard common factor), resulting in the deterioration of the intended meaning of the formative construct, which is formed by its antecedents (Treiblmaier, Bentler & Maira, 2011). More importantly, it is not clear how to use this method with a third-order model, such as that considered in this study. This problem is solved by using PLS-based modelling for formative constructs instead of the more popular covariance-based SEM. A more recent solution for the estimation of reflective-formative models was proposed by Treiblmaier, Bentler and Maira (2011), who suggested substituting ƒ with F with minimal manipulation. In this procedure, using canonical correlation in a two-step approach, the items belonging to each formative construct are split into two (or more) composites. The newly developed canonical constructs can then be treated as common reflective factors and placed into any reflective SEM model (Treiblmaier, Bentler, & Maira, 2011). However, further studies are needed to shed more light on the estimation problems encountered when fitting formative constructs.
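The canonical-correlation mechanics behind this two-step idea can be sketched as follows. This is an illustration only, not the Treiblmaier et al. procedure in full: the data are simulated, the split of items into two composites is arbitrary, and the helper function is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data: six items of a hypothetical formative construct, split into two
# sets of three items each.
n = 500
f = rng.normal(size=n)                        # underlying construct score
items = 0.6 * f[:, None] + rng.normal(scale=0.8, size=(n, 6))
X, Y = items[:, :3], items[:, 3:]

def first_canonical_pair(X, Y):
    """First pair of canonical variates via whitened cross-covariance SVD."""
    X = X - X.mean(0); Y = Y - Y.mean(0)
    Sxx, Syy = np.cov(X.T), np.cov(Y.T)
    Sxy = (X.T @ Y) / (len(X) - 1)

    def inv_sqrt(S):
        # Inverse square root of a symmetric positive-definite matrix
        w, V = np.linalg.eigh(S)
        return V @ np.diag(1 / np.sqrt(w)) @ V.T

    K = inv_sqrt(Sxx) @ Sxy @ inv_sqrt(Syy)
    U, s, Vt = np.linalg.svd(K)
    a = inv_sqrt(Sxx) @ U[:, 0]
    b = inv_sqrt(Syy) @ Vt[0]
    return X @ a, Y @ b, float(s[0])   # variates and canonical correlation

u, v, rho = first_canonical_pair(X, Y)
print(f"first canonical correlation = {rho:.2f}")
```

The two canonical variates u and v could then, in the spirit of the second step, serve as a pair of reflective indicators of a common factor F in a standard SEM.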

The results of both studies (Study 1 and Study 2) showed that conventional coefficient alpha is not the best method for the estimation of internal consistency. Further studies should report model-based reliability coefficients especially for multidimensional scales like WAS.


15

SUMMARY

In this final chapter of the thesis, a summary of each study and its main contributions to the literature, as well as a general concluding summary, is presented.

15.1 Study 1: Model-Based Reliability, Validity and Cross Validity of the Bifactor Model for WOAQ

In this study, attempts were made to assess the validity, cross-validity and reliability of the WOAQ in two Australian health settings, the community nursing and paramedic industries.

Based on the literature, a robust procedure of bifactor modeling was adopted for assessing the validity of the WOAQ, which was then compared with a higher order model. Cross-validity procedures, using mean and covariate structures (MACS), were adopted to evaluate the invariance across gender in regard to covariance structure and observed means. Also the means at construct level in the bifactor model were evaluated. This is a neglected area in the literature.

To estimate robust and more accurate reliability coefficients, model-based omega reliability coefficients were used instead of the conventional internal consistency measure (coefficient α).

In general, the results showed that the WOAQ appears to be a superior instrument for assessing risk factors due to its satisfactory psychometric properties and short length. It was also demonstrated that a bifactor model of WOAQ fits the data better than a higher-order model. A bifactor model of WOAQ provides more information, not only for the general (overall) WOAQ factor but also for its nested factors and their relative importance in a given setting. In this study, it was documented that the general WOAQ factor has more importance than its nested factors in a health setting. This result may be related to the differences in model structure for a nursing/paramedic setting compared to a manufacturing setting.

The WOAQ was initially developed as part of a risk assessment in the manufacturing sector, where direct line management is important. However, relationships with colleagues are more important in the health sector. Therefore the nested factors on management can be expected to be less important than relationships with colleagues in health settings.

The path loadings for some of the nested constructs were low, indicating that a dominant proportion of the variation within each indicator is attributable to the general factor of WOAQ rather than to its nested subfactors. Therefore, it is recommended that future studies consider WOAQ as a single general score even though it contains five nested factors. Methodologically, calculating only a separate single score for each of the nested factors does not appear to be a good choice in such settings.

This recommendation is further supported by the results of the omega model-based reliability at both the general and subscale levels. These model-based reliabilities were acceptable, though they demonstrated more reliability for the general factor of WOAQ than for its subscales. The general factor of WOAQ accounted for the largest portion of the variance compared to the nested factors, especially the 'relationship with management' and 'reward and recognition' subscales. It is therefore suggested that, for the current sample of a community nursing service, it is more appropriate to use and report the general factor of WOAQ rather than its nested factors in isolation. When the omega coefficients were compared with the conventional coefficient alpha, the results showed that coefficient alpha overestimated reliability. It is therefore recommended that in future studies researchers should by default use model-based reliability coefficients.

15.2 Study 2: Applications of Covariate-dependent Reliability

Study 2 presented two applications of Bentler's newly proposed covariate-dependent and covariate-free reliability approach. The applications demonstrated in this study were the reliability assessment of the WOAQ and the role of occupation type, and the effects of CMB on reliability. Using the covariate-dependent and covariate-free approach, it was demonstrated that although the WOAQ showed acceptable reliability in a nursing and a paramedic organisation separately, when these samples were combined a considerable proportion of the WOAQ variance was attributable to the organisation type. Surprisingly, the results showed that although 'within' organisation reliability exists for the WOAQ, 'between' organisation assessments failed to demonstrate a high degree of reliability between the nursing and paramedic samples. The reasons for such differences in reliability were explained in terms of the differences between these organisations: their demographic characteristics, the different pace of work, different work settings, and different ways of interacting with patients and delivering services. Scholars often neglect to perform reliability assessments for their scales, even when the scales are being used for the first time in a new setting. The WOAQ is one example of many scales that are highly influenced by the type of organisation and/or the demographic characteristics of the population.

The second application of Bentler's covariate-dependent approach was demonstrated in the context of CMB. This new procedure was proposed for assessing the effects of CMB on the reliability of a model. It appears that CMB had a marked effect on the reliability of the model considered in this study, and this seems to be an interesting area for further research. The presence of CMB was backed up by further analysis using a confirmatory factor analysis (CFA) marker approach for controlling for CMV/CMB. The results supported the presence of CMB in this application, therefore supporting the influence of CMB on the reliability of the scales. This is an important finding; if scholars using covariate-dependent reliability assessment can show any CMB effect, then they must control for CMB in the rest of their analysis using a marker variable or some other approach. Covariate-dependent reliability assessment therefore provides a new, quick and easy method for testing for CMB/CMV.

Focusing on the causes and consequences of CMB, the use of preventive procedures as well as statistical procedures is recommended as a better way to prevent and control for the possible effects of CMV/CMB. In this study, using a preventive procedure, one of the potential common method biases (social desirability) was detected and measured. Using statistical procedures, unmeasured sources of possible bias (CMV) were also controlled and evaluated.

15.3 Study 3: Model-based Reliability and Validity of Reflective-formative Model of WAS

In Study 3, a comprehensive statistical and theoretical explanation of the differences between formative and reflective models was presented. Then, based on the literature, a comprehensive, simple and easy-to-follow decision-making tree was proposed to easily distinguish reflective from formative models. The next aim was to illustrate how big the misspecification problem is in the area of organisational psychology. Although a few literature reviews have been carried out in other disciplines highlighting the misspecification rate, no such study had been carried out in psychology. Given that scholars in the psychology discipline usually hold strong statistical knowledge, there was an expectation of a lower level of error in this area. Using this decision-making tree, a broad review of two top Organisational Psychology journals was undertaken over a nine-year period (2006-2014). The two researchers found a high level of agreement (Kappa=.89) in distinguishing misspecified models using the proposed decision-making tree. An 18 percent misspecification rate was found in this review of organisational psychology journal articles.

One of the main reasons for misspecification could be a lack of knowledge or problems in fitting formative models. The majority of the readily available software for SEM is designed for fitting reflective models. Therefore the main aim of Study 3 was to empirically fit and evaluate the validity and model-based reliability of a mixed reflective-formative model for WAS, which is misspecified in the literature as a reflective model. The second aim was to demonstrate the outcomes of model misspecification and the likelihood of Type I and II errors.

As a first step, it is important to design and distinguish the structure of a measurement model before commencing the data collection. Using the literature and conceptual background of the constructs, it should be determined at the outset whether the constructs are formative or reflective. The decision flowchart was applied in the context of a work ability survey (WAS).

Using empirical data, an evaluation of the reflective-formative, formative-formative and reflective-reflective higher-order models was performed. In this evaluation, two model-fitting procedures (CB-SEM and PLS-SEM) were used for all models. Unfortunately, due to identification problems, the evaluation of the formative-formative model using CB-SEM was not possible. Two common procedures, the repeated indicators approach and a two-stage path-modelling approach, were used for fitting the three models using PLS-SEM with SmartPLS software. The fitted models showed major differences between the correctly specified reflective-formative model and the incorrectly specified reflective-reflective and formative-formative models. For the incorrectly specified reflective-reflective model, the structural paths were significantly inflated compared to the correctly specified reflective-formative model, suggesting a higher probability of Type I errors. Interestingly, this was more of a problem when PLS-SEM was used to fit the reflective-reflective model than when CB-SEM was used. The comparison of the incorrectly specified fully formative-formative model with the correctly specified reflective-formative model showed some deflated/nonsignificant loadings. This was more evident for the lower-order constructs, suggesting a higher probability of Type II errors. These findings exhibited empirically the dangers of model misspecification.

It is highly recommended that scholars specify their measurement models with more caution in order to avoid Type I or II errors. The nature of the constructs needs to be identified before the model-fitting software is chosen. The theoretical background should always be considered as a first step in identifying and conceptualising the nature of the constructs (reflective vs formative or mixed).

15.4 Thesis contributions to SEM

The contributions of this thesis and directions for future research are discussed in detail at the end of each chapter. A summary of the contributions of the thesis, specifically in regard to the SEM discipline, is presented below. The findings of the three studies undertaken in this project contribute to SEM by:

- Path diagrams showing the history of SEM and model-based reliability. An overview of the literature on SEM and model-based reliability in psychology was provided using two simple, yet comprehensive path diagrams. In Chapter 2, an overview of the development of SEM in psychology was presented using a path diagram (Figure 2.1). Similarly, in Chapter 3 a history of model-based reliability was presented using a path diagram (Figure 3.1). This diagram illustrates the history, recent developments and current gaps in the literature, and highlights some justifications for carrying out the studies in this thesis. These diagrams can be utilised as effective training tools for both statistics and psychology students to better understand the early roots of SEM and how more recent SEM developments relate to each other.

- Validating a bifactor model in a health setting using SEM. A comprehensive procedure for the validation of a bifactor measurement model was assessed in study 1 using SEM. Study of bifactor models and their implications is a neglected area in the literature and especially in the psychology discipline. These findings shed more light on this poorly investigated area.

- Model-based reliability of a bifactor scale. Calculating and comparing the model-based reliability coefficients of a bifactor model with the overestimated conventional coefficient alpha demonstrated the importance of using model-based reliability for multidimensional constructs or complex scales.

- Cross-validation of a bifactor scale using the latent factor means and covariance structures (MACS) procedure in SEM. In Study 1, cross-validation of the bifactor measurement scale WOAQ was assessed across gender using MACS. The conventional procedure for the cross-validation of measurement models considers only covariances and observed means. Using MACS, the cross-validation goes beyond observed parameter invariance assessment and examines mean differences at the construct level. Applying this procedure to a bifactor model is a contribution to the SEM literature that provides a more comprehensive assessment of the validity of a scale in different populations.

- Presenting an empirical application of the novel concept of covariate-dependent reliability using SEM. Two new applications of covariate-dependent reliability were introduced for the first time in study 2. Using an empirical example, it is shown how 'type of occupation' can affect the reliability of a scale. As such, a tool that is highly reliable in one specific organisation might show very poor reliability in another organisation after controlling for confounding variables; for example, controlling for 'organisation type' reduces the reliability of a scale considerably. This procedure is expected to have many implications in the SEM discipline and for issues related to model-based reliability.
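The intuition behind this effect can be illustrated outside SEM with a toy calculation. The sketch below (plain Python with NumPy; the data, the two-level 'organisation type' covariate and the item structure are all hypothetical, and this is a simple group-mean illustration rather than Bentler's estimator) shows how pooling groups that differ on the trait inflates a conventional reliability estimate, while removing the group means (a 'covariate-free' view) lowers it:

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_obs, n_items) score matrix."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars / total_var)

rng = np.random.default_rng(0)
n, k = 500, 5
# Hypothetical data: a binary 'organisation type' covariate shifts the trait,
# adding shared variance to every item and hence inflating pooled reliability.
group = rng.integers(0, 2, n)
true_score = rng.normal(size=n) + 1.5 * group        # covariate-dependent trait
items = true_score[:, None] + rng.normal(scale=1.2, size=(n, k))

alpha_pooled = cronbach_alpha(items)
# 'Covariate-free' view: remove each observation's group means before
# computing reliability, so only within-group individual differences remain.
group_means = np.vstack([items[group == g].mean(axis=0) for g in group])
alpha_centered = cronbach_alpha(items - group_means)
print(alpha_pooled, alpha_centered)   # pooled estimate is the larger one
```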

- Demonstrating the novel application of covariate-dependent reliability in the evaluation of CMB using SEM. A novel approach is proposed in study 2 by drawing attention to the possible effects of CMB on the reliability of a model. In this study the covariate-dependent reliability procedure was applied to assess the effects of CMB (measured using a social desirability scale) on the reliability of a model. The results clearly highlight how CMB can influence the reliability of scales. This is a novel area of study and will have many applications in further studies.

- Developing a flowchart for distinguishing formative versus reflective models. Providing a simple, yet comprehensive guideline, in the form of a flowchart (Figure 5.5), for distinguishing between formative and reflective SEM measurement models was another contribution to SEM. With a clear guideline or procedure, researchers gain greater confidence in specifying the nature of new measurement models (e.g. formative models). A lack of knowledge or of clear rules and principles may otherwise create a high risk of misspecification in the field.

- Demonstrating the misspecification rate of SEM measurement models (formative vs. reflective) in the Organisational Psychology literature. As mentioned in the literature review of Chapter 5, lack of knowledge is one of the reasons for the misspecification of the measurement models used in the field of Organisational Psychology. Presenting a misspecification review covering a 9-year period of the Organisational Psychology literature will create some awareness of the extent of the problem. The results showed an 18% misspecification rate, suggesting a problem in this discipline, as is the case in some other disciplines. Misspecification of measurement models may lead to incorrect findings and the development of misleading theories.

- Presenting an empirical example of fitting a reflective, a formative and a reflective-formative model using a partial least squares-based SEM approach. The majority of the SEM software on the market is built mainly for conducting CB-SEM evaluations. However, fitting a formative model of any type using CB-SEM is difficult and usually results in identification problems. One solution in such situations is the use of a PLS-SEM program to fit formative models. This is still a new area of study and, as a result, there is limited knowledge of fitting and evaluation approaches. In study 3, three different types of SEM measurement model were fitted and evaluated using PLS-SEM. The procedures developed in this thesis for fitting higher-order models for mixed models using PLS-SEM therefore represent a significant methodological advance.

- Empirical comparisons of correctly specified versus misspecified measurement models. The majority of the studies in the SEM discipline compare different types of measurement models using simulation studies. In study 3, empirical data was used to evaluate different types of measurement model using two common approaches (CB-SEM and PLS-SEM).

- Assessing the likelihood of Type I and Type II errors as a result of measurement model misspecification. The results of study 3 clearly showed how misspecified models can lead to inflated, deflated or non-significant results, thereby increasing the risk of Type I and/or Type II errors. These findings highlight the importance of correctly specifying measurement models, thereby avoiding fundamental biases or errors, and contribute to increasing awareness among scholars of the consequences of measurement model misspecification.

These findings are all based on solid SEM theory and they are illustrated with empirical analyses, which are of interest in their own right.

15.5 Summary

Overall this thesis has shown interesting applications where SEM is used for evaluating model-based reliability and validity using both CB-SEM and PLS-SEM procedures. It has also highlighted the procedures and applications of model-based reliability and validity for under-investigated measurement models, such as the bifactor and mixed reflective-formative models. In particular, it has been shown how SEM makes possible the estimation of model-based reliability, covariate-dependent reliability and covariate-free reliability. In addition, this thesis has demonstrated the need for careful identification of the nature of constructs as formative or reflective in Organisational Psychology, and the usefulness of PLS-SEM for fitting formative models. Recent SEM developments suggest that the importance of SEM will grow in the future as its capabilities become more powerful. It is hoped that this thesis has contributed to this growth in a small way.


16 APPENDICES


PLEASE NOTE

The articles listed below are not able to be reproduced online. Please consult the print copy of this thesis held in the Swinburne library.

Karimi, L & Meyer, D 2015, ‘Validity and model-based reliability of the work organisation assessment questionnaire among nurses’, Nursing Outlook, vol. 63, no. 3, pp. 318-330, doi: 10.1016/j.outlook.2014.09.003

Karimi, L & Meyer, D 2014, ‘Structural equation modelling in psychology: the history, development and current challenges’, International Journal of Psychology Studies, vol. 6, no. 4, pp. 123-133, doi: 10.5539/ijps.v6n4p123

Karimi, L 2015 (in press), 'Cross-validation of the work organization assessment questionnaire across gender: a study of an Australian health organization', Journal of Occupational and Environmental Medicine.

16.5 EVALUATING A HIGHER-ORDER MISSPECIFIED REFLECTIVE MODEL OF WAS USING CB-SEM

Using the AMOS and CB-SEM approach, the misspecified reflective models of WAS were assessed using ML estimation, assuming normally distributed data. However, Mardia's multivariate kurtosis coefficient of 120.6 suggests that normality assumptions are not supported (DeCarlo, 1997). Therefore bootstrapping methods were used to determine bias-corrected confidence intervals for the parameter estimates. The bootstrap analysis indicated that the structural paths were significant. Based on the results from the second-order model in Figure 16.1 and according to Byrne (2009), this reflective model of WAS describes the data well (χ²/df = 2.54, CFI=.95, RMSEA=.03). The standardised path parameter estimates are presented in Figure 16.1 and Table 16.1. The loading for Organisational Capacity is clearly stronger than the loading for Personal Capacity, suggesting that Organisational Capacity is a more important component of work ability than Personal Capacity in the Australian context.
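For readers unfamiliar with the bootstrap idea behind those intervals, the sketch below implements a generic bias-corrected percentile bootstrap for a single estimate. It is not the AMOS procedure; the data, the slope statistic and the sample sizes are hypothetical stand-ins for a path coefficient estimated under non-normal errors:

```python
import numpy as np
from statistics import NormalDist

def bc_bootstrap_ci(x, y, stat, n_boot=2000, alpha=0.05, seed=1):
    """Bias-corrected percentile bootstrap CI for stat(x, y)."""
    rng = np.random.default_rng(seed)
    theta_hat = stat(x, y)
    n = len(x)
    boots = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, n)            # resample cases with replacement
        boots[b] = stat(x[idx], y[idx])
    nd = NormalDist()
    p = (boots < theta_hat).mean()
    p = min(max(p, 1 / n_boot), 1 - 1 / n_boot)  # keep inv_cdf in (0, 1)
    z0 = nd.inv_cdf(p)                         # bias-correction factor
    z = nd.inv_cdf(1 - alpha / 2)
    lo_q, hi_q = nd.cdf(2 * z0 - z), nd.cdf(2 * z0 + z)
    return np.quantile(boots, [lo_q, hi_q])

slope = lambda x, y: np.polyfit(x, y, 1)[0]    # stand-in for a path estimate
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 0.5 * x + rng.standard_t(df=3, size=200)   # heavy-tailed (non-normal) errors
lo, hi = bc_bootstrap_ci(x, y, slope)
print(f"95% bias-corrected CI: [{lo:.2f}, {hi:.2f}]")
```

Because the interval endpoints come from shifted quantiles of the bootstrap distribution rather than a normal-theory standard error, the method remains usable when, as here, multivariate normality fails.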


Figure 16.1. The standardised path parameter estimates of the misspecified reflective WAS model


Table 16.1 The Standardised Path Parameter Estimates for the Misspecified Reflective WAS Model Using the CB-SEM Procedure

Path                                             Estimate
LEISURE <--- PERSONAL CAPACITY                   .44
HOME-WORK BALANCE <--- PERSONAL CAPACITY         .34
WORK-HOME BALANCE <--- PERSONAL CAPACITY         .57
PHYSICAL HEALTH <--- PERSONAL CAPACITY           .68
MENTAL HEALTH <--- PERSONAL CAPACITY             .71
CONTROL <--- ORG. CAPACITY                       .53
TRAINING <--- ORG. CAPACITY                      .32
HARASSMENT <--- ORG. CAPACITY                    .52
SUPPORT <--- ORG. CAPACITY                       .65
RESPECT <--- ORG. CAPACITY                       .95
TRUST <--- ORG. CAPACITY                         .92
ORG. CAPACITY <--- WORK ABILITY INDEX (WAI)      .72
PERSONAL CAPACITY <--- WORK ABILITY INDEX (WAI)  .62

The reliability of the reflective WAS subfactors was assessed. The results are presented in Table 16.2. In summary, the majority of the subfactors produced acceptable levels of model-based reliability (CR>0.60) (Byrne, 2009), with the exception of "leisure" (CR=0.59), "physical health" (CR=0.42), and "training" (CR=0.59). Construct reliability will be discussed in more detail in the next chapter.
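The CR values quoted here follow the usual composite reliability formula for standardised loadings, CR = (Σλ)² / ((Σλ)² + Σ(1 − λ²)). As a quick check, the sketch below (plain Python) reproduces the "leisure" value from Table 16.2:

```python
def composite_reliability(loadings):
    """Composite reliability (CR) from standardised factor loadings."""
    lam = sum(loadings)
    theta = sum(1 - l ** 2 for l in loadings)  # standardised error variances
    return lam ** 2 / (lam ** 2 + theta)

leisure = [0.351, 0.586, 0.743]                # loadings from Table 16.2
cr_leisure = composite_reliability(leisure)
print(round(cr_leisure, 2))                    # → 0.59, below the 0.60 threshold
```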

Convergent validity is defined as "the extent to which a measure correlates positively with alternative measures of the same construct" (Hair et al., 2014, p. 102). A procedure for evaluating convergent validity uses the average variance extracted (AVE) (Fornell & Larcker, 1981). A cutoff of 0.50 is recommended, indicating that a construct accounts for more than 50 per cent of the variance of its indicators (Fornell & Larcker, 1981). Discriminant validity was assessed using the intercorrelations between subfactors, comparing them with the square roots of the constructs' average variance extracted (AVE) (Table 16.3). If the square root of the AVE was higher than the construct's highest correlation with the other constructs, then discriminant validity existed. Based on the results, discriminant validity was shown for all factors with no cross-loadings, apart from the "trust" construct. There is a high cross-loading between the "trust" and "respect" constructs, showing a lack of discriminant validity for these two subfactors.
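This Fornell–Larcker check is easy to verify by hand. The sketch below (plain Python) uses the "trust" loadings from Table 16.2 and that construct's largest correlation from Table 16.3 to reproduce the failed discriminant-validity verdict:

```python
# AVE is the mean squared standardised loading of a construct's indicators.
trust_loadings = [0.757, 0.873, 0.838]        # from Table 16.2
ave_trust = sum(l ** 2 for l in trust_loadings) / len(trust_loadings)
sqrt_ave = ave_trust ** 0.5                   # diagonal entry in Table 16.3 (≈ .82)

# Fornell–Larcker criterion: sqrt(AVE) must exceed the construct's
# largest correlation with any other construct.
corr_trust_respect = 0.87                     # trust–respect correlation, Table 16.3
print(sqrt_ave > corr_trust_respect)          # → False: discriminant validity fails
```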

Table 16.2 The Parameter Estimates for the First-order Reflective Model Using the CB-SEM Procedure

Indicator <--- Factor             Estimate*   AVE   CR    Cronbach alpha   Convergent validity
Q56d <--- LEISURE                 .351        .34   .59   .79              No
Q56h <--- LEISURE                 .586
Q56i <--- LEISURE                 .743
Q5 <--- HOMEWORK_BALANCE          .922        .67   .80   .89              Yes
Q51d <--- HOMEWORK_BALANCE        .700
Q51b <--- WORKHOME_BALANCE        .824        .76   .86   .93              Yes
Q51a <--- WORKHOME_BALANCE        .922
Diagnosis <--- PHYSICAL_HEALTH    .371        .28   .42   .67              No
Q37 <--- PHYSICAL_HEALTH          .652
Q47c <--- MENTAL_HEALTH           .658        .90   .83   .91              Yes
Q47b <--- MENTAL_HEALTH           .850
Q47a <--- MENTAL_HEALTH           .850
Q19c <--- CONTROL                 .827        .56   .89   .86              Yes
Q19b <--- CONTROL                 .719
Q19a <--- CONTROL                 .710
Q15a <--- CONTROL                 .734
Q15b <--- CONTROL                 .797
Q15c <--- CONTROL                 .735
Q30e <--- TRAINING                .545        .26   .59   .80              No
Q30d <--- TRAINING                .543
Q30c <--- TRAINING                .580
Q30a <--- TRAINING                .381
Q42h <--- HARASSMENT              .538        .32   .66   .83              No
Q42d <--- HARASSMENT              .621
Q42c <--- HARASSMENT              .493
Q42a <--- HARASSMENT              .624
Q27a <--- SUPPORT                 .829        .69   .87   .93              Yes
Q27e <--- SUPPORT                 .851
Q27f <--- SUPPORT                 .827
Q22c <--- RESPECT                 .858        .72   .89   .94              Yes
Q22d <--- RESPECT                 .898
Q22e <--- RESPECT                 .804
Q24e <--- TRUST                   .757        .67   .86   .92              Yes
Q24f <--- TRUST                   .873
Q24g <--- TRUST                   .838

Note: * = all loadings are significant at p<0.05. CR = composite reliability (model-based reliability).


Table 16.3 Intercorrelation Analysis and the Square Roots of AVE for All Subfactors†

Subfactor (discriminant validity)    1    2    3    4    5    6    7    8    9    10   11
PERSONAL CAPACITY subfactors (.45)
1. LEISURE (Yes)                    .58
2. HOME-WORK BALANCE (Yes)          .14  .82
3. WORK-HOME BALANCE (Yes)          .25  .19  .87
4. PHYSICAL HEALTH (Yes)            .29  .23  .39  .53
5. MENTAL HEALTH (Yes)              .30  .24  .40  .48  .95
ORG. CAPACITY subfactors
6. CONTROL (Yes)                    .10  .08  .13  .16  .16  .75
7. TRAINING (Yes)                   .06  .04  .08  .09  .10  .16  .51
8. HARASSMENT (Yes)                 .10  .08  .13  .16  .16  .27  .16  .57
9. SUPPORT (Yes)                    .12  .09  .16  .19  .20  .33  .20  .33  .83
10. RESPECT (Yes)                   .18  .14  .24  .29  .30  .50  .30  .49  .61  .85
11. TRUST (No)                      .18  .14  .23  .28  .29  .48  .29  .47  .59  .87  .82

† The square roots of the average variance extracted (AVE) appear on the diagonal.


16.5.1 Measurement Model Evaluation Results for the PLS-SEM Misspecified Reflective Model

To be able to compare the path coefficients of a correctly specified reflective-formative model with a misspecified reflective model, another analysis was run using PLS-SEM for a full reflective model (Figure 16.2). A procedure similar to the one described above was used for model construction in PLS, but modified to produce a reflective model. The results are presented in the following section, with Table 16.4 and Table 16.5 indicating significant paths in all cases.

Figure 16.2. The reflective model of WAS using PLS-SEM


Table 16.4 The Path Coefficient Results for the Second-order Reflective Constructs of the Reflective PLS-SEM Model (n=5000 bootstrap samples)

Path                             Standardised Path Coefficient (M)   T Value   Support*
LEISURE -> PERSONAL              .55                                 18.15     Yes
HOME-WORK -> PERSONAL            .46                                 11.26     Yes
WORK-HOME -> PERSONAL            .65                                 29.32     Yes
PHYSICAL HEALTH -> PERSONAL      .53                                 17.58     Yes
MENTAL HEALTH -> PERSONAL        .80                                 60.53     Yes
CONTROL -> ORGANISATIONAL        .58                                 21.77     Yes
TRAINING -> ORGANISATIONAL       .39                                 13.24     Yes
HARASSMENT -> ORGANISATIONAL     .50                                 17.08     Yes
SUPPORT -> ORGANISATIONAL        .74                                 55.38     Yes
RESPECT -> ORGANISATIONAL        .87                                 125.56    Yes
TRUST -> ORGANISATIONAL          .84                                 99.62     Yes

Note: * p<0.05


Table 16.5 The Path Coefficient Results for the Higher-order Reflective Constructs of the Reflective PLS-SEM Model (n=5000 bootstrap samples)

Path                       Standardised Path Coefficient (M)   T Value   Support*
ORGANISATIONAL -> WAS      0.90                                145.71    Yes
PERSONAL -> WAS            0.71                                38.36     Yes

Note: * p<0.05

As demonstrated in Figure 16.3 and Table 16.5, the WAS reflective-reflective model presents significant regression paths for the higher-order constructs of the reflective WAS. As before, the path coefficient for organisational capacities (β=0.90) represents a stronger path than that for personal capacities (β=0.71). However, both these paths are much stronger than those obtained for the reflective-formative model, suggesting that the risk of Type I errors has been increased as a result of the second-order model misspecification. These paths are also much stronger than those for the reflective-reflective model fitted using CB-SEM, suggesting perhaps that models fitted using PLS-SEM are more sensitive to model misspecification than models fitted using CB-SEM.

Figure 16.3. The reflective WAS development using PLS path modeling.


16.5.2 Measurement Model Evaluation Results for the PLS-SEM Misspecified Full Formative Model

To be able to compare the path coefficients of a correctly specified reflective-formative model with a misspecified formative-formative model, another analysis was carried out using PLS-SEM for a full formative model (Figure 16.5). Steps similar to those used in building the reflective-formative model were adopted, with one main difference: throughout the model building process in PLS, all the indicators at the first and second order, and the repeated measures, were regarded as formative. A snapshot of the process is presented in Figure 16.4.


Step1: Building the repeated measures of personal capacities construct

Step 2: Building the repeated measures of organisational capacities construct

Step 3: Building the repeated measures of WAS

Figure 16.4. The model building process for full formative model of WAS using PLS-SEM.


The path coefficient results for the first-order constructs are presented in Table 16.6. As shown in the table, all path coefficients are significant except for two sub-constructs of personal capacities (leisure and home-work) and one sub-construct of organisational capacities (support). However, the T-values are much smaller than was the case for the correctly specified reflective-formative model, suggesting the occurrence of Type II errors as a result of the misspecification of the first-order model.

Table 16.6 The Path Coefficient Results for the Second-order Reflective Constructs of the Full Formative PLS-SEM Model (n=5000 bootstrap samples)

Path                             Standardised Path Coefficient (M)   T Value   Support
LEISURE -> PERSONAL              -0.02                               0.40      No†
HOME-WORK -> PERSONAL            -0.01                               0.16      No†
WORK-HOME -> PERSONAL            0.70                                11.94     Yes
PHYSICAL HEALTH -> PERSONAL      0.20                                2.72      Yes
MENTAL HEALTH -> PERSONAL        0.40                                5.23      Yes
CONTROL -> ORGANISATIONAL        0.24                                3.43      Yes
TRAINING -> ORGANISATIONAL       -0.15                               2.08      Yes
HARASSMENT -> ORGANISATIONAL     0.36                                4.97      Yes
SUPPORT -> ORGANISATIONAL        0.04                                0.58      No†
RESPECT -> ORGANISATIONAL        0.33                                2.87      Yes
TRUST -> ORGANISATIONAL          0.36                                3.12      Yes

Note: † p>0.05, hence not significant.


Figure 16.5. The full formative model of WAS using PLS-SEM

As shown in Figure 16.5 and Table 16.7, the WAS full formative-formative model demonstrates significant regression paths for the higher-order constructs. The path coefficients for organisational capacities (β=0.68) and for personal capacities (β=0.51) are similar to those for the reflective-formative model.

Table 16.7 The Path Coefficient Results for the Higher-order Reflective Constructs (n=5000 bootstrap samples)

Path                       Standardised Path Coefficient (M)   T Value   Support*
ORGANISATIONAL -> WAS      0.68                                75.29     Yes
PERSONAL -> WAS            0.51                                50.60     Yes

Note: * p<0.05


16.6 DEFINITIONS OF IMPORTANT TERMS

Measure. A measure in this study is defined as "a quantified record, or datum, taken as an empirical analogy to a construct" (Edwards and Bagozzi, 2000, p. 156). In this definition, a measure is not a tool for data gathering; instead it is considered to be an observed score gathered through self-report, interview, observation or some other means (e.g. Messick, 1995).

Construct. A construct is a conceptual term used to describe a phenomenon of theoretical interest (Cronbach & Meehl, 1955; Nunnally, 1978; Schwab, 1980, as cited by Edwards & Bagozzi, 2000). In this definition, a construct refers to a real phenomenon (observable or unobservable) in an abstract sense. As acknowledged by Edwards and Bagozzi (2000), these phenomena involve some degree of measurement error and must be viewed through an imperfect epistemological lens.

Measurement error. Measurement error is defined "as that part of an observed variable that is not 'determined by' a construct" (Lord and Novick, 1968, p. 531).

Reflective and formative models. Reflective and formative models are also known as effect and cause indicator models respectively (Blalock, 1971). A reflective model considers effect indicators and a formative model considers cause indicators. Detailed characteristics of both models are discussed in Chapter 2.

Covariate-dependent reliability. Bentler (in press; personal communications, 2013) defines (group) covariate-dependent reliability as “… a measure of the group differences on the trait being measured relative to total variation, while covariate-free reliability is a measure of the reliable individual difference variance freed from any mean differences due to the covariate(s)”.


Higher-order model. In multidimensional measurement constructs, a minimum of two levels of constructs exist: the first-order level with indicators and the second-order (higher- order) level with first-order constructs (Jarvis et al., 2003). Such models are known as hierarchical (higher-order or second-order in this example) models.

Bifactor model. Among the higher-order models, a bifactor model is one in which all latent variables are modelled as first-order constructs, with the first-order factors nested within a general factor (Gignac, 2007; Gustafsson & Balke, 1993; Holzinger & Swineford, 1937).

Reliability rho (ρ11). Within the setting of model-based reliability, Jöreskog's (1971) analysis of congeneric measures introduced the reliability coefficient ρ11. This coefficient is perhaps one of the earliest proposals for assessing the reliability of a one-factor model without assuming equal item reliabilities (Gerbing & Anderson, 1988).

Omega total reliability coefficient (ωt). Similar to reliability rho (ρ11), omega total (ωt) estimates the combined proportion of true-score variance from the general factor and any subscales (McDonald, 1978).

Omega hierarchical reliability coefficient (ωh). Omega hierarchical (ωh) estimates the proportion of reliable variance in a hierarchical model that is attributable to the general factor (Revelle & Zinbarg, 2008; Zinbarg, Revelle, Yovel, & Li, 2005).

Omega subscale reliability coefficient (ωs). Omega subscale (ωs) determines the degree of reliability of the subscale scores of a bifactor model after controlling for the reliable variance generated by the general factor (Reise et al., 2012).
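Under the usual standardised bifactor assumptions, these omega coefficients reduce to simple ratios of squared loading sums. The sketch below (plain Python/NumPy; the loading matrix is hypothetical, with each item loading on the general factor and one of two group factors) computes ωt, ωh and an ωs for the first subscale:

```python
import numpy as np

# Hypothetical standardised bifactor loadings: six items, a general factor (g)
# and two group factors (s1 over items 1-3, s2 over items 4-6).
g  = np.array([0.6, 0.6, 0.6, 0.5, 0.5, 0.5])
s1 = np.array([0.4, 0.4, 0.4, 0.0, 0.0, 0.0])
s2 = np.array([0.0, 0.0, 0.0, 0.5, 0.5, 0.5])

error = 1 - g**2 - s1**2 - s2**2        # standardised unique variances
var_total = g.sum()**2 + s1.sum()**2 + s2.sum()**2 + error.sum()

# Omega total: all modelled (general + group) variance over total variance.
omega_t = (g.sum()**2 + s1.sum()**2 + s2.sum()**2) / var_total
# Omega hierarchical: general-factor variance only, over total variance.
omega_h = g.sum()**2 / var_total

# Omega subscale for the first group factor, computed on its own items
# after setting aside the reliable variance due to the general factor.
items1 = slice(0, 3)
var_sub = g[items1].sum()**2 + s1[items1].sum()**2 + error[items1].sum()
omega_s1 = s1[items1].sum()**2 / var_sub

print(omega_t, omega_h, omega_s1)
```

As expected, ωh is smaller than ωt because it credits only the general factor with reliable variance.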


Common method bias. Common method bias (CMB) is a type of error inherent in a measure, attributed to the particular method used for data collection (Bagozzi and Yi, 1990).

Common method variance. Common method variance (CMV) is a major type of systematic measurement error (Bagozzi & Yi, 1990). It represents the variance in measurement generated by the specific instrument used to collect the data (Spector, 1987).

Type I error. A Type I error is a false positive that occurs when a path is declared significant when it is not really significant (rejecting the null hypothesis when it is true) (Jarvis et al., 2003; MacKenzie, Podsakoff, & Jarvis, 2005).

Type II error. A Type II error is a false negative: declaring a path nonsignificant when it is really significant (failing to reject the null hypothesis when it is not true) (Jarvis et al., 2003; MacKenzie, Podsakoff, & Jarvis, 2005). MacKenzie et al. (2005) reported that the primary cause of Type II errors is the misspecification of both constructs (i.e. exogenous and endogenous) as reflective instead of formative, resulting in a higher standard error for the parameter being reported.

Convergent validity of constructs. This type of validity evaluation focuses on the individual loadings of each indicator on its own construct (MacKenzie, Podsakoff & Podsakoff, 2011). Correlations of the variables with their related factors reflect the convergent validity of the tool. If the t-ratios for the loadings are significant and the factor loadings are above the recommended level (0.40), the convergent validity of the scale is supported (Hair et al., 2014).

Analysis of covariance structures (COVS) in invariance testing. This invariance procedure compares the covariance structure of the model parameters (e.g. factor loadings, measured-variable loadings, variances/covariances of errors or factor residuals) across groups (Byrne, 2009).

The invariance analysis of mean and covariance structures (MACS). This procedure refers to invariance testing of construct means. Once the invariance of the covariance structure has been established, the invariance of construct means can be evaluated using MACS (Cheung & Rensvold, 2002; Widaman & Reise, 1997). MACS was first introduced by Sörbom (1974) for the cross-validation of SEM models.


16.7 THE WOAQ AND ITS SUBFACTOR ITEMS

Item number / Factor

Quality of physical environment
1 - Facilities for taking breaks
2 - Work surroundings
4 - Exposure to physical danger
9 - Safety at work
18 - The equipment/IT that you use
20 - Work station/work space

Quality of relationship with colleagues
10 - Your relationship with your co-workers (socially)
28 - How well you work with your co-workers (as a team)

Quality of relationship with management
3 - Clear roles and responsibilities
5 - Support from line manager/supervisor
7 - Feedback on your performance
11 - Appreciation of efforts from line managers/supervisors
16 - Senior management attitudes
17 - Clear reporting line(s)
22 - Communication with line manager/supervisor
26 - Status/recognition in the workplace
27 - Clear workplace objectives, values, procedures

Reward and recognition
12 - Consultation about changes in your job
13 - Adequate training for your current job
14 - Variety of different tasks
21 - Opportunities for promotion
23 - Opportunities for learning new skills
24 - Flexibility of working hours
25 - Opportunities to use your skills

Workload issues
6 - Pace of work
8 - Your work load
15 - Impact of family/social life on work
19 - Impact of your work on family/social life


16.8 The R-WAS questionnaire

The complete R-WAS questionnaire (copied with permission from Prof Philip Taylor; from the Redesigning Work for an Ageing Society research program conducted by the Business, Work & Ageing Centre for Research (BWA) at Swinburne University of Technology, 2009).


16.9 List of items used in construction of WAS

The list of items used in the construction of WAS, copied with permission from the Redesigning Work for an Ageing Society research program conducted by the Business, Work and Ageing Centre for Research (BWA) at Swinburne University of Technology (2009).


16.10 Ethics clearance

a) Letter of approval

To: Dr Denny Meyer, FLSS/Ms Leila Karimi

[BC: Ms Leila Karimi]

Dear Dr Meyer,

SUHREC Project 2011/175 The effects of common method variance on structural equation modeling

Dr Denny Meyer, FLSS/Ms Leila Karimi

Approved Duration: 22/09/2011 To 28/02/2014

I refer to the ethical review of the above project protocol undertaken on behalf of Swinburne's Human Research Ethics Committee (SUHREC) by SUHREC Subcommittee (SHESC2) at a meeting held on 5 September 2011. Your response to the review as e-mailed on 16 September 2011 was reviewed by a SHESC2 delegate.

I am pleased to advise that, as submitted to date, the project has approval to proceed in line with standard on-going ethics clearance conditions here outlined.

- All human research activity undertaken under Swinburne auspices must conform to Swinburne and external regulatory standards, including the National Statement on Ethical Conduct in Human Research and with respect to secure data use, retention and disposal.

- The named Swinburne Chief Investigator/Supervisor remains responsible for any personnel appointed to or associated with the project being made aware of ethics clearance conditions, including research and consent procedures or instruments approved. Any change in chief investigator/supervisor requires timely notification and SUHREC endorsement.

- The above project has been approved as submitted for ethical review by or on behalf of SUHREC. Amendments to approved procedures or instruments ordinarily require prior ethical appraisal/clearance. SUHREC must be notified immediately or as soon as possible thereafter of (a) any serious or unexpected adverse effects on participants and any redress measures; (b) proposed changes in protocols; and (c) unforeseen events which might affect continued ethical acceptability of the project.

- At a minimum, an annual report on the progress of the project is required, as well as a report at the conclusion (or abandonment) of the project.

- A duly authorised external or internal audit of the project may be undertaken at any time.

Please contact me if you have any queries about on-going ethics clearance. The SUHREC project number should be quoted in communication. Chief Investigators/Supervisors and Student Researchers should retain a copy of this e-mail as part of project record-keeping.

Best wishes for the project.

Yours sincerely

XXXX

Secretary, SHESC2

*******************************************


XXXX

Administrative Officer (Research Ethics)

Swinburne Research (H68)

Swinburne University of Technology

P O Box 218

HAWTHORN VIC 3122

Tel +61 3 9214 8468


MEMORANDUM

RESEARCH SERVICES

To: Dr Leila Karimi, School of Public Health, Faculty of Health Sciences

From: Secretary, La Trobe University Human Ethics Committee

Subject: Review of Human Ethics Committee Application No. 11-054

Title: The effects of common method variance on structural equation modeling

Thank you for your recent correspondence in relation to the research project referred to above. The project has been assessed as complying with the National Statement on Ethical Conduct in Human Research. I am pleased to advise that your project has been granted ethics approval and you may commence the study.

The project has been approved from the date of this letter until 31 December 2012.

Please note that your application has been reviewed by a sub-committee of the University Human Ethics Committee (UHEC) to facilitate a decision about the study before the next Committee meeting. This decision will require ratification by the full UHEC at its next meeting and the UHEC reserves the right to alter conditions of approval or withdraw approval. You will be notified if the approval status of your project changes. The UHEC is a fully constituted Ethics Committee in accordance with the National Statement on Ethical Conduct in Research Involving Humans- March 2007 under Section 5.1.29.

The following standard conditions apply to your project:

• Limit of Approval. Approval is limited strictly to the research proposal as submitted in your application while taking into account any additional conditions advised by the UHEC.

• Variation to Project. Any subsequent variations or modifications you wish to make to your project must be formally notified to the UHEC for approval in advance of these modifications being introduced into the project. This can be done using the appropriate form: Ethics - Application for Modification to Project, which is available on the Research Services website at http://www.latrobe.edu.au/research-services/ethics/HEC_human.htm. If the UHEC considers that the proposed changes are significant, you may be required to submit a new application form for approval of the revised project.

• Adverse Events. If any unforeseen or adverse events occur, including adverse effects on participants, during the course of the project which may affect the ethical acceptability of the project, the Chief Investigator must immediately notify the UHEC Secretary on telephone (03) 9479 1443. Any complaints about the project received by the researchers must also be referred immediately to the UHEC Secretary.


• Withdrawal of Project. If you decide to discontinue your research before its planned completion, you must advise the UHEC and clarify the circumstances.

• Annual Progress Reports. If your project continues for more than 12 months, you are required to submit an Ethics - Progress/Final Report Form annually, on or just prior to 12 February. The form is available on the Research Services website (see above address). Failure to submit a Progress Report will mean approval for this project will lapse. An audit may be conducted by the UHEC at any time.

• Final Report. A Final Report (see above address) is required within six months of the completion of the project or by 30 June 2013.

If you have any queries on the information above or require further clarification please contact me through Research Services on telephone (03) 9479-1443, or e-mail at: [email protected].

On behalf of the University Human Ethics Committee, best wishes with your research!

XXXX

Administrative Officer (Research Ethics) University Human Ethics Committee

Research Compliance Unit / Research Services

La Trobe University Bundoora, Victoria 3086

P: (03) 9479 - 1443 / F: (03) 9479 - 1464, http://www.latrobe.edu.au/research-services/ethics/HEC_human.htm


16.11 A List of Articles Included in the Review

No. | Authors | Title | Journal / year / issue / pages
1. John E. Mathieu, Lucy L. Gilson, and Thomas M. Ruddy. Empowerment and Team Effectiveness: An Empirical Test of an Integrated Model. Journal of Applied Psychology, 2006, Vol. 91, No. 1, 97–108.
2. Yaping Gong and Jinyan Fan. Longitudinal Examination of the Role of Goal Orientation in Cross-Cultural Adjustment. Journal of Applied Psychology, 2006, Vol. 91, No. 1, 176–184.
3. Christopher C. Rosen, Paul E. Levy, and Rosalie J. Hall. Placing Perceptions of Politics in the Context of the Feedback Environment, Employee Attitudes, and Job Performance. Journal of Applied Psychology, 2006, Vol. 91, No. 1, 211–220.
4. Bradley J. Alge, Gary A. Ballinger, Subrahmaniam Tangirala, and James L. Oakley. Information Privacy in Organizations: Empowering Creative and Extrarole Performance. Journal of Applied Psychology, 2006, Vol. 91, No. 1, 221–232.
5. Sabine Sonnentag and Fred R. H. Zijlstra. Job Characteristics and Off-Job Activities as Predictors of Need for Recovery, Well-Being, and Fatigue. Journal of Applied Psychology, 2006, Vol. 91, No. 2, 330–350.
6. Kimberly A. Eddleston, John F. Veiga, and Gary N. Powell. Explaining Sex Differences in Managerial Career Satisfier Preferences: The Role of Gender Self-Schema. Journal of Applied Psychology, 2006, Vol. 91, No. 2, 437–445.
7. Sharon K. Parker and Helen M. Williams. Modeling the Antecedents of Proactive Behavior at Work. Journal of Applied Psychology, 2006, Vol. 91, No. 3, 636–652.
8. J. Craig Wallace, Eric Popp, and Scott Mondore. Safety Climate as a Mediator Between Foundation Climates and Occupational Accidents: A Group-Level Investigation. Journal of Applied Psychology, 2006, Vol. 91, No. 3, 681–688.
9. Douglas J. Brown, Richard T. Cober, Kevin Kane, Paul E. Levy, and Jarrett Shalhoop. Proactive Personality and the Successful Job Search: A Field Investigation With College Graduates. Journal of Applied Psychology, 2006, Vol. 91, No. 3, 717–726.
10. Dishan Kamdar, Daniel J. McAllister, and Daniel B. Turban. “All in a Day’s Work”: How Follower Individual Differences and Justice Perceptions Predict OCB Role Definitions and Behavior. Journal of Applied Psychology, 2006, Vol. 91, No. 4, 841–855.
11. Christine L. Jackson, Jason A. Colquitt, Michael J. Wesson, and Cindy P. Zapata-Phelan. Psychological Collectivism: A Measurement Validation and Linkage to Group Member Performance. Journal of Applied Psychology, 2006, Vol. 91, No. 4, 884–899.
13. Debra A. Major, Jonathan E. Turner, and Thomas D. Fletcher. Linking Proactive Personality and the Big Five to Motivation to Learn and Development Activity. Journal of Applied Psychology, 2006, Vol. 91, No. 4, 927–935.


14. Vivien K. G. Lim and Qing Si Sng. Does Parental Job Insecurity Matter? Money Anxiety, Money Motives, and Work Motivation. Journal of Applied Psychology, 2006, Vol. 91, No. 5, 1078–1087.
15. Alannah E. Rafferty and Mark A. Griffin. Perceptions of Organizational Change: A Stress and Coping Perspective. Journal of Applied Psychology, 2006, Vol. 91, No. 5, 1154–1162.
16. Frederick P. Morgeson and Stephen E. Humphrey. The Work Design Questionnaire (WDQ): Developing and Validating a Comprehensive Measure for Assessing Job Design and the Nature of Work. Journal of Applied Psychology, 2006, Vol. 91, No. 6, 1321–1339.
17. Laura M. Graves, Patricia J. Ohlott, and Marian N. Ruderman. Commitment to Family Roles: Effects on Managers’ Attitudes and Performance. Journal of Applied Psychology, 2007, Vol. 92, No. 1, 44–56.
18. Jonathon R. B. Halbesleben and Wm. Matthew Bowler. Emotional Exhaustion and Job Performance: The Mediating Role of Motivation. Journal of Applied Psychology, 2007, Vol. 92, No. 1, 93–106.

19. Samuel Aryee, Zhen Xiong Chen, Li-Yun Sun, and Yaw A. Debrah. Antecedents and Outcomes of Abusive Supervision: Test of a Trickle-Down Model. Journal of Applied Psychology, 2007, Vol. 92, No. 1, 191–201.
21. Gilad Chen, Bradley L. Kirkman, Ruth Kanfer, Don Allen, and Benson Rosen. A Multilevel Study of Leadership, Empowerment, and Performance in Teams. Journal of Applied Psychology, 2007, Vol. 92, No. 2, 331–346.
22. Mo Wang. Profiling Retirees in the Retirement Transition and Adjustment Process: Examining the Longitudinal Change Patterns of Retirees’ Psychological Well-Being. Journal of Applied Psychology, 2007, Vol. 92, No. 2, 455–474.
23. Hui Liao. Do It Right This Time: The Role of Employee Service Recovery Performance in Customer-Perceived Justice and Customer Loyalty After Service Failures. Journal of Applied Psychology, 2007, Vol. 92, No. 2, 475–489.
24. Adam B. Butler. Job Characteristics and College Performance and Attitudes: A Model of Work–School Conflict and Facilitation. Journal of Applied Psychology, 2007, Vol. 92, No. 2, 500–510.
25. Jo Silvester, Fiona Patterson, Anna Koczwara, and Eamonn Ferguson. “Trust Me. . .”: Psychological and Behavioral Predictors of Perceived Physician Empathy. Journal of Applied Psychology, 2007, Vol. 92, No. 2, 519–527.
26. Richard D. Arvey, Zhen Zhang, Bruce J. Avolio, and Robert F. Krueger. Developmental and Genetic Determinants of Leadership Role Occupancy Among Women. Journal of Applied Psychology, 2007, Vol. 92, No. 3, 693–706.
27. Seokhwa Yun, Riki Takeuchi, and Wei Liu. Employee Self-Enhancement Motives and Job Performance Behaviors: Investigating the Moderating Effects of Employee Role Ambiguity and Managerial Perceptions of Employee Commitment. Journal of Applied Psychology, 2007, Vol. 92, No. 3, 745–756.

28. Hilary J. Gettman and Michele J. Gelfand. When the Customer Shouldn’t Be King: Antecedents and Consequences of Sexual Harassment by Clients and Customers. Journal of Applied Psychology, 2007, Vol. 92, No. 3, 757–770.

29. James M. Diefendorff and Kajal Mehta. The Relations of Motivational Traits With Workplace Deviance. Journal of Applied Psychology, 2007, Vol. 92, No. 4, 967–977.


30. Craig D. Crossley, Rebecca J. Bennett, Steve M. Jex, and Jennifer L. Burnfield. Development of a Global Measure of Job Embeddedness and Integration Into a Traditional Model of Voluntary Turnover. Journal of Applied Psychology, 2007, Vol. 92, No. 4, 1031–1042.

31. Michael Frese, Harry Garst, and Doris Fay. Making Things Happen: Reciprocal Relationships Between Work Characteristics and Personal Initiative in a Four-Wave Longitudinal Structural Equation Model. Journal of Applied Psychology, 2007, Vol. 92, No. 4, 1084–1102.
32. Marie S. Mitchell and Maureen L. Ambrose. Abusive Supervision and Workplace Deviance and the Moderating Effects of Negative Reciprocity Beliefs. Journal of Applied Psychology, 2007, Vol. 92, No. 4, 1159–1168.
33. Christian Vandenberghe, Kathleen Bentein, Richard Michon, Jean-Charles Chebat, Michel Tremblay, and Jean-François Fils. An Examination of the Role of Perceived Support and Employee Commitment in Employee–Customer Encounters. Journal of Applied Psychology, 2007, Vol. 92, No. 4, 1177–1187.
34. Daniel J. McAllister, Dishan Kamdar, Elizabeth Wolfe Morrison, and Daniel B. Turban. Disentangling Role Perceptions: How Perceived Role Breadth, Discretion, Instrumentality, and Efficacy Relate to Helping and Taking Charge. Journal of Applied Psychology, 2007, Vol. 92, No. 5, 1200–1211.
35. Kathi Miner-Rubino and Lilia M. Cortina. Beyond Targets: Consequences of Vicarious Exposure to Misogyny at Work. Journal of Applied Psychology, 2007, Vol. 92, No. 5, 1254–1269.
36. Dishan Kamdar and Linn Van Dyne. The Joint Effects of Personality and Workplace Social Exchange Relationships in Predicting Task Performance and Citizenship Performance. Journal of Applied Psychology, 2007, Vol. 92, No. 5, 1286–1298.
37. Mo Wang and Riki Takeuchi. The Role of Goal Orientation During Expatriation: A Cross-Sectional and Longitudinal Investigation. Journal of Applied Psychology, 2007, Vol. 92, No. 5, 1437–1445.

38. Christine A. Sprigg, Christopher B. Stride, Toby D. Wall, David J. Holman, and Phoebe R. Smith. Work Characteristics, Musculoskeletal Disorders, and the Mediating Role of Psychological Strain: A Study of Call Center Employees. Journal of Applied Psychology, 2007, Vol. 92, No. 5, 1456–1466.
40. Michael Frese, Stefanie I. Krauss, Nina Keith, Susanne Escher, Rafal Grabarkiewicz, Siv Tonje Luneng, Constanze Heers, Jens Unger, and Christian Friedrich. Business Owners’ Action Planning and Its Relationship to Business Success in Three African Countries. Journal of Applied Psychology, 2007, Vol. 92, No. 6, 1481–1498.
42. Wei-Chi Tsai, Chien-Cheng Chen, and Hui-Lu Liu. Test of a Model Linking Employee Positive Moods and Task Performance. Journal of Applied Psychology, 2007, Vol. 92, No. 6, 1570–1583.

43. Brent A. Scott, Jason A. Colquitt, and Cindy P. Zapata-Phelan. Justice as a Dependent Variable: Subordinate Charisma as a Predictor of Interpersonal and Informational Justice Perceptions. Journal of Applied Psychology, 2007, Vol. 92, No. 6, 1597–1609.
45. Julia Levashina and Michael A. Campion. Measuring Faking in the Employment Interview: Development and Validation of an Interview Faking Behavior Scale. Journal of Applied Psychology, 2007, Vol. 92, No. 6, 1638–1656.


46. David G. Allen, Raj V. Mahto, and Robert F. Otondo. Web-Based Recruitment: Effects of Information, Organizational Brand, and Attitudes Toward a Web Site on Applicant Attraction. Journal of Applied Psychology, 2007, Vol. 92, No. 6, 1696–1708.

48. Zhi-Xue Zhang, Paul S. Hempel, Yu-Lan Han, and Dean Tjosvold. Transactive Memory System Links Work Team Characteristics and Performance. Journal of Applied Psychology, 2007, Vol. 92, No. 6, 1722–1730.
50. Henry Moon, Dishan Kamdar, David M. Mayer, and Riki Takeuchi. Me or We? The Role of Personality and Justice as Other-Centered Antecedents to Innovative Citizenship Behaviors Within Organizations. Journal of Applied Psychology, 2008, Vol. 93, No. 1, 84–94.
51. Sandy Lim, Lilia M. Cortina, and Vicki J. Magley. Personal and Workgroup Incivility: Impact on Work and Health Outcomes. Journal of Applied Psychology, 2008, Vol. 93, No. 1, 95–107.
52. Bradford S. Bell and Steve W. J. Kozlowski. Active Learning: Effects of Core Training Design Elements on Self-Regulatory Processes, Learning, and Adaptability. Journal of Applied Psychology, 2008, Vol. 93, No. 2, 296–316.
53. Lillian T. Eby, Jaime R. Durley, Sarah C. Evans, and Belle Rose Ragins. Mentors’ Perceptions of Negative Mentoring Experiences: Scale Development and Nomological Validation. Journal of Applied Psychology, 2008, Vol. 93, No. 2, 358–373.

54. James R. Detert, Linda Klebe Treviño, and Vicki L. Sweitzer. Moral Disengagement in Ethical Decision Making: A Study of Antecedents and Outcomes. Journal of Applied Psychology, 2008, Vol. 93, No. 2, 374–391.
56. Severin Hornung, Denise M. Rousseau, and Jürgen Glaser. Creating Flexible Work Arrangements Through Idiosyncratic Deals. Journal of Applied Psychology, 2008, Vol. 93, No. 3, 655–664.
57. Dov Zohar and Orly Tenne-Gazit. Transformational Leadership and Group Interaction as Climate Antecedents: A Social Network Analysis. Journal of Applied Psychology, 2008, Vol. 93, No. 4, 744–757.
58. Mahesh Subramony, Nicole Krause, Jacqueline Norton, and Gary N. Burns. The Relationship Between Human Resource Investments and Organizational Performance: A Firm-Level Examination of Equilibrium Theory. Journal of Applied Psychology, 2008, Vol. 93, No. 4, 778–788.

60. Arnold B. Bakker, Evangelia Demerouti, and Maureen F. Dollard. How Job Demands Affect Partners’ Experience of Exhaustion: Integrating Work–Family Conflict and Crossover Theory. Journal of Applied Psychology, 2008, Vol. 93, No. 4, 901–911.

62. Shaul Oreg, Mahmut Bayazıt, Maria Armenakis, Rasa Barkauskiene, Nikos Bozionelos, Yuka Fujimoto, Luis Gonzalez, Jian Han, Martina Hřebíčková, Nerina Jimmieson, Jana Kordacova, Hitoshi Mitsuhashi, Boris Mlačić, Ivana Ferić, Marina Kotrla Topic, Sandra Ohly, Per Øystein Saksvik, Hilde Hetland, Ingvild Saksvik, and Karen van Dam. Dispositional Resistance to Change: Measurement Equivalence and the Link to Personal Values Across 17 Nations. Journal of Applied Psychology, 2008, Vol. 93, No. 4, 935–944.


63. Greg L. Stewart, Susan L. Dustin, Murray R. Barrick, and Todd C. Darnold. Exploring the Handshake in Employment Interviews. Journal of Applied Psychology, 2008, Vol. 93, No. 5, 1139–1146.
64. David J. Henderson, Sandy J. Wayne, Lynn M. Shore, William H. Bommer, and Lois E. Tetrick. Leader–Member Exchange, Differentiation, and Psychological Contract Fulfillment: A Multilevel Examination. Journal of Applied Psychology, 2008, Vol. 93, No. 6, 1208–1219.
65. Fiona A. White, Margaret A. Charles, and Jacqueline K. Nelson. The Role of Persuasive Arguments in Changing Affirmative Action Attitudes and Expressed Behavior in Higher Education. Journal of Applied Psychology, 2008, Vol. 93, No. 6, 1271–1286.
66. Daniel P. Skarlicki, Danielle D. van Jaarsveld, and David D. Walker. Getting Even for Customer Mistreatment: The Role of Moral Identity in the Relationship Between Customer Interpersonal Injustice and Employee Sabotage. Journal of Applied Psychology, 2008, Vol. 93, No. 6, 1335–1347.
67. Samantha D. Montes and P. Gregory Irving. Disentangling the Effects of Promised and Delivered Inducements: Relational and Transactional Contract Elements and the Mediating Role of Trust. Journal of Applied Psychology, 2008, Vol. 93, No. 6, 1367–1381.
68. Brent A. Scott and Timothy A. Judge. The Popularity Contest at Work: Who Wins, Why, and What Do They Receive? Journal of Applied Psychology, 2009, Vol. 94, No. 1, 20–33.
69. Eric Kearney and Diether Gebert. Managing Diversity and Enhancing Team Outcomes: The Promise of Transformational Leadership. Journal of Applied Psychology, 2009, Vol. 94, No. 1, 77–89.

70. Hans-Georg Wolff and Klaus Moser. Effects of Networking on Career Success: A Longitudinal Study. Journal of Applied Psychology, 2009, Vol. 94, No. 1, 196–206.

71. Susan M. Stewart, Mark N. Bing, H. Kristl Davison, David J. Woehr, and Michael D. McIntyre. In the Eyes of the Beholder: A Non-Self-Report Measure of Workplace Deviance. Journal of Applied Psychology, 2009, Vol. 94, No. 1, 207–215.
72. Jin Nam Choi and Jae Yoon Chang. Innovation Implementation in the Public Sector: An Integration of Institutional and Collective Dynamics. Journal of Applied Psychology, 2009, Vol. 94, No. 1, 245–253.
73. Yaping Gong, Kenneth S. Law, Song Chang, and Katherine R. Xin. Human Resources Management and Firm Performance: The Differential Role of Managerial Affective and Continuance Commitment. Journal of Applied Psychology, 2009, Vol. 94, No. 1, 263–275.
74. Peter W. Hom, Anne S. Tsui, Joshua B. Wu, Thomas W. Lee, Ann Yan Zhang, Ping Ping Fu, and Lan Li. Explaining Employment Relationships With Social Exchange and Job Embeddedness. Journal of Applied Psychology, 2009, Vol. 94, No. 2, 277–297.
75. Greet Van Hoye and Filip Lievens. Tapping the Grapevine: A Closer Look at Word-of-Mouth as a Recruitment Source. Journal of Applied Psychology, 2009, Vol. 94, No. 2, 341–352.
76. Tove Helland Hammer, Mahmut Bayazit, and David L. Wazeter. Union Leadership and Member Attitudes: A Multi-Level Analysis. Journal of Applied Psychology, 2009, Vol. 94, No. 2, 392–410.


77. Steven L. Blader and Tom R. Tyler. Testing and Extending the Group Engagement Model: Linkages Between Social Identity, Procedural Justice, Economic Outcomes, and Extrarole Behavior. Journal of Applied Psychology, 2009, Vol. 94, No. 2, 445–464.
78. Maureen L. Ambrose and Marshall Schminke. The Role of Overall Justice Judgments in Organizational Justice Research: A Test of Mediation. Journal of Applied Psychology, 2009, Vol. 94, No. 2, 491–500.
79. Lei Lai, Denise M. Rousseau, and Klarissa Ting Ting Chang. Idiosyncratic Deals: Coworkers as Interested Third Parties. Journal of Applied Psychology, 2009, Vol. 94, No. 2, 547–556.
80. Gregory M. Hurtz and Kevin J. Williams. Attitudinal and Motivational Antecedents of Participation in Voluntary Employee Development Activities. Journal of Applied Psychology, 2009, Vol. 94, No. 3, 635–653.
81. Chad H. Van Iddekinge, Gerald R. Ferris, Alexa A. Perryman, Fred R. Blass, and Thomas D. Heetderks. Effects of Selection and Training on Unit-Level Performance Over Time: A Latent Growth Modeling Approach. Journal of Applied Psychology, 2009, Vol. 94, No. 4, 829–843.
82. D. Scott DeRue and Ned Wellman. Developing Leaders via Experience: The Role of Developmental Challenge, Learning Orientation, and Feedback Availability. Journal of Applied Psychology, 2009, Vol. 94, No. 4, 859–875.
83. Remus Ilies, Ingrid Smithey Fulmer, Matthias Spitzmuller, and Michael D. Johnson. Personality and Citizenship Behavior: The Mediating Role of Job Satisfaction. Journal of Applied Psychology, 2009, Vol. 94, No. 4, 945–959.
84. Karin A. Orvis, Sandra L. Fisher, and Michael E. Wasserman. Power to the People: Using Learner Control to Improve Trainee Reactions and Learning in Web-Based Instructional Environments. Journal of Applied Psychology, 2009, Vol. 94, No. 4, 960–971.
85. Jessica B. Rodell and Jason A. Colquitt. Looking Ahead in Times of Uncertainty: The Role of Anticipatory Justice in an Organizational Change Context. Journal of Applied Psychology, 2009, Vol. 94, No. 4, 989–1002.
86. Brian C. Holtz and Crystal M. Harold. Fair Today, Fair Tomorrow? A Longitudinal Investigation of Overall Justice Perceptions. Journal of Applied Psychology, 2009, Vol. 94, No. 5, 1185–1199.
87. Jerry W. Grizzle, Alex R. Zablah, Tom J. Brown, John C. Mowen, and James M. Lee. Employee Customer Orientation in Context: How the Environment Moderates the Influence of Customer Orientation on Performance Outcomes. Journal of Applied Psychology, 2009, Vol. 94, No. 5.
88. David R. Hekman, H. Kevin Steensma, Gregory A. Bigley, and James F. Hereford. Effects of Organizational and Professional Identification on the Relationship Between Administrators’ Social Influence and Professional Employees’ Adoption of New Work Behavior. Journal of Applied Psychology, 2009, Vol. 94, No. 5, 1325–1335.
90. Martha C. Andrews, K. Michele Kacmar, and Kenneth J. Harris. Got Political Skill? The Impact of Justice on the Importance of Political Skill for Job Performance. Journal of Applied Psychology, 2009, Vol. 94, No. 6, 1427–1437.
91. Abraham Carmeli, Batia Ben-Hador, David A. Waldman, and Deborah E. Rupp. How Leaders Cultivate Social Capital and Nurture Employee Vigor: Implications for Job Performance. Journal of Applied Psychology, 2009, Vol. 94, No. 6, 1553–1561.


92. Timothy A. Judge, Remus Ilies, and Nikolaos Dimotakis. Are Health and Happiness the Product of Wisdom? The Relationship of General Mental Ability to Educational and Occupational Attainment, Health, and Well-Being. Journal of Applied Psychology, 2010, Vol. 95, No. 3, 454–468.

93. Leigh Anne Liu, Chei Hwee Chua, and Günter K. Stahl. Quality of Communication Experience: Definition, Measurement, and Implications for Intercultural Negotiations. Journal of Applied Psychology, 2010, Vol. 95, No. 3, 469–487.

94. Richard G. Netemeyer, James G. Maxham III, and Donald R. Lichtenstein. Store Manager Performance and Satisfaction: Effects on Store Employee Performance and Satisfaction, Store Customer Satisfaction, and Store Customer Spending Growth. Journal of Applied Psychology, 2010, Vol. 95, No. 3, 530–545.
95. Tracy D. Hecht and Julie M. McCarthy. Coping With Employee, Family, and Student Roles: Evidence of Dispositional Conflict and Facilitation Tendencies. Journal of Applied Psychology, 2010, Vol. 95, No. 4, 631–647.
96. Thomas W. H. Ng, Daniel C. Feldman, and Simon S. K. Lam. Psychological Contract Breaches, Organizational Commitment, and Innovation-Related Behaviors: A Latent Growth Modeling Approach. Journal of Applied Psychology, 2010, Vol. 95, No. 4, 744–751.

97. Elizabeth E. Umphress, John B. Bingham, and Marie S. Mitchell. Unethical Behavior in the Name of the Company: The Moderating Effect of Organizational Identification and Positive Reciprocity Beliefs on Unethical Pro-Organizational Behavior. Journal of Applied Psychology, 2010, Vol. 95, No. 4, 769–780.

98. Robert Eisenberger, Gokhan Karagonlar, Florence Stinglhamber, Pedro Neves, Thomas E. Becker, M. Gloria González-Morales, and Meta Steiger-Mueller. Leader–Member Exchange and Affective Organizational Commitment: The Contribution of Supervisor’s Organizational Embodiment. Journal of Applied Psychology, 2010, Vol. 95, No. 6, 1085–1103.

99. Xiao-Hua (Frank) Wang and Jane M. Howell. Exploring the Dual-Level Effects of Transformational Leadership on Followers. Journal of Applied Psychology, 2010, Vol. 95, No. 6, 1134–1144.
100. Murray R. Barrick, Brian W. Swider, and Greg L. Stewart. Initial Evaluations in the Interview: Relationships with Subsequent Interviewer Evaluations and Employment Offers. Journal of Applied Psychology, 2010, Vol. 95, No. 6, 1163–1172.
101. John P. Trougakos, Christine L. Jackson, and Daniel J. Beal. Service Without a Smile: Comparing the Consequences of Neutral and Positive Display Rules. Journal of Applied Psychology, 2010.
102. Myriam N. Bechtoldt, Sonja Rohrmann, Irene E. De Pater, and Bianca Beersma. The primacy of perceiving: Emotion regulation buffers negative effects of emotional labor. Journal of Applied Psychology, 2011, Vol. 96, No. 5, 1087–1094.
103. Pamela Tierney and Steven M. Farmer. Creative self-efficacy development and creative performance over time. Journal of Applied Psychology, 2011, Vol. 96, No. 2, 277–293.
104. Bradley L. Kirkman, John E. Mathieu, John L. Cordery, Benson Rosen, and Michael Kukenberger. Managing a new collaborative entity in business organizations: Understanding organizational communities of practice effectiveness. Journal of Applied Psychology, 2011, Vol. 96, No. 6, 1234–1245.
105. Tal Yaffe and Ronit Kark. Leading by example: The case of leader OCB. Journal of Applied Psychology, 2011, Vol. 96, No. 4, 806–826.


106. Chad H. Van Iddekinge, Dan J. Putka, and John P. Campbell. Reconsidering vocational interests for personnel selection: The validity of an interest-based selection test in relation to job knowledge, job performance, and continuance intentions. Journal of Applied Psychology, 2011, Vol. 96, No. 1, 13–33.
107. J. Craig Wallace, Paul D. Johnson, Kimberly Mathe, and Jeff Paul. Structural and psychological empowerment climates, performance, and the moderating role of shared felt accountability: A managerial perspective. Journal of Applied Psychology, 2011, Vol. 96, No. 4, 840–850.
108. Scott E. Seibert, Gang Wang, and Stephen H. Courtright. Antecedents and consequences of psychological and team empowerment in organizations: A meta-analytic review. Journal of Applied Psychology, 2011, Vol. 96, No. 5, 981–1003.
109. Debra L. Shapiro, Alan D. Boss, Silvia Salas, Subrahmaniam Tangirala, and Mary Ann Von Glinow. When are transgressing leaders punitively judged? An empirical test. Journal of Applied Psychology, 2011, Vol. 96, No. 2, 412–422.

110. Jason D. Shaw, Jing Zhu, Michelle K. Duffy, Kristin L. Scott, Hsi-An Shih, and Ely Susanto. A contingency model of conflict and team effectiveness. Journal of Applied Psychology, 2011, Vol. 96, No. 2, 391–400.

111. Stefan Diestel and Klaus-Helmut Schmidt. Costs of simultaneous coping with emotional dissonance and self-control demands at work: Results from two German samples. Journal of Applied Psychology, 2011, Vol. 96, No. 3, 643–653.

112. Jia Hu and Robert C. Liden. Antecedents of team potency and team effectiveness: An examination of goal and process clarity and servant leadership. Journal of Applied Psychology, 2011, Vol. 96, No. 4, 851–862.

113. Ronald Bledow, Antje Schmitt, Michael Frese, and Jana Kühnel. The affective shift model of work engagement. Journal of Applied Psychology, 2011, Vol. 96, No. 6, 1246–1257.

114. Gilad Chen, Payal Nangia Sharma, Suzanne K. Edinger, Debra L. Shapiro, and Jiing-Lih Farh. Motivating and demotivating forces in teams: Cross-level influences of empowering leadership and relationship conflict. Journal of Applied Psychology, 2011, Vol. 96, No. 3, 541–557.

115. Spencer H. Harrison, David M. Sluss, and Blake E. Ashforth. Curiosity adapted the cat: The role of trait curiosity in newcomer adaptation. Journal of Applied Psychology, 2011, Vol. 96, No. 1, 211–220.
116. John Schaubroeck, Simon S. K. Lam, and Ann Chunyan Peng. Cognition-based and affect-based trust as mediators of leader behaviour influences on team performance. Journal of Applied Psychology, 2011, Vol. 96, No. 4, 863–871.
117. John P. Hausknecht, Michael C. Sturman, and Quinetta M. Roberson. Justice as a dynamic construct: Effects of individual trajectories on distal work outcomes. Journal of Applied Psychology, 2011, Vol. 96, No. 4, 872–880.
118. Sven Gross, Norbert K. Semmer, Laurenz L. Meier, Wolfgang Kälin, Nicola Jacobshagen, and Franziska Tschan. The effect of positive events at work on after-work fatigue: They matter most in face of adversity. Journal of Applied Psychology, 2011, Vol. 96, No. 3, 654–664.


119. Filip Lievens and Fiona Patterson. The validity and incremental validity of knowledge tests, low-fidelity simulations, and high-fidelity simulations for predicting job performance in advanced-level high-stakes selection. Journal of Applied Psychology, 2011, Vol. 96, No. 5, 927–940.

120. Maria L. Kraimer, Scott E. Seibert, Sandy J. Wayne, Robert C. Liden, and Jesus Bravo. Antecedents and outcomes of organizational support for development: The critical role of career opportunities. Journal of Applied Psychology, 2011, Vol. 96, No. 3, 485–500.
121. Jessica Lang, Paul D. Bliese, Jonas W. B. Lang, and Amy B. Adler. Work gets unfair for the depressed: Cross-lagged relations between organizational justice perceptions and depressive symptoms. Journal of Applied Psychology, 2011, Vol. 96, No. 3, 602–618.
122. Huy Le, In-Sue Oh, Steven B. Robbins, Remus Ilies, Ed Holland, and Paul Westrick. Too much of a good thing: Curvilinear relationships between personality traits and job performance. Journal of Applied Psychology, 2011, Vol. 96, No. 1, 113–133.
123. Ning Li, T. Brad Harris, Wendy R. Boswell, and Zhitao Xie. The role of organizational insiders’ developmental feedback and proactive personality on newcomers’ performance: An interactionist perspective. Journal of Applied Psychology, 2011, Vol. 96, No. 6, 1317–1327.
124. Dong Liu, Xiao-Ping Chen, and Xin Yao. From autonomy to creativity: A multilevel investigation of the mediating role of harmonious passion. Journal of Applied Psychology, 2011, Vol. 96, No. 2, 294–309.

125. Dong Liu and Ping-ping Fu. Motivating protégés’ personal learning in teams: A multilevel investigation of autonomy support and autonomy orientation. Journal of Applied Psychology, 2011, Vol. 96, No. 6, 1195–1208.

126. Christopher D. Nye and Fritz Drasgow. Effect size indices for analyses of measurement equivalence: Understanding the practical importance of differences between groups. Journal of Applied Psychology, 2011, Vol. 96, No. 5, 966–980.
127. Dong Liu, Shu Zhang, Lei Wang, and Thomas W. Lee. The effects of autonomy and empowerment on employee turnover: Test of a multilevel model in teams. Journal of Applied Psychology, 2011, Vol. 96, No. 6, 1305–1316.
128. Nora Madjar, Ellen Greenberg, and Zheng Chen. Factors for radical creativity, incremental creativity, and routine, noncreative performance. Journal of Applied Psychology, 2011, Vol. 96, No. 4, 730–743.
129. Jake G. Messersmith, Pankaj C. Patel, David P. Lepak, and Julian Gould-Williams. Unlocking the black box: Exploring the link between high-performance work systems and performance. Journal of Applied Psychology, 2011, Vol. 96, No. 6, 1105–1118.
130. Elizabeth Wolfe Morrison, Sara L. Wheeler-Smith, and Dishan Kamdar. Speaking up in groups: A cross-level study of group voice climate and voice. Journal of Applied Psychology, 2011, Vol. 96, No. 1, 183–191.
131. Kok-Yee Ng, Christine Koh, Soon Ang, Jeffrey C. Kennedy, and Kim-Yin Chan. Rating leniency and halo in multisource feedback ratings: Testing cultural assumptions of power distance and individualism-collectivism. Journal of Applied Psychology, 2011, Vol. 96, No. 5, 1033–1044.
132. Muammer Ozer. A moderated mediation model of the relationship between organizational citizenship behaviors and job performance. Journal of Applied Psychology, 2011, Vol. 96, No. 6, 1328–1336.


133. S. Douglas Pugh, Markus Groth, and Thorsten Hennig-Thurau. Willing and able to fake emotions: A closer examination of the link between emotional dissonance and employee well-being. Journal of Applied Psychology, 2011, Vol. 96, No. 2, 377–390.

134. Simon Lloyd D. Restubog, Kristin L. Scott, and Thomas J. Zagenczyk. When distress hits home: The role of contextual factors and psychological distress in predicting employees’ responses to abusive supervision. Journal of Applied Psychology, 2011, Vol. 96, No. 4, 713–729.
135. Zhaoli Song, Maw-Der Foo, Marilyn A. Uy, and Shuhua Sun. Unraveling the daily stress crossover between unemployed individuals and their employed spouses. Journal of Applied Psychology, 2011, Vol. 96, No. 1, 151–168.
136. Sabine Sonnentag, Eva J. Mojza, Evangelia Demerouti, and Arnold B. Bakker. Reciprocal relations between recovery and work engagement: The moderating role of job stressors. Journal of Applied Psychology, 2012, Vol. 97, No. 4, 842–853.

137. Andreas W. Richter, Giles Hirst, Daan van Knippenberg, and Markus Baer. Creative self-efficacy and individual creativity in team contexts: Cross-level interactions with team informational resources. Journal of Applied Psychology, 2012, Vol. 97, No. 6, 1282–1290.

138. Anat Rafaeli, Amir Erez, Shy Ravid, Rellie Derfler-Rozin, Dorit Efrat Treister, and Ravit Scheyer. When customers exhibit verbal aggression, employees pay cognitive costs. Journal of Applied Psychology, 2012, Vol. 97, No. 5, 931–950.
139. Steffen Raub and Hui Liao. Doing the right thing without being told: Joint effects of initiative climate and general self-efficacy on employee proactive customer service performance. Journal of Applied Psychology, 2012, Vol. 97, No. 3, 651–667.

140. Steven W. Whiting, Timothy D. Maynes, Nathan P. Podsakoff, and Philip M. Podsakoff. Effects of message, source, and context on evaluations of employee voice behaviour. Journal of Applied Psychology, 2012, Vol. 97, No. 1, 159–182.

141. Chia-Huei Wu and Mark A. Griffin. Longitudinal relationships between core self-evaluations and job satisfaction. Journal of Applied Psychology, 2012, Vol. 97, No. 2, 331–342.
142. Thomas W. H. Ng and Daniel C. Feldman. The effects of organizational and community embeddedness on work-to-family and family-to-work conflict. Journal of Applied Psychology, 2012, Vol. 97, No. 6, 1233–1251.

143. Karsten Mueller, Kate Hattrup, Sven-Oliver Spiess, and Nick Lin-Hi. The effects of corporate social responsibility on employees’ affective commitment: A cross-cultural investigation. Journal of Applied Psychology, 2012, Vol. 97, No. 6, 1186–1200.

144. Lisa Schurer Lambert, Bennett J. Tepper, Jon C. Carr, Daniel T. Holt, and Alex J. Barelka. Forgotten but not gone: An examination of fit between leader consideration and initiating structure needed and received. Journal of Applied Psychology, 2012, Vol. 97, No. 5, 913–930.

145. Hannes Leroy, Bart Dierynck, Frederik Anseel, Tony Simons, Jonathon R. B. Halbesleben, Deirdre McCaughey, Grant T. Savage, and Luc Sels. Behavioral integrity for safety, priority of safety, psychological safety, and patient safety: A team-level study. Journal of Applied Psychology, 2012, Vol. 97, No. 6, 1273–1281.


146. Jason A. Colquitt, Jeffery A. LePine, Ronald F. Piccolo, Cindy P. Zapata, and Bruce L. Rich. Explaining the justice-performance relationship: Trust as exchange deepener or trust as uncertainty reducer? Journal of Applied Psychology, 2012, Vol. 97, No. 1, 1–15.

147. Deanne N. Den Hartog and Frank D. Belschak. When does transformational leadership enhance employee proactive behaviour? The role of autonomy and role breadth self-efficacy. Journal of Applied Psychology, 2012, Vol. 97, No. 1, 194–202.
148. Bart A. de Jong and Kurt T. Dirks. Beyond shared perceptions of trust and monitoring in teams: Implications of asymmetry and dissensus. Journal of Applied Psychology, 2012, Vol. 97, No. 2, 391–406.
149. Marne L. Arthaud-Day, Joseph C. Rode, and William H. Turnley. Direct and contextual effects of individual values on organisational citizenship behaviour in teams. Journal of Applied Psychology, 2012, Vol. 97, No. 4, 792–807.
150. Samuel Aryee, Fred O. Walumbwa, Emmanuel Y. M. Seidu, and Lilian E. Otaye. Impact of high-performance work systems on individual- and branch-level performance: Test of a multilevel model of intermediate linkages. Journal of Applied Psychology, 2012, Vol. 97, No. 2, 287–300.

151. Richard G. Netemeyer, Carrie M. Heilman, and James G. Maxham III. Identification with the retail organization and customer-perceived employee similarity: Effects on customer spending. Journal of Applied Psychology, 2012, Vol. 97, No. 5, 1049–1058.

152. Richard P. Bagozzi, Massimo Bergami, Gian Luca Marzocchi, and Gabriele Morandin. Customer-organization relationships: Development and test of a theory of extended identities. Journal of Applied Psychology, 2012, Vol. 97, No. 1, 63–76.
153. Uta K. Bindl, Sharon K. Parker, Peter Totterdell, and Gareth Hagger-Johnson. Fuel of the self-starter: How mood relates to proactive goal regulation. Journal of Applied Psychology, 2012, Vol. 97, No. 1, 134–150.
154. Xiao-Ping Chen, Dong Liu, and Rebecca Portnoy. A multilevel investigation of motivational cultural intelligence, organizational diversity climate, and cultural sales: Evidence from U.S. real estate firms. Journal of Applied Psychology, 2012, Vol. 97, No. 1, 93–106.

155. Lisa Dragoni and Maribeth Kuenzi. Better understanding work unit goal orientation: Its emergence and impact under different types of work unit structure. Journal of Applied Psychology, 2012, Vol. 97, No. 5, 1032–1048.

156. D. Scott DeRue, Jennifer D. Nahrgang, John R. Hollenbeck, and Kristina Workman. A quasi-experimental study of after-event reviews and leadership development. Journal of Applied Psychology, 2012, Vol. 97, No. 5, 997–1015.
157. Crystal I. C. Chien Farh, Myeong-Gu Seo, and Paul E. Tesluk. Emotional intelligence, teamwork effectiveness, and job performance: The moderating role of job context. Journal of Applied Psychology, 2012, Vol. 97, No. 4, 890–900.
158. David M. Fisher, Suzanne T. Bell, Erich C. Dierdorff, and James A. Belohlav. Facet personality and surface-level diversity as team mental model antecedents: Implications for implicit coordination. Journal of Applied Psychology, 2012, Vol. 97, No. 4, 825–841.
159. Ravi S. Gajendran and Aparna Joshi. Innovation in globally distributed teams: The role of LMX, communication frequency, and member influence on team decisions. Journal of Applied Psychology, 2012, Vol. 97, No. 6, 1252–1261.

343

160. Michele J. Gelfand, Lisa M. Leslie, Kirsten Keller and Carsten de Dreu. Conflict cultures in organizations: How leaders shape conflict cultures and their organizational-level consequences. Journal of Applied Psychology, 2012, Vol. 97, No. 6, 1131-1147.
161. Dvora Geller and Peter A. Bamberger. The impact of help seeking on individual task performance: The moderating effect of help seekers’ logics of action. Journal of Applied Psychology, 2012, Vol. 97, No. 2, 487-497.
162. Robert T. Keller. Predicting the performance and innovativeness of scientists and engineers. Journal of Applied Psychology, 2012, Vol. 97, No. 1, 225-233.
163. Karoline Strauss, Mark A. Griffin and Sharon K. Parker. Future work selves: How salient hoped-for identities motivate proactive career behaviors. Journal of Applied Psychology, 2012, Vol. 97, No. 3, 580-598.
164. Sharon Toker and Michal Biron. Job burnout and depression: Unraveling their temporal relationship and considering the role of physical activity. Journal of Applied Psychology, 2012, Vol. 97, No. 3, 699-710.
165. Le Zhou, Mo Wang, Gilad Chen and Junqi Shi. Supervisors’ upward exchange relationships and subordinate outcomes: Testing the multilevel mediation role of empowerment. Journal of Applied Psychology, 2012, Vol. 97, No. 3, 668-680.
166. Herman H. M. Tse, Catherine K. Lam, Sandra A. Lawrence and Xu Huang. When my supervisor dislikes you more than me: The effect of dissimilarity in leader-member exchange on coworkers’ interpersonal emotion and perceived help. Journal of Applied Psychology, 2013, Vol. 98, No. 6, 974-988.
167. Subrahmaniam Tangirala, Dishan Kamdar, Vijaya Venkataramani and Michael R. Parke. Doing right versus getting ahead: The effects of duty and achievement orientations on employees’ voice. Journal of Applied Psychology, 2013, Vol. 98, No. 6, 1040-1050.
168. Daniel J. Beal, John P. Trougakos, Howard M. Weiss and Reeshad S. Dalal. Affect spin and the emotion regulation process at work. Journal of Applied Psychology, 2013, Vol. 98, No. 4, 593-605.
169. Junqi Shi, Russell E. Johnson, Yihao Liu and Mo Wang. Linking subordinate political skill to supervisor dependence and reward recommendations: A moderated mediation model. Journal of Applied Psychology, 2013, Vol. 98, No. 2, 374-384.
170. Mindy K. Shoss, Robert Eisenberger, Simon Lloyd D. Restubog and Thomas J. Zagenczyk. Blaming the organization for abusive supervision: The roles of perceived organizational support and supervisor’s organizational embodiment. Journal of Applied Psychology, 2013, Vol. 98, No. 1, 158-168.
171. Aaron M. Watson, Lori Foster Thompson, Jane V. Rudolph, Thomas J. Whelan, Tara S. Behrend and Amanda L. Gissel. When big brother is watching: Goal orientation shapes reactions to electronic monitoring during online training. Journal of Applied Psychology, 2013, Vol. 98, No. 4, 642-657.
172. Julie Holliday Wayne, Wendy J. Casper, Russell A. Matthews and Tammy D. Allen. Family-supportive organization perceptions and organizational commitment: The mediating role of work-family conflict and enrichment and partner attitudes. Journal of Applied Psychology, 2013, Vol. 98, No. 4, 606-622.
173. James W. Beck and Aaron M. Schmidt. State-level goal orientations as mediators of the relationship between time pressure and performance: A longitudinal study. Journal of Applied Psychology, 2013, Vol. 98, No. 2, 354-363.

174. D. Lance Ferris, Russell E. Johnson, Christopher C. Rosen, Emilija Djurdjevic, Chu-Hsiang Chang and James A. Tan. When is success not satisfying? Integrating regulatory focus and approach/avoidance motivation theories to explain the relation between core self-evaluation and job satisfaction. Journal of Applied Psychology, 2013, Vol. 98, No. 2, 342-353.
175. Adam M. Grant and Nancy P. Rothbard. When in doubt, seize the day? Security values, prosocial values, and proactivity under ambiguity. Journal of Applied Psychology, 2013, Vol. 98, No. 5, 810-819.
176. Nina Gupta, Daniel C. Ganster and Sven Kepes. Assessing the validity of sales self-efficacy: A cautionary tale. Journal of Applied Psychology, 2013, Vol. 98, No. 4, 690-700.
177. Sean T. Hannah, John M. Schaubroeck, Ann C. Peng, Robert G. Lord, Linda K. Trevino, Steve W. J. Kozlowski, Bruce J. Avolio, Nikolaos Dimotakis and Joseph Doty. Joint influences of individual and work unit abusive supervision on ethical intentions and behaviors: A moderated mediation model. Journal of Applied Psychology, 2013, Vol. 98, No. 4, 579-592.
178. Daniel S. Stanhope, Samuel B. Pond III and Erica A. Surface. Core self-evaluations and training effectiveness: Prediction through motivational intervening mechanisms. Journal of Applied Psychology, 2013, Vol. 98, No. 5, 820-831.
179. Kristin L. Scott, Simon Lloyd D. Restubog and Thomas J. Zagenczyk. A social exchange-based model of the antecedents of workplace exclusion. Journal of Applied Psychology, 2013, Vol. 98, No. 1, 37-48.
180. Scott E. Seibert, Maria L. Kraimer, Brooks C. Holtom and Abigail J. Pierotti. Even the best laid plans sometimes go askew: Career self-management processes, career shocks, and the decision to pursue graduate education. Journal of Applied Psychology, 2013, Vol. 98, No. 1, 169-182.
181. Guo-hua Huang, Helen Hailin Zhao, Xiong-ying Niu, Susan J. Ashford and Cynthia Lee. Reducing job insecurity and increasing performance ratings: Does impression management matter? Journal of Applied Psychology, 2013, Vol. 98, No. 5, 852-862.
182. Laura Huang, Marcia Frideger and Jone L. Pearce. Political skill: Explaining the effects of non-native accent on managerial hiring and entrepreneurial investment decisions. Journal of Applied Psychology, 2013, Vol. 98, No. 6, 1005-1017.
183. Michael G. Hughes, Eric Anthony Day, Xiaoqian Wang, Matthew J. Schuelke, Matthew L. Arsenault, Lauren N. Harkrider and Olivia D. Cooper. Learner-controlled practice difficulty in the training of a complex task: Cognitive and motivational mechanisms. Journal of Applied Psychology, 2013, Vol. 98, No. 1, 80-98.
184. Ryan C. Johnson and Tammy D. Allen. Examining the links between employed mothers’ work characteristics, physical activity, and child health. Journal of Applied Psychology, 2013, Vol. 98, No. 1, 148-157.
185. Timothy A. Judge, Jessica B. Rodell, Ryan L. Klinger, Lauren S. Simon and Eean R. Crawford. Hierarchical representations of the five-factor model of personality in predicting job performance: Integrating three organizing frameworks with two theoretical perspectives. Journal of Applied Psychology, 2013, Vol. 98, No. 6, 875-925.
186. Jun Liu, Cynthia Lee, Chun Hui, Ho Kwong Kwan and Long-Zeng Wu. Idiosyncratic deals and employee outcomes: The mediating roles of social exchange and self-enhancement and the moderating role of individualism. Journal of Applied Psychology, 2013, Vol. 98, No. 5, 832-840.

187. Lisa M. Leslie, Mark Snyder and Theresa M. Glomb. Who gives? Multilevel effects of gender and ethnicity on workplace charitable giving. Journal of Applied Psychology, 2013, Vol. 98, No. 1, 49-62.
188. Wu Liu, Subrahmaniam Tangirala and Rangaraj Ramanujam. The relational antecedents of voice targeted at different leaders. Journal of Applied Psychology, 2013, Vol. 98, No. 5, 841-851.
189. Julie M. McCarthy, Chad H. Van Iddekinge, Filip Lievens, Mei-Chuan Kung, Evan F. Sinar and Michael A. Campion. Do candidate reactions relate to job performance or affect criterion-related validity? A multistudy investigation of relations among reactions, selection test scores, and job performance. Journal of Applied Psychology, 2013, Vol. 98, No. 5, 701-719.
190. Laurenz L. Meier and Paul E. Spector. Reciprocal effects of work stressors and counterproductive work behaviour: A five-wave longitudinal study. Journal of Applied Psychology, 2013, Vol. 98, No. 3, 529-539.
191. Nathan T. Carter, Dev K. Dalal, Anthony S. Boyce, Matthew S. O’Connell, Mei-Chuan Kung and Kristin M. Delgado. Uncovering curvilinear relationships between conscientiousness and job performance: How theoretically appropriate measurement makes an empirical difference. Journal of Applied Psychology, 2014, Vol. 99, No. 4, 564-586.
192. Song Chang, Liangding Jia, Riki Takeuchi and Yahua Cai. Do high-commitment work systems affect creativity? A multilevel combinational approach to employee creativity. Journal of Applied Psychology, 2014, Vol. 99, No. 4, 665-680.
193. Jinseok S. Chun and Jin Nam Choi. Members’ needs, intragroup conflict, and group performance. Journal of Applied Psychology, 2014, Vol. 99, No. 3, 437-450.
194. Stephen H. Courtright, Amy E. Colbert and Daejeong Choi. Fired up or burned out? How developmental challenge differentially impacts leader behaviour. Journal of Applied Psychology, 2014, Vol. 99, No. 4, 681-696.
195. Jeroen P. de Jong, Petru L. Curseu and Roger Th. A. J. Leenders. When do bad apples not spoil the barrel? Negative relationships in teams, team performance, and buffering mechanisms. Journal of Applied Psychology, 2014, Vol. 99, No. 3, 514-522.
196. Lisa Dragoni, Haeseen Park, Jim Soltis and Sheila Forte-Trammell. Show and tell: How supervisors facilitate leader development among transitioning leaders. Journal of Applied Psychology, 2014, Vol. 99, No. 1, 66-86.
197. Lisa Dragoni, In-Sue Oh, Paul E. Tesluk, Ozias A. Moore, Paul VanKatwyk and Joy Hazucha. Developing leaders’ strategic thinking through global work experience: The moderating role of cultural distance. Journal of Applied Psychology, 2014, Vol. 99, No. 5, 867-882.
198. Crystal I. C. Farh and Zhijun Chen. Beyond the individual victim: Multilevel consequences of abusive supervision in teams. Journal of Applied Psychology, 2014, Vol. 99, No. 6, 1074-1095.
199. David M. Fisher. Distinguishing between taskwork and teamwork planning in teams: Relations with coordination and interpersonal processes. Journal of Applied Psychology, 2014, Vol. 99, No. 3, 423-436.
200. David M. Fisher. A multilevel cross-cultural examination of role overload and organizational commitment: Investigating the interactive effects of context. Journal of Applied Psychology, 2014, Vol. 99, No. 4, 723-736.

201. Erik Gonzalez-Mule, David S. DeGeest, Brian W. McCormick, Jee Young Seong and Kenneth G. Brown. Can we get some cooperation around here? The mediating role of group norms on the relationship between team personality and individual helping behaviors. Journal of Applied Psychology, 2014, Vol. 99, No. 5, 988-999.
202. Vicente Gonzalez-Roma and Ana Hernandez. Climate uniformity: Its influence on team communication quality, task conflict, and team performance. Journal of Applied Psychology, 2014, Vol. 99, No. 6, 1042-1058.
203. Rebecca L. Greenbaum, Matthew J. Quade, Mary B. Mawritz, Joongseo Kim and Durand Crosby. When the customer is unethical: The explanatory role of employee emotional exhaustion onto work-family conflict, relationship conflict with coworkers, and job neglect. Journal of Applied Psychology, 2014, Vol. 99, No. 6, 1188-1203.
204. Julia E. Hoch and Steve W. J. Kozlowski. Leading virtual teams: Hierarchical leadership, structural supports, and shared team leadership. Journal of Applied Psychology, 2014, Vol. 99, No. 3, 390-403.
205. Xu Huang, JJ Po-An Hsieh and Wei He. Expertise dissimilarity and creativity: The contingent roles of tacit and explicit knowledge sharing. Journal of Applied Psychology, 2014, Vol. 99, No. 5, 816-830.
206. Jaclyn M. Jensen, Pankaj C. Patel and Jana L. Raver. Is it better to be average? High and low performance as predictors of employee victimization. Journal of Applied Psychology, 2014, Vol. 99, No. 2, 296-309.
207. Howard J. Klein, Joseph T. Cooper, Janice C. Molloy and Jacqueline A. Swanson. The assessment of commitment: Advantages of a unidimensional, target-free approach. Journal of Applied Psychology, 2014, Vol. 99, No. 2, 222-238.
208. Alex Ning Li and Hui Liao. How do leader-member exchange quality and differentiation affect performance in teams? An integrated multilevel dual process model. Journal of Applied Psychology, 2014, Vol. 99, No. 5, 847-866.
209. Wen-Dong Li, Doris Fay, Michael Frese, Peter D. Harms and Xiang Yu Gao. Reciprocal relationship between proactive personality and work characteristics: A latent change score approach. Journal of Applied Psychology, 2014, Vol. 99, No. 5, 948-965.
210. Huiwen Lian, D. Lance Ferris, Rachel Morrison and Douglas J. Brown. Blame it on the supervisor or the subordinate? Reciprocal relations between abusive supervision and organizational deviance. Journal of Applied Psychology, 2014, Vol. 99, No. 4, 651-664.
211. Sandy Lim and Kenneth Tai. Family incivility and job performance: A moderated mediation model of psychological distress and core self-evaluation. Journal of Applied Psychology, 2014, Vol. 99, No. 2, 351-359.
212. Songqi Liu, Mo Wang, Hui Liao and Junqi Shi. Self-regulation during job search: The opposing effects of employment self-efficacy and job search behaviour self-efficacy. Journal of Applied Psychology, 2014, Vol. 99, No. 6, 1159-1172.
213. Russell A. Matthews, Julie Holliday Wayne and Michael T. Ford. A work-family conflict/subjective well-being process model: A test of competing theories of longitudinal effects. Journal of Applied Psychology, 2014, Vol. 99, No. 6, 1173-1187.
214. M. Travis Maynard, Margaret M. Luciano, Lauren D’Innocenzo, John E. Mathieu and Matthew D. Dean. Modeling time-lagged reciprocal psychological empowerment-performance relationships. Journal of Applied Psychology, 2014, Vol. 99, No. 6, 1244-1253.

215. Timothy D. Maynes and Philip M. Podsakoff. Speaking more broadly: An examination of the nature, antecedents, and consequences of an expanded set of employee voice behaviors. Journal of Applied Psychology, 2014, Vol. 99, No. 1, 87-112.
216. Susan Mohammed and Sucheta Nadkarni. Are we all on the same temporal page? The moderating effects of temporal team cognition on the polychronicity diversity-team performance relationship. Journal of Applied Psychology, 2014, Vol. 99, No. 3, 404-422.
217. Inbal Nahum-Shani, Melanie M. Henderson, Sandy Lim and Amiram D. Vinokur. Supervisor support: Does supervisor support buffer or exacerbate the adverse effects of supervisor undermining? Journal of Applied Psychology, 2014, Vol. 99, No. 3, 484-503.
218. Christopher D. Nye, Bradley J. Brummel and Fritz Drasgow. Understanding sexual harassment using aggregate construct models. Journal of Applied Psychology, 2014, Vol. 99, No. 6, 1204-1221.
219. Jerel E. Slaughter, Daniel M. Cable and Daniel B. Turban. Changing job seekers’ image perceptions during recruitment visits: The moderating role of belief confidence. Journal of Applied Psychology, 2014, Vol. 99, No. 6, 1146-1158.
220. Gergana Todorova, Julia B. Bear and Laurie R. Weingart. Can conflict be energizing? A study of task conflict, positive emotions, and job satisfaction. Journal of Applied Psychology, 2014, Vol. 99, No. 3, 451-467.
221. Prajya R. Vidyarthi, Berrin Erdogan, Smriti Anand, Robert C. Liden and Anjali Chaudhry. One member, two leaders: Extending leader-member exchange theory to a dual leadership context. Journal of Applied Psychology, 2014, Vol. 99, No. 3, 468-483.
222. David D. Walker, Danielle D. van Jaarsveld and Daniel P. Skarlicki. Exploring the effects of individual customer incivility encounters on employee incivility: The moderating roles of entity (in)civility and negative affectivity. Journal of Applied Psychology, 2014, Vol. 99, No. 1, 151-161.
223. Kai Chi Yam, Ryan Fehr and Christopher M. Barnes. Morning employees are perceived as better employees: Employees’ start times influence supervisor performance ratings. Journal of Applied Psychology, 2014, Vol. 99, No. 6, 1288-1299.
224. Craig Wallace and Gilad Chen. A multilevel integration of personality, climate, self-regulation, and performance. Personnel Psychology, 2006, 59, 529-557.
225. Michael Mount, Remus Ilies and Erin Johnson. Relationship of personality traits and counterproductive work behaviors: The mediating effects of job satisfaction. Personnel Psychology, 2006, 59, 591-622.
226. David A. Hofmann and Barbara Mark. An investigation of the relationship between safety climate and medication errors as well as other nurse and patient outcomes. Personnel Psychology, 2006, 59, 847-869.
227. Patrick F. McKay, Derek R. Avery, Scott Tonidandel, Mark A. Morris, Morela Hernandez and Michelle R. Hebl. Racial differences in employee retention: Are diversity climate perceptions the key? Personnel Psychology, 2007, 60, 35-62.
228. Erich C. Dierdorff and Eric A. Surface. Placing peer ratings in context: Systematic influences beyond ratee performance. Personnel Psychology, 2007, 60, 93-126.

229. Peng Wang and Fred O. Walumbwa. Family-friendly programs, organizational commitment, and work withdrawal: The moderating role of transformational leadership. Personnel Psychology, 2007, 60, 397-427.
230. Fred Luthans, Bruce J. Avolio, James B. Avey and Steven M. Norman. Positive psychological capital: Measurement and relationship with performance and satisfaction. Personnel Psychology, 2007, 60, 541-572.
231. Hao Zhao, Sandy J. Wayne, Brian C. Glibkowski and Jesus Bravo. The impact of psychological contract breach on work-related outcomes: A meta-analysis. Personnel Psychology, 2007, 60, 647-680.
233. Colin M. Gill and Gerard P. Hodgkinson. Development and validation of the Five-Factor Model Questionnaire: An adjectival-based personality inventory for use in occupational settings. Personnel Psychology, 2007, 60, 731-766.
234. Mel Fugate, Angelo J. Kinicki and Gregory E. Prussia. Employee coping with organizational change: An examination of alternative theoretical perspectives and models. Personnel Psychology, 2008, 61, 1-36.
235. Paul J. Taylor, Wen-Dong Li, Kan Shi and Walter C. Borman. The transportability of job information across countries. Personnel Psychology, 2008, 61, 69-111.
236. Michael K. Mount, In-Sue Oh and Melanie Burns. Incremental validity of perceptual speed and accuracy over general mental ability. Personnel Psychology, 2008, 61, 113-139.
237. Jeffery A. LePine, Ronald F. Piccolo, Christine L. Jackson, John E. Mathieu and Jessica R. Saul. A meta-analysis of teamwork processes: Tests of a multidimensional model and relationships with team effectiveness criteria. Personnel Psychology, 2008, 61, 273-307.
238. Lisa H. Nishii, David P. Lepak and Benjamin Schneider. Employee attributions of the “why” of HR practices: Their effects on employee attitudes and behaviors, and customer satisfaction. Personnel Psychology, 2008, 61, 503-545.
239. Ronald Bledow and Michael Frese. A situational judgment test of personal initiative and its relationship to performance. Personnel Psychology, 2009, 62, 229-258.
240. Jixia Yang and James M. Diefendorff. The relations of daily counterproductive workplace behavior with emotions, situational antecedents, and personality moderators: A diary study in Hong Kong. Personnel Psychology, 2009, 62, 259-295.
241. Janet H. Marler, Sandra L. Fisher and Weiling Ke. Employee self-service technology acceptance: A comparison of pre-implementation and post-implementation relationships. Personnel Psychology, 2009, 62, 327-358.
242. Herman Aguinis, Mark D. Mazurkiewicz and Eric D. Heggestad. Using web-based frame-of-reference training to decrease biases in personality-based job analysis: An experimental field study. Personnel Psychology, 2009, 62, 405-438.

243. Chad H. Van Iddekinge, Gerald R. Ferris and Tonia S. Heffner. Test of a multistage model of distal and proximal antecedents of leader performance. Personnel Psychology, 2009, 62, 463-495.
244. Daniel B. Turban and Cynthia K. Stevens. Effects of conscientiousness and extraversion on new labor market entrants’ job search: The mediating role of metacognitive activities and positive emotions. Personnel Psychology, 2009, 62, 553-573.
245. Brian J. Hoffman and David J. Woehr. Disentangling the meaning of multisource performance rating source and dimension factors. Personnel Psychology, 2009, 62, 735-765.
246. Chih-Hsun Chuang and Hui Liao. Strategic human resource management in service context: Taking care of business by taking care of employees and customers. Personnel Psychology, 2010, 63, 153-196.
247. Connie R. Wanberg, Zhen Zhang and Erica W. Diehn. Development of the “getting ready for your next job” inventory for unemployed individuals. Personnel Psychology, 2010, 63, 439-478.
248. Gary J. Greguras and James M. Diefendorff. Why does proactive personality predict employee life satisfaction and work behaviors? A field investigation of the mediating role of the self-concordance model. Personnel Psychology, 2010, 63, 539-560.
249. María del Carmen Triana, María Fernanda García and Adrienne Colella. Managing diversity: How organizational efforts to support diversity moderate the effects of perceived racial discrimination on affective commitment. Personnel Psychology, 2010, 63, 817-843.
250. Dawn S. Carlson, Merideth Ferguson, Pamela L. Perrewe and Dwayne Whitten. The fallout from abusive supervision: An examination of subordinates and their partners. Personnel Psychology, 2011, 64, 937-961.
251. Shoshana R. Dobrow and Jennifer Tosti-Kharas. Calling: The development of a scale measure. Personnel Psychology, 2011, 64, 1001-1049.
252. Lisa Dragoni, In-Sue Oh, Paul Vankatwyk and Paul E. Tesluk. Developing executive leaders: The relative contribution of cognitive ability, personality, and the accumulation of work experience in predicting strategic thinking competency. Personnel Psychology, 2011, 64, 829-864.
253. J. Robert Baum, Barbara Jean Bird and Sheetal Singh. The practical intelligence of entrepreneurs: Antecedents and a link with new venture growth. Personnel Psychology, 2011, 64, 397-425.
254. Sean T. Hannah, Fred O. Walumbwa and Louis W. Fry. Leadership in action teams: Team leader and members’ authenticity, authenticity strength, and team outcomes. Personnel Psychology, 2011, 64, 771-802.
255. Theresa M. Glomb, Devasheesh P. Bhave, Andrew G. Miner and Melanie Wall. Doing good, feeling good: Examining the role of organizational citizenship behaviors in changing mood. Personnel Psychology, 2011, 64, 191-223.
256. Brian J. Hoffman, Klaus G. Melchers, Carrie A. Blair, Martin Kleinmann and Robert T. Ladd. Exercises and dimensions are the currency of assessment centers. Personnel Psychology, 2011, 64, 351-395.
257. Jason L. Huang and Ann Marie Ryan. Beyond personality traits: A study of personality states and situational contingencies in customer service jobs. Personnel Psychology, 2011, 64, 451-488.
258. Brian K. Griepentrog, Crystal M. Harold, Brian C. Holtz, Richard J. Klimoski and Sean M. Marsh. Integrating social identity and the theory of planned behaviour: Predicting withdrawal from an organizational recruitment process. Personnel Psychology, 2012, 65, 723-753.
259. Scott B. Mackenzie, Philip M. Podsakoff and Nathan P. Podsakoff. Challenge-oriented organizational citizenship behaviors and organizational effectiveness: Do challenge-oriented behaviors really have an impact on the organization’s bottom line? Personnel Psychology, 2011, 64, 559-592.
260. Shaul Oreg and Yair Berson. Leadership and employees’ reactions to change: The role of leaders’ personal attributes and transformational leadership style. Personnel Psychology, 2011, 64, 627-659.
261. Suzanne J. Peterson, Fred Luthans, Bruce J. Avolio, Fred O. Walumbwa and Zhen Zhang. Psychological capital and employee performance: A latent growth modelling approach. Personnel Psychology, 2011, 64, 427-450.
262. Christopher R. Plouffe and Yany Gregoire. Intraorganizational employee navigation and socially derived outcomes: Conceptualization, validation, and effects on overall performance. Personnel Psychology, 2011, 64, 693-738.
263. John J. Sumanth and Daniel M. Cable. Status and organizational entry: How organizational and individual status affect justice perceptions of hiring systems. Personnel Psychology, 2011, 64, 963-1000.
264. Fred O. Walumbwa, Russell Cropanzano and Barry M. Goldman. How leader-member exchange influences effective work behaviors: Social exchange and internal-external efficacy perspectives. Personnel Psychology, 2011, 64, 739-770.
265. Mo Wang and Elizabeth McCune. Understanding newcomers’ adaptability and work-related outcomes: Testing the mediating roles of perceived P-E fit variables. Personnel Psychology, 2011, 64, 163-189.
266. Riki Takeuchi, Zhijun Chen and Siu Yin Cheung. Applying uncertainty management theory to employee voice behaviour: An integrative investigation. Personnel Psychology, 2012, 65, 283-323.
267. Subrahmaniam Tangirala and Rangaraj Ramanujam. Ask and you shall hear (but not always): Examining the relationship between manager consultation and employee voice. Personnel Psychology, 2012, 65, 251-282.
268. Belle Rose Ragins, Jorge A. Gonzalez, Kyle Ehrhardt and Romila Singh. Crossing the threshold: The spillover of community racial diversity and diversity climate to the workplace. Personnel Psychology, 2012, 65, 755-787.
269. Mary Bardes Mawritz, David M. Mayer, Jenny M. Hoobler, Sandy J. Wayne and Sophia V. Marinova. A trickle-down model of abusive supervision. Personnel Psychology, 2012, 65, 325-357.
270. Celia Moore, James R. Detert, Linda Klebe Trevino, Vicki L. Baker and David M. Mayer. Why employees do bad things: Moral disengagement and unethical organizational behaviour. Personnel Psychology, 2012, 65, 1-48.

271. Brian J. Hoffman, C. Allen Gorman, Carrie A. Blair, John P. Meriac, Benjamin Overstreet and E. Kate Atchley. Evidence for the effectiveness of an alternative multisource performance rating methodology. Personnel Psychology, 2012, 65, 531-563.
272. Yuanyuan Huo, Wing Lam and Ziguang Chen. Am I the only one this supervisor is laughing at? Effects of aggressive humor on employee strain and addictive behaviors. Personnel Psychology, 2012, 65, 859-885.
273. Suzanne J. Peterson, Benjamin M. Galvin and Donald Lange. CEO servant leadership: Exploring executive characteristics and firm performance. Personnel Psychology, 2012, 65, 565-596.
274. Myeong-Gu Seo, M. Susan Taylor, N. Sharon Hill, Xiaomeng Zhang, Paul E. Tesluk and Natalia M. Lorinkova. The role of affect and leadership during organizational change. Personnel Psychology, 2012, 65, 121-165.
275. Sabine Sonnentag and Adam M. Grant. Doing good at work feels good at home, but not right away: When and why perceived prosocial impact predicts positive affect. Personnel Psychology, 2012, 65, 495-530.
276. Stanley M. Gully, Jean M. Phillips, William G. Castellano, Kyongji Han and Andrea Kim. A mediated moderation model of recruiting socially and environmentally responsible job applicants. Personnel Psychology, 2013, 66, 935-973.
277. Derek R. Avery, Mo Wang, Sabrina D. Volpone and Le Zhou. Different strokes for different folks: The impact of sex dissimilarity in the empowerment-performance relationship. Personnel Psychology, 2013, 66, 757-784.
278. Erik R. Eddy, Scott I. Tannenbaum and John E. Mathieu. Helping teams to help themselves: Comparing two team-led debriefing methods. Personnel Psychology, 2013, 66, 975-1008.
279. Alicia A. Grandey, Nai-Wen Chi and Jennifer A. Diamond. Show me the money! Do financial rewards for performance enhance or undermine the satisfaction from emotional labor? Personnel Psychology, 2013, 66, 569-612.
280. Angelo J. Kinicki, Kathryn J. L. Jacobson, Suzanne J. Peterson and Gregory E. Prussia. Development and validation of the performance management behaviour questionnaire. Personnel Psychology, 2013, 66, 1-45.
281. Ning Li, Dan S. Chiaburu, Bradley L. Kirkman and Zhitao Xie. Spotlight on the followers: An examination of moderators of relationships between transformational leadership and subordinates’ citizenship and taking charge. Personnel Psychology, 2013, 66, 225-260.
282. Thomas W. H. Ng and Daniel C. Feldman. Changes in perceived supervisor embeddedness: Effects on employees’ embeddedness, organizational trust, and voice behaviour. Personnel Psychology, 2013, 66, 645-685.
283. Gera Noordzij, Edwin A. J. Van Hooft, Heleen Van Mierlo, Arian Van Dam and Marise Ph. Born. The effects of a learning-goal orientation training on self-regulation: A field experiment among unemployed job seekers. Personnel Psychology, 2013, 66, 723-755.
284. Robert S. Rubin, Erich C. Dierdorff and Daniel G. Bachrach. Boundaries of citizenship behaviour: Curvilinearity and context in the citizenship and task performance relationship. Personnel Psychology, 2013, 66, 377-406.

285. Deborah E. Rupp, Ruodan Shao, Meghan A. Thornton and Daniel P. Skarlicki. Applicants’ and employees’ reactions to corporate social responsibility: The moderating effects of first-party justice perceptions and moral identity. Personnel Psychology, 2013, 66, 895-933.
286. Daniel B. Turban, Felissa K. Lee, Serge P. Da Motta Veiga, Dana L. Haggard and Sharon Y. Wu. Be happy, don’t wait: The role of trait affect in job search. Personnel Psychology, 2013, 66, 483-514.
287. Devasheesh P. Bhave. The invisible eye? Electronic performance monitoring and employee job performance. Personnel Psychology, 2014, 67, 605-635.
288. Stephan A. Boehm, Florian Kunze and Heike Bruch. Spotlight on age-diversity climate: The impact of age-inclusive HR practices on firm-level outcomes. Personnel Psychology, 2014, 67, 667-704.
289. Wendy R. Boswell, Julie B. Olson-Buchanan and T. Brad Harris. I cannot afford to have a life: Employee adaptation to feelings of job insecurity. Personnel Psychology, 2014, 67, 887-915.
290. Amy E. Colbert, Murray R. Barrick and Bret H. Bradley. Personality and leadership composition in top management teams: Implications for organizational effectiveness. Personnel Psychology, 2014, 67, 351-387.
291. Hong Deng and Kwok Leung. Contingent punishment as a double-edged sword: A dual-pathway model from a sense-making perspective. Personnel Psychology, 2014, 67, 951-980.
292. Graham Brown, Craig Crossley and Sandra L. Robinson. Psychological ownership, territorial behaviour, and being perceived as a team contributor: The critical role of trust in the work environment. Personnel Psychology, 2014, 67, 463-485.
293. T. Brad Harris, Ning Li, Wendy R. Boswell, Xin-An Zhang and Zhitao Xie. Getting what’s new from newcomers: Empowering leadership, creativity, and adjustment in the socialization context. Personnel Psychology, 2014, 67, 567-604.
294. Dong Liu, Morela Hernandez and Lei Wang. The role of leadership and trust in creating structural patterns of team procedural justice: A social network investigation. Personnel Psychology, 2014, 67, 801-845.
295. Jean M. Phillips, Stanley M. Gully, John E. McCarthy, William G. Castellano and Mee Sook Kim. Recruiting global travellers: The role of global travel recruitment messages and individual differences in perceived fit, attraction, and job pursuit intentions. Personnel Psychology, 2014, 67, 153-201.
296. Belle Rose Ragins, Karen S. Lyness, Larry J. Williams and Doan Winkel. Life spillovers: The spillover of fear of home foreclosure to the workplace. Personnel Psychology, 2014, 67, 763-800.
297. B. Sebastian Reiche, Pablo Cardona, Yin-teen Lee, Miguel Angel Canela, Esther Akinnukawe, et al. Why do managers engage in trustworthy behaviour? A multilevel cross-cultural study in 18 countries. Personnel Psychology, 2014, 67, 61-98.
298. Hong Ren, Margaret A. Shaffer, David A. Harrison, Carmen Fu and Katherine M. Fodchuk. Reactive adjustment or proactive embedding? Multistudy, multiwave evidence for dual pathways to expatriate retention. Personnel Psychology, 2014, 67, 203-239.
299. Ruodan Shao and Daniel P. Skarlicki. Service employees’ reactions to mistreatment by customers: A comparison between North America and East Asia. Personnel Psychology, 2014, 67, 23-59.
300. Jerel E. Slaughter, Michael S. Christian, Nathan P. Podsakoff, Evan F. Sinar and Filip Lievens. On the limitations of using situational judgement tests to measure interpersonal skills: The moderating influence of employee anger. Personnel Psychology, 2014, 67, 847-885.
301. Jeffrey R. Spence, Douglas J. Brown, Lisa M. Keeping and Huiwen Lian. Helpful today, but not tomorrow? Feeling grateful as a predictor of daily organizational citizenship behaviors. Personnel Psychology, 2014, 67, 705-738.

17 REFERENCES

Abraham, R. (1999). Emotional intelligence in organisations: A conceptualization. Genetic, Social and General Psychology Monographs, 125, 209–227.

Akaike, H. (1973). Information theory and an extension of the maximum likelihood principle. In B. N. Petrov & F. Csaki (Eds.), Second International Symposium on Information Theory (pp. 267-281). Budapest: Akademiai Kiado.

Anderson, T. W., & Rubin, H. (1956). Statistical inference in factor analysis.

Proceedings of the Third Berkeley Symposium on Mathematical Statistics and

Probability (pp. 111-150). Berkeley: University of California Press.

Australian Government, Comcare. (2011). Work ability—Professor Juhani Ilmarinen. Retrieved 1 March, 2013, from http://www.comcare.gov.au/news__and__media/news_listing/work_abilityprofessor_juhani_ilmarinen

Avolio, B. J., Yammarino, F. J., & Bass, B. M. (1991). Identifying common methods variance with data collected from a single source: An unresolved sticky issue.

Journal of Management, 17, 571-586.

Bagozzi, R. P. (1977). Structural equation models in experimental research.

Journal of Marketing Research, 14, 209-226.

Bagozzi, R. P. (1980). Causal modeling in marketing. New York, NY: Wiley & Sons.

Bagozzi, R. P., & Yi, Y. (1990). Assessing method variance in multitrait- multimethod matrices: The case of self-reported affect and perceptions at work. Journal of Applied Psychology, 75(5), 547-560.

Bagozzi, R. P., & Yi, Y. (1991). Multitrait-Multimethod matrices in consumer research. Journal of Consumer Research, 17(4), 426-439.

Barclay, D., Higgins, C., & Thompson, R. (1995). The Partial Least Squares (PLS) approach to causal modeling: Personal computer adoption and use as an illustration. Technology Studies, 2, 285-309.

Becker, J. M., Klein, K., & Wetzels, M. (2012). Hierarchical latent variable models in PLS-SEM: Guidelines for using reflective-formative type models. Long Range Planning, 45(6), 359-394.

Bentler, P. M. (1968). Alpha-maximized factor analysis (Alphamax): Its relation to alpha and canonical factor analysis. Psychometrika, 33, 335-345.

Bentler, P. M. (1972). A lower-bound method for the dimension-free measurement of internal consistency. Social Science Research, 1, 343-357.

Bentler, P. M. (1986). Structural modeling and psychometrika: An historical perspective on growth and achievements. Psychometrika, 51(1), 35-51

Bentler, P. M. (2006). EQS 6 structural equations program manual. Encino, CA: Multivariate Software (www.mvsoft.com).

Bentler, P. M. (2007). Covariance structure models for maximal reliability of unit-weighted composites. In S. –Y. Lee (Ed.), Handbook of latent variable and related models (pp. 1-19). Amsterdam: North-Holland.

Bentler, P. M. (2009). Alpha, dimension-free, and model-based internal consistency reliability. Psychometrika, 74, 137-143.

Bentler, P. M. (2014). Covariate-free and Covariate-dependent Reliability. The

79th Annual Meeting of the Psychometric Society. Madison, Wisconsin. July 22-25.

Blakeley, J. A. & Ribeiro, V. E. S. (2008). Early retirement among registered nurses: Contributing factors. Journal of Nursing Management, 16, 29-37.

Blalock, H. M. (1971). Causal models in the social sciences. Chicago: Aldine-

Atherton.

Bock, R. D., and Aitkin, M. (1981). Marginal maximum likelihood estimation of item parameters: Application of an EM algorithm. Psychometrika, 46, 443-459.

Bollen, K. A. (1984). Multiple indicators: Internal consistency or no necessary relationship? Quality and Quantity, 18, 377-385.

Bollen, K. A. (1989). Structural Equations with Latent Variables, New York:

John Wiley & Sons.

Bollen, K., & Lennox, R. (1991). Conventional wisdom on measurement: A structural equation perspective. Psychological Bulletin, 110, 305-314.

Boumans, N. P. G., de Jong, A. H. J., & Vanderlinden, L. (2008). Determinants of early retirement intentions among Belgian nurses. Journal of Advanced Nursing,

63,(1), 64-74.

Brooke, E., Goodall, J., & Mawren, D. (2010). Retaining older workforces in aged care work. Paper presented at the 4th International Symposium on Work Ability:

Age Management during the Life Course, pp. 187-197, Tampere, Finland.

Browne, M. W. (1968). A comparison of factor analytic techniques.

Psychometrika, 33, 267-334.

Buck, R., Varnava, A., Wynne-Jones, G., Phillips. C., Farewell, D., Porteous, C.,

Webb, K., Button, L., Cooper, L., & Main, C. (2008). Health and well-being in work in

Merthyr Tydfill: A biopsychosocial approach. Well-being in Work Stage 2: Final Report to the Wales Centre for Health and Welsh Assembly Government.

www.wellbeinginwork.org

Burnham, K. P., & Anderson, D. R. (2004). Multimodel inference: Understanding AIC and BIC in model selection. Sociological Methods & Research, 33(2), 261-304.

BWA Centre for Research. (2007a). The redesigning work for an ageing society project: Fact Sheet 1. Retrieved 2 March, 2013, from http://www.swinburne.edu.au/business/business-work- ageing/documents/ARC_FactSheet1_16Mar07.pdf

BWA Centre for Research. (2007b). What is work ability?: Fact Sheet 2.

Retrieved 2 March, 2013, from http://www.swinburne.edu.au/business/business-work- ageing/documents/ARC_FactSheet2_10Sep07.pdf

Byrne, B. M. (2006). Structural equation modeling with EQS: Basic concepts, applications, and programming. New Jersey: Lawrence Erlbaum Associates.

Campbell, D. T., & Fiske, D. W. (1959). Convergent and discriminant validation by the multitrait-multimethod matrix. Psychological Bulletin, 56(2), 81-105.

Carroll, R. J., Ruppert, D., Stefanski, L. A. and Crainiceanu, C. M. (2006).

Measurement Error in Nonlinear Models: A Modern Perspective, 2nd ed. Chapman and

Hall/CRC Press: Boca Raton.

Cattell, R. B. (1978). The scientific use of factor analysis in behavioural and life sciences. New York: Plenum.

Cheung, G. W., & Rensvold, R. B. (2002). Evaluating goodness-of-fit indexes for testing measurement invariance. Structural Equation Modeling: A Multidisciplinary Journal, 9(2), 233-255.

Cheung, M. W. (2008). A Model for Integrating Fixed-, Random-, and Mixed-

Effects Meta-Analyses Into Structural Equation Modeling. Psychological Methods,

13(3), 182–202.

Chin, W. W., & Newsted, P. R. (1999). Structural equation modeling analysis with small samples using partial least squares. In Hoyle, R. (Ed.), Statistical strategies for small samples research (pp. 307–341). Thousand Oaks, CA: Sage.

Chin, W. W. (1998). The partial least squares approach to structural equation modeling. In G. A. Marcoulides (Ed.), Modern methods for business research (pp. 295-358). Mahwah, NJ: Erlbaum.

Chin, W. W., Marcolin, B. L., & Newsted, P. R. (2003). A partial least squares latent variable modeling approach for measuring interaction effects: Results from a Monte Carlo simulation study and an electronic-mail emotion/adoption study. Information Systems Research, 14(2), 189-217.

Chin, W. W. (2010). How to write up and report PLS analyses. In V. Esposito Vinzi et al. (Eds.), Handbook of partial least squares (pp. 655-690).

Ciarrochi, J., Chan, A. Y. C., & Caputi, P. (2000). A critical evaluation of the emotional intelligence concept. Personality and Individual Differences, 28, 1477-1490.

Copertano, A., Bevilacqua, G., Barbaresi, M., Barchiesi, F., & Copertano, B.

(2010). Work-related stress: Risk assessment in the local regional health service unit of

Ancona [La valutazione dello stress lavoro-correlato nell’azienda sanitaria di Ancona].

Giornale Italiano di Medicina del Lavoro ed Ergonomia, 29(4), 128-129.

Cortina, J. M. (1993). What is coefficient alpha? An examination of theory and applications. Journal of Applied Psychology, 78, 98-104.

Costner, H. L. (1971). Utilizing causal models to discover flaws in experiments.

Sociometry, 34, 398-410.

Cox, T., Thirlaway, M., Gotts, G., & Cox, S. (1983). The nature and assessment of general wellbeing. Journal of Psychosomatic Research, 27, 353-359.

Cox, T. (1997). Workplace health promotion. Work & Stress, 11, 1-5.

Crocker, L., & Algina, J. (1986). Introduction to classical and modern test theory. New York: Holt, Rinehart, and Winston.

Cronbach, L. J. (1951). Coefficient alpha and the internal structure of tests.

Psychometrika, 16, 297-334.

Cronbach, L. J., & Meehl, P. E. (1955). Construct validity in psychological tests. Psychological Bulletin, 52, 281-302.

Crossley, C. D., Bennett, R. J., Jex, S. M., & Burnfield, J. L. (2007). Development of a global measure of job embeddedness and integration into a traditional model of voluntary turnover. Journal of Applied Psychology, 92(4), 1031-1042.

Cunningham, E. (2008). A practical guide to structural equation modeling using AMOS. Melbourne: Statsline.

Daws, J. (2012). Finnish history of work ability research and age management. Retrieved 6 March, 2013, from http://www.ngssuper.com.au/assets/Images/Supermembers/NGS-SA-2011-12-WinnerJimDaws-1102-1012.pdf

D’Errico, A., Viotti, S., Baratti, A., Mottura, B., Barocelli, A.P., Tagna, M.,

Sgambelluri, B., Battaglino, P., & Converso, D. (2013). Low back pain and associated presenteeism among hospital nursing staff. Journal of Occupational Health, 55, 276-

283.

de Zwart, B. C., Frings-Dresen, M. H., & van Duivenbooden, J. C. (2002). Test– retest reliability of the Work Ability Index questionnaire. Occupational Medicine, 52(4),

177-181.

DeCarlo, L. T. (1997). On the meaning and use of kurtosis. Psychological

Methods, 2, 292-307.

DeMars, C. (2010). Item Response Theory. New York: Oxford University Press.

Diamantopoulos, A, & Winklhofer H. (2001). Index Construction with Formative

Indicators: An Alternative to Scale Development. Journal of Marketing Research, 38

(2), 269-277.

Diamantopoulos, A. (2010). Reflective and Formative Metrics of Relationship

Value: Response to Baxter’s Commentary Essay, Journal of Business Research, 63(1),

91-93.

Diamantopoulos, A., Riefler, P., and Roth, K. P. (2008). Advancing Formative

Measurement Models, Journal of Business Research, 61(12), pp. 1203-1218.

Donaldson, S. I., & Grant-Vallone, E. J. (2002). Understanding self-report bias in organizational behavior research. Journal of Business and Psychology, 17(2), 245-260.

Doty, D. H., & Glick, W. H. (1998). Common method bias: Does common methods variance really bias results? Organizational Research Methods, 1(4), 374-406.

Edwards, J. R., & Bagozzi, R. P. (2000). On the nature and direction of relationships between constructs and measures. Psychological Methods, 5(2), 155-174.

Efron, B., & Tibshirani, R. J. (1993). An introduction to the bootstrap.

Monographs on Statistics and Applied Probability, no. 57. New York, NY: Chapman and Hall.

Eskelinen, L., Kohvakka, A., Merisalo, T., Hurri, H., & Wagar, G. (1991).

Relationship between the self-assessment and clinical assessment of health status and work ability. Scand J Work Environ Health, 17(Suppl 1), 40-47.

European Network for Workplace Health Promotion (ENWHP) & National Work Ability Index (WAI) Network. (2012). Work Ability Index - Europe. Retrieved 27 Feb, 2013, from http://www.thcu.ca/workplace/sat/pubs/tool_159.pdf

Faragher, E. B., Cooper CL, & Cartwright S. (2004). A shortened stress evaluation tool (ASSET). Stress and Health, 20, 189-201.

Finnish Institute of Occupational Health. (2011). Multidimensional work ability model. Helsinki, Finland.

Fochsen, G., Josephson, M., Hagberg, M., Toomingas, A., & Lagerström, M.

(2006). Predictors of leaving nursing care: a longitudinal study among Swedish nursing personnel. Occupational and Environmental Medicine, 63(3), 198-201. doi:10.1136/oem.2005.021956

Fornell, C., & Larcker, D. F. (1981). Evaluating structural equation models with unobservable variables and measurement error. Journal of Marketing Research, 18, 39–

50.

Fornell, C., and Bookstein, F. L. (1982). A Comparative Analysis of Two

Structural Equation Models: LISREL and PLS Applied to Market Data, in C. Fornell

(ed.), A Second Generation of Multivariate Analysis, New York: Praeger, 289-324.

Frisch, R. (1934). Statistical confluence analysis by means of complete regression systems. Oslo: Oslo University.

Frisch, R. and Waugh, F. (1933). Partial Time Regressions as Compared with

Individual Trends. Econometrica, 1 (4), 387-401.

Ganster, D. C., Hennessey, H. W., & Luthans, F. (1983). Social desirability response effects: Three alternative models. The Academy of Management Journal 26(2),

321-331.

Geladi, P. (1988). Notes on the history and nature of partial least squares (PLS) modeling. Journal of Chemometrics, 2, 231-246.

Gerbing, D. W., & Anderson, J. C. (1988). An updated paradigm for scale development incorporating unidimensionality and its assessment. Journal of Marketing

Research, 25, 186-192.

Gignac, G. E. (2007). Multifactor modeling in individual differences research:

Some recommendations and suggestions. Personality and Individual Differences, 42, 37-

48.

Gignac, G. E. (2008). Higher-order models versus direct hierarchical modes: g as superordinate or breadth factor? Psychology Science, 50, 21-43.

Gignac, G. E. (2013). Modeling the Balanced Inventory of Desirable

Responding: Evidence in favour of a revised model of socially desirable responding.

Journal of Personality Assessment, 95, 645-656.

Gignac, G. E., & Watkins, M. W. (2013). Bifactor modeling and the estimation of model-based reliability in the WAIS-IV. Multivariate Behavioral Research, 48, 639-662.

Gilbreath, B., & Frew, E. J. (2008). The stress-related presenteeism scale.

Colorado State University - Pueblo, Hasan School of Business, Colorado State

University – Pueblo, Pueblo, CO.

Glick, W. H., Jenkins, G. D., Jr., & Gupta, N. (1986). Method versus substance:

How strong are underlying relationships between job characteristics and attitudinal outcomes? Academy of Management Journal, 29(3), 441-464.

Goldberger, A. S. (1971). Econometrics and psychometrics: A survey of communalities. Psychometrika, 36, 83-107.

Goldberger, A. S., & Duncan, O. D. (Eds.). (1973). Structural equation models in the social sciences. New York: Seminar Press.

Gould, R., Ilmarinen, J., Järvisalo, J., & Koskinen, S. (Eds.). (2008). Dimensions of Work Ability: Results of the Health 2000 Survey. Helsinki, Finland.

Grandey, A. A. (2000). Emotion regulation in the workplace: a new way to conceptualize emotional labor. Journal of Occupational Health Psychology 5, 95-110.

Green, S. B., & Hershberger, S. L. (2000). Correlated errors in true score models and their effect on coefficient alpha. Structural Equation Modeling, 7, 251-270.

Green, S. B., & Yang, Y. (2009a). Commentary on coefficient alpha: A cautionary tale. Psychometrika, 74, 121-135.

Green, S. B., & Yang, Y. (2009b). Reliability of summed item scores using structural equation modeling: An alternative to coefficient alpha. Psychometrika, 74,

155-167.

Griffiths, A., Cox, T., Karanika, M., Khan, S., & Tomas, J.M. (2006). Work design and management in the manufacturing sector: development and validation of the work organisation assessment questionnaire. Occupational and Environmental Medicine,

63, 669-675.

Guo, K.H., Yuan, Y., Archer, N.P., & Connelly, C.E. (2011). Understanding nonmalicious security violations in the workplace: A composite behavior model. Journal of Management Information Systems, 28(2), 203-236.

Gustafsson, J. E., & Balke, G. (1993). General and specific abilities as predictors of school achievement. Multivariate Behavioral Research, 28, 407–434.

Guttman, L. (1952). Multiple group methods for common factor analysis: Their basis, computation, and interpretation. Psychometrika, 17, 209-222.

Guttman, L. (1945). A basis for analyzing test-retest reliability.

Psychometrika, 10, 255-282.

Haavelmo, T. (1943). The Statistical Implications of a System of Simultaneous

Equations. Econometrica, 11, 1-12.

Hair, J. F., Hult, G. T. M., Ringle, C. M., & Sarstedt, M. (2014). A primer on partial least squares structural equation modeling (PLS-SEM). Thousand Oaks, CA: SAGE Publications.

Hair, J.F., Black, W.C., Babin, B.J., & Anderson, R.E. (2010). Multivariate Data

Analysis, seventh ed. Prentice Hall, Englewood Cliffs.

Hair, J. F., Ringle, C. M., & Sarstedt, M. (2011). PLS-SEM: Indeed a silver bullet. Journal of Marketing Theory and Practice, 19(2), 139-151.

Hasselhorn, H-M., Muller, B.H., & Tackenberg, P. (2005, July). NEXT Scientific

Report. Retrieved 22 September 2015 from http://www.econbiz.de/archiv1/2008/53602_nurses_work_europe.pdf

Hauser, R. M., and Goldberger, A. S. (1971). The Treatment of Unobservable

Variables in Path Analysis. Chapter 4 in Sociological Methodology, edited by H.L.

Costner. San Francisco: Jossey-Bass.

Health and Safety Executive Guidelines (2010). Retrieved from: http://www.hse.gov.uk/guidance/ on 12/10/2011.

Heise, D. R., & Bohrnstedt, G. W. (1970). Validity, invalidity, and reliability. In

E. F. Borgatta (Ed.), Sociological methodology 1970 (pp. 104-129). San Francisco:

Jossey-Bass.

Heise, D. R. (1972). Employing nominal variables, induced variables, and block variables in path analysis. Sociological Methods & Research, 1, 147-173.

Hendry, D. F., & Morgan, M. (1989). A re-analysis of confluence analysis. Oxford Economic Papers, 41, 35-52.

Holtom, B. C., Mitchell, T. R., & Lee, T. W. (2006). Increasing human and social capital by applying job embeddedness theory. Organizational Dynamics, 35(4),

316–331.

Holzinger, K. J. (1941). Factor Analysis. Chicago: University of Chicago Press.

Holzinger, K. J., & Swineford, F. (1937). The bi-factor method. Psychometrika,

2, 41-54.

Hotelling, H. (1933). Analysis of a Complex of Statistical Variables into

Principal Components. Journal of Educational Psychology, 24, 498-520.

Howell, R.D., Breivik, E., & Wilcox, J.B. (2007). Reconsidering formative measurement. Psychological Methods, 12, 205–218.

Hoyt, C. (1941). Test Reliability Estimated by Analysis of Variance,

Psychometrika, 6, 153-160.

Hu, L., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation

Modeling, 6, 1-55.

Ilmarinen, J. (2003). Work Ability Index: a tool for occupational health research and practise. Paper presented at the 11th Annual EUPHA meeting, Rome, Italy.

Ilmarinen, J. (2007). The Work Ability Index (WAI). Occupational Medicine,

57, 160.

Ilmarinen, J. (2009). Work ability: a comprehensive concept for occupational health research and prevention; Editorial. Scand J Work Environ Health, 35(1), 1-5.

Ilmarinen, J. (2010). 30 years’ work ability and 20 years’ age management.

Paper presented at the 4th International Symposium on Work Ability: Age Management during the Life Course, pp. 12-22, Tampere, Finland.

Ilmarinen, J., & Tuomi, K. (2004). Past, present and future of work ability.

Helsinki, Finland: Finnish Institute of Occupational Health.

Ilmarinen, J., Tuomi, K., & Klockars, M. (1997). Changes in the work ability of active employees as measured by the work ability index over an 11-year period. Scand J

Work Environ Health 23(Suppl 1), 49-57.

Ilmarinen, J., Tuomi, K., & Seitsamo, J. (June 2005). New dimensions of work ability. International Congress Series, 1280, 3-7.

Irwin, J. O. (1935). On the indeterminacy in the estimate of g. British Journal of

Psychology, 25, 393-394.

Jackson, P., & Agunwamba, C. (1977). Lower bounds for the reliability of the total score on a test composed of non-homogeneous items: I: Algebraic lower bounds.

Psychometrika, 42(4), 567-578.

Jarvis, C. B., MacKenzie, S. B., and Podsakoff, P. M. (2003). A Critical Review of Construct Indicators and Measurement Model Misspecification in Marketing and

Consumer Research, Journal of Consumer Research 30 (2), 199-218

Jennrich, R. I. & Sampson, P.F. (1966). Rotation for simple loadings.

Psychometrika, 31, 313-323.

Jennrich, R. I., & Clarkson, D. B. (1980). A feasible method for standard errors of estimate in maximum likelihood factor analysis. Psychometrika, 45, 237-247.

Johnson, S., Cooper, C., Cartwright, S., Donald, I., Taylor, P., & Millet, C. (2005). The experience of work-related stress across occupations. Journal of Managerial Psychology, 20(2), 178-187.

Jöreskog, K. G. (1969). A general approach to confirmatory maximum likelihood factor analysis. Psychometrika, 34, 183-202.

Jöreskog, K. G. (1971). Simultaneous factor analysis in several populations.

Psychometrika, 36, 409–426.

Jöreskog, K. G. (1973). A General Method for Estimating a Linear Structural

Equation System. In Structural Equation Models in the Social Sciences. Edited by A.

Goldberger and O.D. Duncan. (pp. 85-112), New York: Academic Press.

Jöreskog, K. G., and Sörbom, D. (2001). LISREL 8 User's Reference Guide.

Chicago: Scientific Software International.

Kaiser, H. F. (1958). The Varimax Criterion for Analytic Rotation in Factor

Analysis. Psychometrika, 23, 187-200.

Karimi, L., & Bentler, P. M. (under review). Application of covariate-free and covariate-dependent reliability.

Karimi, L., & Meyer, D. (2014). Validity and model-based reliability of the Work Organisation Assessment Questionnaire (WOAQ) among nurses. Nursing Outlook.

Klein, L., & Goldberger, A. S. (1955). An econometric model of the United States 1929-1952. Amsterdam: North-Holland.

Klein, L. (1950). Economic Fluctuations in the United States 1921-1941. New

York: John Wiley.

Kline, R. B. (1998). Principles and Practice of Structural Equation Modeling.

New York: The Guilford Press.

Koopmans, T. (1945). Statistical estimation of simultaneous economic relations. Journal of the American Statistical Association, 40, 448-466.

Kuder, G. F., & Richardson, M. W. (1937). The theory of estimation of test reliability. Psychometrika, 2, 151-160.

LaMontagne, A. (2004). Improving OHS policy through intervention research. Journal of Occupational Health and Safety, 20(2), 107-113.

Laschinger, H. K. S. (2012). Job and career satisfaction and turnover intentions of newly graduated nurses. Journal of Nursing Management, 20, 472-484.

Law, K. S., Wong, C. S., Mobley, W. H. (1998). Towards a Taxonomy of

Multidimensional Constructs. Academy of Management Review, 23 (4), 741-755.

Lawley, D. N. (1940). The Estimation of Factor Loadings by the method of

Maximum Likelihood. Proceedings of the Royal Society of Edinburgh, 60, 64-82.

Lindell, M. K., & Whitney, D. J. (2001). Accounting for common method variance in cross-sectional research designs. Journal of Applied Psychology, 86, 114-

121.

Ilmarinen, J. (1991). The aging worker. Editorial. Scand J Work Environ Health, 17(Suppl 1), 141 p.

Long, J. S. (1983). Confirmatory Factor Analysis, Beverly Hills, CA: Sage

Publications.

Lord, F. M., & Novick, M. R. (1968). Statistical theories of mental test scores.

Reading, MA: Addison-Wesley.

MacCallum, R. C., & Browne, M.W. (1993). The use of causal indicators in covariance structure models: Some practical issues. Psychological Bulletin, 114, 533-

541.

MacCallum, R. C., Browne, M. W., & Sugawara, H. M. (1996). Power analysis and determination of sample size for covariance structure modeling. Psychological

Methods, 1, 130-149.

MacKenzie, S. B. Podsakoff, P. M. & Podsakoff, N. P. (2011): Construct

Measurement and Validation Procedures in MIS and Behavioral Research, Integrating

New and Existing Techniques. MIS Quarterly, 35 (2), 293–334.

MacKenzie, S. B., Podsakoff, P. M., and Jarvis, C. B. (2005). The Problem of

Measurement Model Misspecification in Behavioral and Organizational Research and

Some Recommended Solutions, Journal of Applied Psychology, 90 (4), 710-730.

Magnavita, N., Mammi, F., Roccia, K., & Vincenti, F. (2007). WOA: un questionario per la valutazione dell'organizzazione del lavoro. Traduzione e validazione della versione italiana [WOA: a questionnaire for the evaluation of work organization. Translation and validation of the Italian version]. Giornale Italiano di Medicina del Lavoro ed Ergonomia, 29, 663-665.

Malhotra, N. K., Kim, S. S., & Patil, A. (2006). Common method variance in IS research: A comparison of alternative approaches and a reanalysis of past research.

Management Science, 52(12), 1865-1883.

Mann, H. B., and Wald, A. (1943). On the Statistical Treatment of Linear

Stochastic Difference Equations. Econometrica 11, 173-220.

Marsh, H. W., Hau, K. T., & Grayson, D. (2005). Goodness of fit in structural equation models. In A. Maydeu-Olivares & J. J. McArdle (Eds.), Contemporary psychometrics: A Festschrift for Roderick P. McDonald (pp.225-340). Mahwah, NJ:

Erlbaum.

Martus, P., Jakob, O., Rose, U., Seibt, R., & Freude, G. (2010). A comparative analysis of the Work Ability Index. Occupational Medicine, 60(7), 517-524.

Matsueda R. L. (2012). Key Advances In The History Of Structural Equation

Modeling. Handbook of Structural Equation Modeling. Edited by R. Hoyle. New York,

NY: Guilford Press

McDonald, R. P. (1970). The theoretical foundations of principal factor analysis, canonical factor analysis, and alpha factor analysis. British Journal of Mathematical and

Statistical Psychology, 23, 1-21.

McDonald, R. P. (1999). Test theory: A unified treatment. Mahwah, NJ: Erlbaum.

Meade, A. W., Watson, A. M., & Kroustalis, C. M. (2007). Assessing common methods bias in organisational research. Paper presented at the 22nd Annual Meeting of the Society for Industrial and Organizational Psychology, New York.

Messick, S. (1995). Validity of psychological assessment: Validation of inferences from persons’ responses and performances as scientific inquiry into score meaning. American Psychologist, 50, 741-749.

Miller, M. (1995). Coefficient alpha: A basic introduction from the perspectives of classical Test Theory and structural equation modeling. Structural Equation

Modeling, 2, 255-273.

Mislevy, R. J. (1984). Estimating latent distributions. Psychometrika, 49, 359-

381.

Morales, M. G. (2011). Partial Least Squares (PLS) Methods: Origins, Evolution, and Application to Social Sciences. Communications in Statistics Theory and Methods,

40 (13), 2305-2317.

Morschhäuser, M., & Sochert, R. (2006). Healthy Work in an Ageing Europe:

Strategies and Instruments for Prolonging Working Life. Essen, Germany.

Mosier, C. I. (1939). Determining a simple structure when loadings for certain tests are known. Psychometrika, 4, 149-162.

Muthén, B. (1984). A General Structural Equation Model with Dichotomous,

Ordered Categorical, and Continuous Latent Variable Indicators. Psychometrika, 49,

115-132.

Muthén, B. (1994). Multi-Level Covariance Structure Analysis. Sociological

Methods and Research, 22, 376-398.

Muthén, B., & Muthén, L. K. (2004). Mplus user's guide. Los Angeles, CA: Muthén & Muthén.

Nelson, C. R. (1972). The Prediction Performance of the FRB-MIT-PENN

Model of the U.S. Economy. American Economic Review, 62, 902-917.

Nunnally, J. C. (1978). Psychometric Theory (2nd ed.), McGraw-Hill, New

York.

Nunnally, J. C., and Bernstein, I. H. (1994). Psychometric Theory (3rd ed.), New

York: McGraw Hill.

Oakman, J., & Wells, Y. (2009). Can organizations influence employees’ intentions to retire? Paper presented at the 3rd International Symposium on Work

Ability: Promotion of Work Ability Towards Productive Aging, pp. 133 -138, Hanoi,

Vietnam.

Palermo, J. (2010). Investigating modifiable organizational factors relating to workability: a focus on gendered culture. Paper presented at the 4th International

Symposium on Work Ability: Age Management during the Life Course, pp. 365-377,

Tampere, Finland.

Palermo, J., Webber, L., Smith, K., & Khor, A. (2009). Factors that predict work ability: Incorporating a measure of organizational values towards ageing. Paper presented at the 3rd International Symposium on Work Ability: Promotion of Work

Ability Towards Productive Aging, pp. 45 -58, Hanoi, Vietnam.

Pearl, J. (2000). Causality: Models, Reasoning, and Inference. Cambridge, UK:

Cambridge University Press.

Pearson, K. (1901). On lines and planes of closest fit to systems of points in space. Philosophical Magazine, 6, 559-572.

Pensola, T., Järvikoski, A., & Järvisalo, J. (2008). Unemployment and Work

Ability. In Gould, R., Ilmarinen, J., Järvisalo, J., & Koskinen, S. (Eds.) Dimensions of

Work Ability: Results of the Health 2000 Survey. Helsinki, Finland.

Petrides, K. V., & Furnham, A. (2000). On the dimensional structure of emotional intelligence. Personality and Individual Differences, 29, 313-320.

Petter, S., Straub, D., and Rai, A. (2007). Specifying Formative Constructs in

Information Systems Research, MIS Quarterly, 31 (4), 623-656.

Podsakoff, N. P., Shen, W., and Podsakoff, P. M. (2006). The Role of Formative

Measurement Models in Strategic Management Research: Review, Critique, and

Implications for Future Research, Research Methodology in Strategy and Management

(3), 197-252.

Podsakoff, P. M., & Organ, D. W. (1986). Self-reports in organizational research:

Problems and prospects. Journal of Management, 12, 531-544.

Podsakoff, P. M., MacKenzie, S. B., Lee, J.-Y., & Podsakoff, N. P. (2003).

Common method biases in behavioral research: A critical review of the literature and recommended remedies. Journal of Applied Psychology, 88(5), 879-903.

Rabe-Hesketh, S., Skrondal, A., & Pickles, A. (2004). Generalized Multilevel

Structural Equation Modeling. Psychometrika 69, 167-90.

Radkiewicz, P., Widerszal-Bazyl, M., & the NEXT-Study group. (2005).

Psychometric properties of Work Ability Index in the light of comparative survey study.

International Congress Series, 1280, 304–309.

Raftery, A. E. (1995). Bayesian model selection in social research. Sociological Methodology, 25, 111-163.

Raykov, T., & Marcoulides, G. A. (2011). Procedures for estimating reliability. In Introduction to psychometric theory (pp. 160-196). Abingdon, Oxon: Routledge.

Reise, S. P., Moore, T. M., & Haviland, M. G. (2010). Bifactor models and rotations: Exploring the extent to which multidimensional data yield univocal scale scores. Journal of Personality Assessment, 92(6), 544-559.

Reise, S. P, Bonifay, W. E., & Haviland, M. G. (2012). Scoring and modeling psychological measures in the presence of multidimensionality. Journal of Personality

Assessment, 95, 129-140.

Revelle, W., & Zinbarg, R. E. (2008). Coefficient alpha, beta, omega, and the glb: Comments on Sijtsma. Psychometrika, 74, 145-154.

Richardson, H. A., Simmering, M. J., & Sturman, M. C. (2009). A tale of three perspectives: Examining post hoc statistical techniques for detection and correction of common method variance. Organizational Research Methods, 12, 762-800.

Rick, J., & Briner, R.B. (2000). Psychosocial risk assessment: problems and prospects. Occupational Medicine, 50(5), 310-314.

Rigdon, E. E. (2012). Rethinking partial least squares path modeling: In praise of simple methods. Long Range Planning, 45(5-6), 341-358.

Ringle, C. M., Sarstedt, M., & Straub, D. W. (2012). A critical look at the use of PLS-SEM in MIS Quarterly. MIS Quarterly, 36(1), iii-xiv.

Ringle, C.M., Wende, S., & Will, S. (2005). SmartPLS 2.0 (M3) Beta,

Hamburg http://www.smartpls.de.

Rogers, W. M., Schmitt, N., & Mullins, M. E. (2002). Correction for unreliability of multifactor measures: Comparison of alpha and parallel forms approaches.

Organizational Research Methods, 5, 184-199.

Roldán, J. L. and Sánchez-Franco, M. J. (2012). Variance-Based Structural

Equation Modeling: Guidelines for Using Partial Least Squares in Information Systems

Research. Research Methodologies, Innovations and Philosophies in Software Systems

Engineering and Information Systems. IGI Global, 193-221.

Roy, S. Tarafdar, M., Ragu-Nathan, T.S. & Marsillac, E. (2012). The Effect of

Misspecification of Reflective and Formative Constructs in Operations and

Manufacturing Management Research. Journal of Business Research Methods, 10 (1),

34-52.

Satorra, A., & Bentler, P. M. (1988). Scaling corrections for statistics in covariance structure analysis (UCLA Statistics Series 2). Los Angeles: UCLA,

Department of Psychology.


Satorra, A., & Bentler, P. M. (1994). Corrections to test statistics and standard errors in covariance structure analysis. In A. von Eye & C. C. Clogg (Eds.), Latent variables analysis: Applications for developmental research (pp. 399-419). Thousand Oaks, CA: Sage.

Saunders, J. B., Aasland, O. G., Babor, T. F., de la Fuente, J. R., & Grant, M. (1993). Development of the Alcohol Use Disorders Identification Test (AUDIT): WHO collaborative project on early detection of persons with harmful alcohol consumption. II. Addiction, 88, 791-804.

Schmid, J., & Leiman, J. (1957). The development of hierarchical factor solutions. Psychometrika, 22, 53-61.

Schriesheim, C. A., Kinicki, A. J., & Schriesheim, J. F. (1979). The effect of leniency on leader behavior descriptions. Organizational Behavior and Human Performance, 23, 1-29.

Schumacker, R. E., & Lomax, R. G. (2004). A beginner's guide to structural equation modeling (2nd ed.). Mahwah, NJ: Lawrence Erlbaum Associates.

Schutte, N. S., & Malouff, J. M. (1999). Measuring Emotional Intelligence and Related Constructs. Lewiston, NY: E. Mellen Press.

Schutte, N. S., Malouff, J. M., Hall, L. E., Haggerty, D. J., Cooper, J. T., Golden, C. J., & Dornheim, L. (1998). Development and validation of a measure of emotional intelligence. Personality and Individual Differences, 25, 167-177.

Schwarz, G. (1978). Estimating the dimension of a model. Annals of Statistics, 6, 461-464.


Shipley, B. (2000). Cause and Correlation in Biology: A User's Guide to Path Analysis, Structural Equations and Causal Inference. Cambridge, UK: Cambridge University Press.

Sijtsma, K. (2008). On the use, the misuse, and the very limited usefulness of Cronbach's alpha. Psychometrika, 74, 107-120.

Sijtsma, K. (2009b). Reliability beyond theory and into practice. Psychometrika, 74, 169-173.

Skrondal, A., & Rabe-Hesketh, S. (2004). Generalized Latent Variable Modeling: Multilevel, Longitudinal, and Structural Equation Models. Boca Raton: Chapman and Hall.

Sočan, G. (2000). Assessment of reliability when test items are not essentially τ-equivalent. Developments in Survey Methodology, 15, 23-35.

Sörbom, D. (1974). A general method for studying differences in factor means and factor structures between groups. British Journal of Mathematical and Statistical Psychology, 27, 229-239.

Spearman, C. (1904). General intelligence, objectively determined and measured. American Journal of Psychology, 15, 201-293.

Spector, P. E. (1987). Method variance as an artifact in self-reported affect and perceptions at work: Myth or significant problem? Journal of Applied Psychology, 72(3), 438-443.


Steiger, J. H., & Schönemann, P. H. (1978). A history of factor indeterminacy. In S. Shye (Ed.), Theory construction and data analysis. Chicago: University of Chicago Press.

Stöber, J. (2001). The Social Desirability Scale-17 (SDS-17): Convergent validity, discriminant validity, and relationship with age. European Journal of Psychological Assessment, 17(3), 222-232.

Taylor, P. (2010). Planning for an ageing workforce. Paper presented at the 4th International Symposium on Work Ability: Age Management during the Life Course, pp. 23-33, Tampere, Finland.

Taylor, P. (2008, September). Assessing workability in the workplace. Unpublished presentation, OHSIG, Aotea Centre, Auckland, New Zealand.

Taylor, P., & McLoughlin, C. C. (2011, December). Pilot study on workability. Unpublished presentation, Monash University, Melbourne, Australia.

Tenenhaus, M., Vinzi, V. E., Chatelin, Y.-M., & Lauro, C. (2005). PLS path modeling. Computational Statistics & Data Analysis, 48(1), 159-205.

Thomas, K. W., & Kilmann, R. H. (1975). The social desirability variable in organizational research: An alternative explanation for reported findings. The Academy of Management Journal, 18(4), 741-752.

Thomson, G. H. (1916). A hierarchy without a general factor. British Journal of Psychology, 1904-1920, 8, 271-281.


Thomson, G. H. (1935). The definition and measurement of "g" (general intelligence). Journal of Educational Psychology, 26, 241-262.

Thurstone, L. L. (1935). The Vectors of Mind. Chicago: University of Chicago Press.

Thurstone, L. L. (1947). Multiple Factor Analysis. Chicago: University of Chicago Press.

Treiblmaier, H., Bentler, P. M., & Mair, P. (2011). Formative constructs implemented via common factors. Structural Equation Modeling, 18(1), 1-17.

Tucker, L. R. (1955). The objective definition of simple structure in linear factor analysis. Psychometrika, 20, 209-225.

Tuomi, K. (1997). Eleven-year follow-up of aging workers [Editorial]. Scand J Work Environ Health, 23(Suppl 1), 66-71.

Tuomi, K., Ilmarinen, J., Jahkola, M., Katajarinne, L., & Tulkki, A. (2006). Work Ability Index (2nd rev. ed.). Helsinki: Finnish Institute of Occupational Health.

Tuomi, K., Ilmarinen, J., Klockars, M., Nygård, C.-H., Seitsamo, J., & Huuhtanen, P. (1997). Finnish research project on aging workers in 1981-1992. Scand J Work Environ Health, 23(Suppl 1), 7-11.

Tuomi, K., Ilmarinen, J., Martikainen, R., Aalto, L., & Klockars, M. (1997). Aging, work, life-style and work ability among Finnish municipal workers in 1981-1992. Scand J Work Environ Health, 23(Suppl 1), 58-65.


Tuomi, K., Ilmarinen, J., Seitsamo, J., Huuhtanen, P., Martikainen, R., Nygård, C.-H., & Klockars, M. (1997). Summary of the Finnish research project (1981-1992) to promote the health and work ability of aging workers. Scand J Work Environ Health, 23(Suppl 1), 66-71.

Vacha-Haase, T. (1998). Reliability generalization: Exploring variance in measurement error affecting score reliability across studies. Educational and Psychological Measurement, 58, 6-20.

Vacha-Haase, T., & Thompson, B. (2011). Score reliability: A retrospective look back at 12 years of reliability generalization studies. Measurement and Evaluation in Counseling and Development, 44, 159-168.

Van der Heijden, B. I. J. M., Van Dam, K., & Hasselhorn, H. M. (2009). Intent to leave nursing: The importance of interpersonal work context, work-home interference, and job satisfaction beyond the effect of occupational commitment. Career Development International, 14(7), 616-635.

Vandenberg, R. J., & Lance, C. E. (2000). A review and synthesis of the measurement invariance literature: Suggestions, practices, and recommendations for organizational research. Organizational Research Methods, 3(1), 4-70.

Warr, P., Cook, J., & Wall, T. (1979). Scales for the measurement of some work attitudes and aspects of psychological well-being. Journal of Occupational Psychology, 52(2), 129-148.


Webber, L., Smith, K., & Scott, K. (2006). Age, work ability and plans to leave work. Paper presented at the Joint Conference of the Australian Psychological Society and the New Zealand Psychological Society, pp. 479-483, Auckland, New Zealand.

Werts, C. E., Linn, R. L., & Jöreskog, K. G. (1974). Intraclass reliability estimates: Testing structural assumptions. Educational and Psychological Measurement, 34, 25-33.

Wetzels, M., Odekerken-Schröder, G., & van Oppen, C. (2009). Using PLS path modeling for assessing hierarchical construct models: Guidelines and empirical illustration. MIS Quarterly, 33(1), 177-195.

Widaman, K. F., & Reise, S. P. (1997). Exploring the measurement invariance of psychological instruments: Applications in the substance use domain. In K. J. Bryant, M. Windle, & S. G. West (Eds.), The science of prevention: Methodological advances from alcohol and substance abuse research (pp. 281-324). Washington, DC: American Psychological Association.

Williams, L. J., Cote, J. A., & Buckley, M. R. (1989). Lack of method variance in self-reported affect and perceptions at work: Reality or artifact? Journal of Applied Psychology, 74(3), 462-468.

Williams, L. J., Hartman, N., & Cavazotte, F. (2010). Method variance and marker variables: A review and comprehensive CFA marker technique. Organizational Research Methods, 13(3), 477-514.


Wilson, E. B. (1928a). Review of 'The abilities of man, their nature and measurement' by C. Spearman. Science, 67, 244-248.

Wilson, E. B. (1928b). On hierarchical correlation systems. Proceedings of the National Academy of Sciences, 14, 283-291.

Wilson, E. B. (1929). Review of 'Crossroads in the mind of man: A study of differentiable mental abilities' by T. L. Kelley. Journal of General Psychology, 2, 153-169.

Wilson, E. B., & Worcester, J. (1939). A note on factor analysis. Psychometrika, 4, 133-148.

Winkler, J. D., Kanouse, D. E., & Ware, J. E., Jr. (1982). Controlling for acquiescence response set in scale development. Journal of Applied Psychology, 67(5), 555-561.

Wold, H. (1979). Model construction and evaluation when theoretical knowledge is scarce: Theory and application of partial least squares. In J. Kmenta & J. B. Ramsey (Eds.), Evaluation of econometric models (pp. 47-74). New York: Academic.

Wold, H. (1982). Soft modeling: The basic design and some extensions. In K. G. Jöreskog & H. Wold (Eds.), Systems Under Indirect Observation: Part II (pp. 1-54). Amsterdam: North-Holland.

Wolfle, D. (1940). Factor analysis to 1940. Psychometric Monographs, No. 3.


Woodhouse, B., & Jackson, P. (1977). Lower bounds for the reliability of the total score on a test composed of non-homogeneous items: II: A search procedure to locate the greatest lower bound. Psychometrika, 42(4), 579-591.

Wright, S. (1920). The relative importance of heredity and environment in determining the piebald pattern of guinea-pigs. Proceedings of the National Academy of Sciences, 6, 320-332.

Wynne-Jones, G., Buck, R., Varnava, C. J., & Main, C. (2011). Impacts on work performance: What matters 6 months on? Occupational Medicine, 61, 205-208.

Wynne-Jones, G., Varnava, A., Buck, R., Karanika-Murray, M., Griffiths, A., Phillips, C., & Main, C. J. (2009). Examination of the work organisation assessment questionnaire in public sector workers. Journal of Occupational & Environmental Medicine, 51(5), 586-593.

Yang, Y., & Green, S. B. (2011). Coefficient alpha: A reliability coefficient for the 21st century? Journal of Psychoeducational Assessment, 29, 377-392.

Yung, Y. F., Thissen, D., & McLeod, L. D. (1999). On the relationship between the higher-order factor model and the hierarchical factor model. Psychometrika, 64, 113-128.

Zeller, R. A., & Carmines, E. G. (1980). Measurement in the Social Sciences: The Link Between Theory and Data. Cambridge: Cambridge University Press.


Zellner, A., & Theil, H. (1962). Three-stage least squares: Simultaneous estimation of simultaneous equations. Econometrica, 30, 54-78.

Zellner, A. (1962). An efficient method of estimating seemingly unrelated regressions and tests of aggregation bias. Journal of the American Statistical Association, 57, 348-368.

Zimmerman, D. W. (1972). Test reliability and the Kuder-Richardson formulas: Derivation from probability theory. Educational and Psychological Measurement, 32, 939-954.

Zinbarg, R. E., Revelle, W., Yovel, I., & Li, W. (2005). Cronbach's α, Revelle's β, and McDonald's ωh: Their relations with each other and two alternative conceptualizations of reliability. Psychometrika, 70, 123-133.
