
California State University, San Bernardino CSUSB ScholarWorks

Theses Digitization Project John M. Pfau Library

1987

A meta-analysis of styles of supervision: A reexamination of the Hawthorne findings

Ryan Mark Cherland

Follow this and additional works at: https://scholarworks.lib.csusb.edu/etd-project

Part of the Human Resources Management Commons

Recommended Citation Cherland, Ryan Mark, "A meta-analysis of styles of supervision: A reexamination of the Hawthorne findings" (1987). Theses Digitization Project. 390. https://scholarworks.lib.csusb.edu/etd-project/390

This Thesis is brought to you for free and open access by the John M. Pfau Library at CSUSB ScholarWorks. It has been accepted for inclusion in Theses Digitization Project by an authorized administrator of CSUSB ScholarWorks. For more information, please contact [email protected].

A META-ANALYSIS OF STYLES OF SUPERVISION:

A REEXAMINATION OF THE HAWTHORNE FINDINGS

A Thesis

Presented to the

Faculty of

California State University,

San Bernardino

In Partial Fulfillment

of the Requirements for the Degree

Master of Arts

in

Psychology

by

Ryan Mark Cherland

June 1987

A META-ANALYSIS OF STYLES OF SUPERVISION:

A REEXAMINATION OF THE HAWTHORNE FINDINGS

A Thesis

Presented to the

Faculty of

California State University,

San Bernardino

by

Ryan Mark Cherland

June 1987

Approved by:

Dr. Janet Kottke, Chair    Date

Dr. Robert Cramer, Psychology

Dr. James Rogers, Management

ABSTRACT

This thesis examined the relationship of styles of supervision based on the Hawthorne findings with productivity levels or supervisor effectiveness as the independent criterion. Meta-analytic procedures were applied to 20 studies after correcting for sampling error and measurement error. Findings suggested that there was a positive relationship between Hawthorne styles of supervision and the independent criterion. Moderator variables of the Hawthorne styles of supervision included the type of independent criterion used, the job site of the study, and whether the independent criterion was a subjective or objective measure.

ACKNOWLEDGEMENTS

I would like to thank my committee members, Dr. Janet Kottke, Dr. Robert Cramer, and Dr. James Rogers. Their help and consideration throughout this thesis were invaluable, and I am sincerely grateful. I would also like to thank Patricia Bartell for her understanding, as well as her reading and editing of my thesis.

TABLE OF CONTENTS

Title Page ...... i

Signature Page...... ii

Abstract...... iii

Acknowledgements ...... iv

Table of Contents...... v

List of Tables ...... vii

Introduction ...... 1

The Hawthorne Experiments ...... 2

Conclusions made about the Hawthorne Experiments....14

Criticisms of the Hawthorne Studies Conclusions.....17

Reanalyses of the Original Hawthorne Data...... 21

Meta-Analysis...... 24

Different Quantitative Review Methods...... 26

Criticisms of Meta-Analysis...... 31

Methods of Meta-Analysis...... 36

Criteria to evaluate/conduct a Meta-Analysis...... 40

Method

Selection of Studies...... 43

Coding of Study Characteristics. .46

Calculation of Effect Sizes...... 47

Cumulation of Effect Sizes ...... 51

Results ...... 54

Discussion

The Hawthorne Findings 68

The Meta-Analytic Method... 72

Summary 74

Appendices

A. Earlier Criticisms of Hawthorne Research 75

B. Reference List of Examined Articles 77

C. Transformation of epsilon to an effect size 92

Combining multiple effect sizes in a study 92

Cumulation of effect sizes 93

Test for heterogeneity of variance 95

Post hoc contrast test 96

D. BASIC Program 97

References. 101

LIST OF TABLES

Table 1. Studies used in Meta-analysis. ..48

Table 2. Results of analysis of all the studies examining Hawthorne styles of supervision... .56

Table 3. Results of analyses Using production as the independent criterion and those using supervisor effectiveness as the independent criterion. .57

Table 4. Results of analyses of studies from manufacturing sites, white collar service sites, and blue collar service sites ....59

Table 5. Results of analyses of studies conducted in the 1950s and those conducted after the 1950s.. 60

Table 6. Results of analyses of studies with an average work group size less than or equal to 10 and those with an average work group size greater than 10...... 61

Table 7. Results of analyses of studies with sample sizes less than 100 and those with sample sizes greater than 100 ...... 63

Table 8. Results of analyses of studies with objective independent criteria and those with subjective independent criteria...... 64

Table 9. Results of analyses of studies without a combined effect size and those with a combined effect size...... 66

INTRODUCTION

The first documented scientific examination of industrial relations took place at Western Electric

Company's Hawthorne Works in Chicago. These first studies

have proved to be of great importance to management theory

and the treatment of workers, and have undoubtedly formed a

valuable contribution to the science and art of human

management. The research resulting from the experiments

done at the Hawthorne Works became known as the Hawthorne

studies. These studies concluded that output was affected

by employee morale, worker solidarity, subtle social

control processes, and employee attitudes and feelings

(Roethlisberger and Dickson, 1939; Wardwell, 1979). These

conclusions, and the evidence upon which they are based,

have come under criticism on theoretical and methodological

grounds (Carey, 1967; Parsons, 1974; Pitcher, 1981; Franke

& Kaul, 1978; and Schlaifer, 1980).

The lack of any definitive conclusions, and the

continuing controversy over the Hawthorne studies led to

the questioning of many of the revolutionary conclusions

developed by the original Hawthorne researchers. These

conclusions changed industrial management from an idea

based on the scientific management theories developed by Taylor (Pearson, 1945) to principles of management based on human emotional and motivational factors. Korman (1971, p.

7) states, "It is from these studies that we can date the

'human relations' influence on U.S. management and some of the newer theories of effective leadership." Studies which attempted to clarify the Hawthorne research have been unable to do so. Differing theories abound as to why the

production of the workers increased, with some supporting the original conclusions and others vigorously opposing the ideas of the Hawthorne researchers. One possible way to find out if the conclusions reached by the original

Hawthorne researchers were accurate is to conduct a meta-analysis on those studies that have examined the effect of 'friendly supervision' or democratic leadership styles on worker output.

The Hawthorne Experiments

The Western Electric Company, in cooperation with the

National Research Council of the National Academy of

Sciences, planned in 1924 to examine the relation of quality and quantity of illumination to efficiency in industry (Roethlisberger and Dickson, 1939). The results of these studies were so surprising and unexpected that it

was decided that further research was needed. The illumination studies and the research they inspired became

known as the Hawthorne studies. A brief description and review of each study is important for the reader's orientation of the Hawthorne studies.

Illumination Studies

The illumination experiments were comprised of three experiments. These experiments started in 1924 and lasted for a period of two and one-half years. The first illumination study was conducted in three departments. The first department inspected small piece parts, the second department assembled relays, and the third department wound coils. A baseline measure of production rates was taken under normal lighting conditions for each department. The illumination intensities were then increased to specified levels and new production rates were recorded. The findings showed that no clear relationship existed between production rates and illumination levels. It was decided that an additional study was needed to control or eliminate

"the various additional factors which affect production output in either the same or opposing directions to that which we can ascribe to illumination" (Snow, C.E., cited in

Roethlisberger and Dickson, 1939, p. 15).

The second illumination study was designed to prevent the inconclusiveness of the first through the use of specific control conditions. Only one department was used, one which wound small induction coils on wooden spools. The workers were divided into two groups of equal number, equal experience, and equal average output. The test group and the control group were placed in separate buildings to prevent competition, and illumination intensities were once again increased to specific levels.

The results of the study found "appreciable production increases in both groups and of almost identical magnitude.

The difference in efficiency of the two groups was so small as to be less than the probable error of the value" (Snow,

C.E., cited in Roethlisberger and Dickson, 1939, p. 16).

The third illumination experiment was developed to prevent any natural light from illuminating the work area as it did in the first two. The test group and control group were used as outlined in the second illumination study. With only artificial illumination, the test group was provided with light intensity levels of ten to three foot-candles in steps decreasing one foot-candle at a time.

The control group was provided with a constant illumination level of ten foot candles. When the level of illumination decreased the production rates of both the test and control groups increased slowly. Only when the level reached three foot candles did the workers complain and production decrease.

An additional informal study was conducted with only two women workers who were both willing and capable operators. They were given at times, illumination as low as 0.06 foot candles (illumination of an ordinary moonlit night). No decrease in productivity occurred even at this level. Roethlisberger and Dickson (1939) concluded:

Although the results from these experiments on illumination fell short of the expectations of the company in the sense that they failed to answer the specific question of the relation between illumination and efficiency, nevertheless they provided a great stimulus for more research in the field of human relations. They contributed to the steadily growing realization that more knowledge concerning problems involving human factors was essential, (p. 18)

First Relay Assembly Study

Pennock (1930) reported that light was only a minor factor in worker output and that, "It was this discovery which suggested to us the use of the experimental method in determining the various factors governing employee effectiveness (p. 298)." It was decided that a small group of workers should be used instead of entire departments to have more control over the variables affecting the workers' output, as well as the use of experimental controls, which were absent in the illumination studies (Pennock, 1930).

Roethlisberger and Dickson (1939) describe the

reasoning for a small group of workers as follows:

In a small group it would be possible to keep certain variables roughly constant; experimental conditions could be imposed with less chance of having them disrupted by departmental routines. It would also be easier to observe and record the changes which took place both without and within the individual. And lastly, in a small group there was the possibility of establishing a feeling of mutual confidence between investigators and operators, so that the reactions of the operators would not be distorted by general mistrust. (pp. 19-20)

The researchers picked a job which was mechanized and repetitive because it was felt that industry was heading toward this type of labor. In addition, it was important that the task be the same for all workers, and that the output be such that a large statistical population could be obtained for each worker. It was decided that the assembly of telephone relays, which was performed by women workers, fulfilled the requirements best. It should be noted, however, that the workers often worked on different styles of relays. To standardize output, a conversion factor was developed to make the output amounts equivalent

(Roethlisberger and Dickson, 1939, pp. 26-27).

In selecting the workers it was concluded that to avoid the element of learning in the experiment only women who were thoroughly experienced in relay assembly should be chosen. Also it was felt that the workers should be willing and cooperative because the researchers wanted the workers' genuine reactions, not false spurts of production due to the experiment, nor reductions of output because of suspicion of the management's intentions. Therefore, "the method adopted for selecting such a group was to invite two experienced operators who were known to be friendly with each other to participate in the test and ask them to choose the remaining members of the group" (Roethlisberger and Dickson, 1939/ p. 21).

The group consisted of six women, five who assembled relays, and one who acted as layout supervisor (a position which consisted of minor supervision, assigning work, and obtaining parts for each assembler). In addition, there was the test room observer, a man whose duties were to keep accurate records of all that happened and to "create and maintain a friendly atmosphere in the test room"

(Roethlisberger and Dickson, 1939, p. 22).

The hypotheses being examined by this first study were described by Pennock (1930) as follows; (1) Do employees actually get tired out? (2) Are rest pauses desirable?

(3) Is a shorter working day desirable? (4) What is the attitude of the employees toward their work and toward the

Company? (5) What is the effect of changing the type of working equipment? (6) Why does production fall off in the afternoon?

The First Relay experiment (Landsberger, 1958), as the initial study became known, consisted of thirteen periods.

Period I of the experiment consisted of recording the weekly production of each worker for the two weeks before the transfer to the test room. This baseline production was used to examine the effect of any future experimental changes. Period II lasted for five weeks and consisted of no changes except the transfer of the women to the assembly test room. This allowed measurement of the effect of the transfer. Period III changed the way the women were paid.

Instead of using the departmental group incentive based on

100 workers, the test workers were made into a separate work group for the purpose of computing piecework earnings.

This allowed each worker to receive a pay which was more closely related to her own productivity. This period lasted eight weeks.

Period IV started the actual experimental investigation. Two rest pauses were introduced in the work day. The rest periods lasted for five minutes—one in

midmorning, and one in midafternoon. Period V increased the length of the rest periods from five to ten minutes.

Period VI examined the effect of six five-minute rest periods. Midmorning and midafternoon snacks were provided in Period VII during those break periods.

During Period VII a personnel problem with two of the operators arose. The workers were described as having a problem "which involved a lack of attention to work and a preference for conversing together for considerable periods of time" (Roethlisberger and Dickson, 1939, p. 53). It was

decided that for the best interests of the experiment the

two women should be replaced by two new workers. The

workers were transferred in the first week of Period VIII. One of the new workers had been in the test room before as a substitute, while the other was placed there on the recommendation of her supervisors.

Period VIII (Landsberger, 1958) examined changes in the length of the working day, while keeping the rest breaks of

Period VII. Instead of working until 5 p.m., the workers were allowed to quit at 4:30 p.m. Period IX ended work at

4 p.m. Period X was exactly the same as Period VII in all details. Period XI dropped the Saturday work day. Period XII's work schedule went back to that of Period I through

Period III's work schedule. Period XIII then changed the work schedule back to that of Period VII and X.

What was found in all these changes was that production increased with an almost unbroken rise, period after period, in both the average hourly and total weekly output.

Pennock (1930, p. 304) stated:

Now this unexpected and continual upward trend in productivity throughout the periods, even in period number 12 when the girls were put on a full 48 hour week with no rest period or lunch, led us to seek some explanation or analysis. Observation and study suggested three possible factors which might contribute to this condition: 1. Relief from cumulative muscular fatigue. 2. Change in the pay incentive. 3. Improved psychological attitude toward the work.

The first hypothesis was rejected because the output of the individual operators showed that the increase in output rates was not dependent on the day of the week nor the time of the day. Physical examination records also showed no signs of cumulative fatigue (Pennock/ 1930).

To examine the effect of the wage incentive factor, two new experiments were developed (Landsberger, 1958). These experiments, known as the Second Relay Assembly group and the Mica Splitting Test Room, were run concurrently.

Second Relay Assembly Study

The Second relay assembly group consisted of five experienced assemblers who were formed into a special group to be paid separately from the rest of the department.

These five assemblers were placed adjacent to each other but remained in the regular department. No other changes

were made. The experiment was made up of three periods:

(1) a base period lasting five weeks; (2) the experimental period, which lasted nine months; and (3) a return to the old method of payment which lasted seven weeks. Production increased by thirteen percent during the experimental period.

The researchers had trouble relating the increase to only the wage incentive program, however. The output of some of the workers taken during the baseline period showed an upward trend. It was also known that there existed a

rivalry with the Relay Assembly Test Room (now in its final stages) during the experiment. It was felt by the

researchers that these factors could account for the

increase just as well. Another factor was that the Second Relay Assembly group was regarded with jealousy by the rest of the department, which might have kept production down. (The increase of the Relay Assembly Test

Room was thirty percent.) It was because of this jealousy that the experiment was discontinued to preserve morale among the other workers.

Mica-Splitting Test Room Study

To examine if rest pauses without change in wage incentives would affect worker output, the researchers devised the Mica Splitting Test Room study (Landsberger,

1958). Five experienced workers, who were paid on an individual piecework basis, were placed in a room by

themselves to split mica, which was considered one of the

most desirable jobs in the Hawthorne Works plant. During

this time overtime was being worked in the mica department to meet quotas. The study consisted of five periods.

Period I was the baseline measure in the regular department, which lasted eight weeks. Period II involved the movement of the workers into the test room. This

recording period lasted five weeks. The first experimental

manipulation occurred in Period III when two ten-minute

rest periods were added. This lasted twenty-nine weeks.

Period IV involved no overtime work and two ten-minute

rests, and lasted forty-eight weeks. Period V reduced the

work day from eight and three-fourths hours a day to eight hours a day. The work week was reduced from five and one-half days a week to five days a week, plus the two ten-minute rests. This period lasted for seventeen weeks.

The production output rose as expected until halfway through the fourth period, when it started to decline.

Also, the variability of output was greater for the Mica splitting workers than for the Relay Assembly Test Room workers. The decrease in production was attributed to the worsening economic picture (Period IV occurred during June

1929 to May 1930). The greater variability in output of the workers was due to the lack of group spirit among the mica splitters, according to the researchers (Landsberger,

1958).

Additional Studies

Two additional studies conducted at Hawthorne, but only tangential to the present study, were the Interviewing research and the Bank Wiring study. The Interviewing program at Hawthorne was designed to supply case material for supervisory training courses. It was started when the

Relay Assembly Test Room study was half-way completed. It covered more than 86,000 comments on 80 topics during

10,000 interviews (Landsberger, 1958). Comparisons between men and women workers showed differences in urgency rather than in tone. Men showed a greater interest in matters

which affected their and their families' economic security.

Women showed a greater concern with working conditions such as overtime, fatigue, and social contact.

The Bank Wiring study examined how small working groups evolved production standards to which individuals were forced to adhere. Production records showed that most individuals had "straight-line" output curves. One week's output did not differ from other weeks' outputs. The researchers did not introduce any experimental observations except the fact that the workers were being observed. The researchers found that conformity and nonconformity to the group norms seemed to determine whether or not an operator was accepted by the group (Landsberger, 1958).

To summarize, the Hawthorne studies examined the following areas of interest (Pranke and Kaul, 1978): (1)

The Illumination study which examined the physical work environment. It was composed of three studies and suggested that human factors rather than physical working conditions determined worker satisfaction and performance;

(2) The First Relay study which examined physical work environment, physical work requirements, management and supervision, and social relations of workers. It was the major Hawthorne experiment and concluded that increased worker performance was due to improved human relations, and to a somewhat lesser degree, rest pauses; (3) The Second

Relay study which looked at management and supervision. It suggested only a moderate effect on performance due to small group incentive pay; (4) The Mica Splitting study which examined the effect of physical work requirements and found that performance was only moderately affected by rest and shorter work periods; (5) Interviewing which studied management and supervision, and the social relations of the workers. It supported previous conclusions on the importance of social interactions to performance and found the first indications of employees restricting output due to employee interrelations; and (6) The Bank Wiring study which examined the social relations of workers and showed how employee interrelations in a large group of workers standardized the pace of work.

Conclusions made about the Hawthorne experiments

Some of the conclusions made by the Hawthorne researchers were described by Mayo (1945):

There has been a continual upward trend in output which has been independent of the changes in rest pauses. This upward trend has continued too long to be ascribed to an initial stimulus from the novelty of starting a special study. The reduction of muscular fatigue has not been the primary factor in increasing output. Cumulative fatigue is not present... There has been an important increase in contentment among the girls working under test-room conditions. There has been a decrease in absences of about 80 per cent among the girls since entering the test-room group. Test-room operators have had approximately one-third as many sick absences as

the regular department during the last six months... Output is more directly related to working day than to the number of (working) days in the week... Observations of operators in the relay assembly test room indicate that their health is being maintained or improved and that they are working within their capacity... The changed working conditions have resulted in creating an eagerness on the part of operators to come to work in the morning... Important factors in the production of a better mental attitude and greater enjoyment of work have been the greater freedom, less strict supervision and the opportunity to vary from a fixed pace without reprimand from a gang boss. The operators have no clear idea as to why they are able to produce more in the test room; but as shown in the replies to questionnaires...there is the feeling that better output is in some way related to the distinctly pleasanter, freer, and happier working conditions... (pp. 65-67)

Of these conclusions, Roethlisberger and Dickson

(1939) believed that the following two warranted the highest consideration:

At least two conclusions seemed to be warranted from the test room experiments so far: (1) there was absolutely no evidence in favor of the hypothesis that the continuous increase in output in the Relay Assembly Test Room during the first 2 years could be attributed to the wage incentive factor alone; (2) the efficacy of a wage incentive was so dependent on its relation to other factors that it was impossible to consider it as a thing in itself having an independent effect on the individual. Only in connection with the interpersonal relations at work and the personal situations outside of work, to mention two important variables, could its effect on output be determined, (p. 160)

Other explanations that were developed included the fact that the workers found themselves to be experimental subjects and were under less autocratic supervision, as well as factors such as teamwork, cohesiveness, informal organization, interpersonal relationships, and social unity

(Parsons, 1978). But whatever the explanation, the importance of the Hawthorne studies to management perspectives cannot be denied. As stated by Roethlisberger and Dickson (1939);

Hitherto management had tended to make many assumptions as to what would happen if a change were made in, for example, hours of work or a wage incentive. They now began to question these assumptions and saw that many of them were oversimplified. They began to see that such factors as hours of work and wage incentives were not things in themselves having an independent effect upon employee efficiency; rather, these factors were no more than parts of a total situation, and their effects could not be predicted apart from the total situation, (p. 185)

It was these two conclusions by Roethlisberger and

Dickson which developed into the fundamental writings of industrial and social relations. Books such as Management and the Worker (Roethlisberger and Dickson, 1939),

Management and Morale (Roethlisberger, 1941), The

Industrial Worker (Whitehead, 1938), Leadership in a Free

Society (Whitehead, 1936), The Human Problems of an

Industrial Civilization (Mayo, 1933), and The Social

Problems of an Industrial Civilization (Mayo, 1945)

describe the research conducted at Hawthorne and its implications for management. The influence of these

writings on Industrial/Organizational psychology and

management theory changed the emphasis of management from one based on simple pay incentives to one based on human

relations, styles of leadership, group standards, and other

social factors on work performance—to the exclusion of

almost all other approaches throughout the 1950s and 1960s.

Criticisms of the Hawthorne Studies' Conclusions

Although the conclusions of the Hawthorne studies did

not escape complete critical analysis (Landsberger, 1958),

it was not until the mid-1960s that the Hawthorne studies

were brought back under scrutiny. Some of the more recent

articles which have examined the Hawthorne studies and

their conclusions will be reviewed. Since these articles

review previous research, it is felt that the older studies

would be redundant in many cases. This is by no means an

exhaustive review, but is intended to provide the reader

with the most recent analyses which are representative of

the field. (For a list of the earlier criticisms, the

reader is referred to Appendix A.)

The first relatively recent criticism of the Hawthorne

studies was conducted by Carey (1967) in which he concluded

that the decisions reached by the Hawthorne investigators

were not supported by the evidence. Carey felt that it was

the material rewards of money which influenced work morale and behavior the most. Carey supported his conclusion with the following factors: (1) Apart from a slight increase in output (4-5%) following the introduction of the preferred incentive system, there was no increase in weekly output during the first nine months of the study. (2) When it became apparent that free and friendly supervision was not getting results, discipline was tightened, which led to the dismissal of two of the five women. (3) The dismissed workers were replaced by two women of special motivation and character who immediately led the rest of the group in a sustained increase in output. One of the two women who had a special need for money rapidly undertook a strong disciplinary role in regard to the rest of the group. (4)

Output only showed an increase when the two women with the lowest output were replaced with the two new, motivated workers who accounted for the major part of the group's increase. (5) Only after the two new women arrived and the resulting increase in output occurred did supervision once again turn friendly and relaxed. There is no evidence that output increased because supervision turned friendly.

Carey (1967, p. 416) concludes that "far from supporting the various components of the 'human relations approach,' [the results] are surprisingly consistent with a rather old-world view about the value of monetary

incentives, driving leadership, and discipline."

Carey's criticism is minimized by Shepard's (1971) response in which Shepard contends that the Hawthorne researchers did not try to reduce the effects of financial rewards and overemphasize the effects of friendly supervision. In fact, Shepard shows how Mayo acknowledges that the results could not be due simply to differences in supervision, but resulted from something more, which Mayo described as 'human situations.' Shepard concluded that a primary contribution of the Hawthorne researchers remains their attempt to place financial incentives in a social context.

Bramel and Friend (1981) criticize the conclusions of

Mayo and Roethlisberger as reflecting a capitalistic

managerial view about workers. Bramel and Friend (1981) state:

...these two pioneers were probably important in preserving a view of workers as irrational and unintelligent and of the capitalist factory as nonexploitative and free of class conflict. This view, which is clearly identified with defense of the capitalist mode of production, persists to the present time in discussions of the psychology of industry and particularly in reference to the Hawthorne research, (p. 867)

Bramel and Friend support their viewpoint with a review

of instances of worker resistance at Hawthorne and the way

the experimenters handled the resistance (e.g., the

replacement of two workers during Period VIII). Bramel and

Friend also provide evidence that Mayo and Roethlisberger trivialized and hid any worker discontent through their explanations of the workers' statements and actions as being irrational, emotional, and based on misunderstandings between management and the workers. Bramel and Friend believe that Mayo and Roethlisberger have misstated the facts to provide validity to the capitalistic social order over a Marxist explanation that class bias existed at

Hawthorne.

Bramel and Friend's analysis has been criticized by

many researchers (Toch, 1982; Stagner, 1982; Parsons, 1982;

Feldman, 1982; Locke, 1982; Vogel, 1982; and Sonnenfeld,

1982). All disagree with the Marxist content used by

Bramel and Friend as being inappropriate for an explanation

of what occurred at the Hawthorne factory, and some state

that Bramel and Friend are also guilty of misstating the facts to support their conclusions. Given Bramel and

Friend's ideological analysis and the extensive criticisms, there are two valid points which Bramel and Friend have

made: (1) That there were management pressures to increase production on the experimental workers during the experiment, and (2) Mayo and Roethlisberger glossed over

problems with worker discontent during the experiments,

which could have influenced experimental results.

Another article (Parsons, 1974) which questioned the

conclusions of the Hawthorne researchers argued that the

Hawthorne effect was in actuality operant conditioning.

This conclusion is based on the fact that the workers were notified of their work output. Given this occurrence,

Parsons believed that a combination of information feedback and financial reward caused operant conditioning to occur which was seen in the progressive increases in response rate.

Balling, Weiss, and Steigleder (1985) explain the increase in production at Hawthorne using a neo-Hullian learning theory model, which states that behavior can be motivated by aversive drives such as fear, frustration, altruistic drive, and effectance. The experimental workers were placed in conditions which Balling et al. viewed as being aversive (i.e., constant supervision, constant evaluation, managerial discipline, fear of being laid off, etc.). These aversive conditions motivated the workers to work at higher production levels and therefore account for the "Hawthorne effect." Although this view is similar to that of Mayo and Roethlisberger in that it emphasizes attention, Balling et al. believe that the attention was an aversive motivator while Mayo and Roethlisberger viewed the attention as being a positive motivator.

Reanalyses of the Original Hawthorne Data

The first statistical analysis of the Hawthorne studies

was conducted by Franke and Kaul (1978). The original data of the First Relay experiment were statistically analyzed using time-series econometric techniques. These techniques allow specific examination of even inadvertent experimental changes. Also, through the use of serial correlation, the influence of historical factors could be measured. Franke and Kaul at first performed zero-order correlations for the first relay group—using all the periods, then separating those seven periods prior to the replacement of the two unsatisfactory workers, and then the periods after the replacement. Next, the best multiple regressions were determined using the three production measures (hourly output, weekly output, and repair time), with correction for serial correlation where necessary. Finally, Franke and Kaul forced the group models on the individual workers' data and then determined whether alternate models provided a better explanation of the variance. It was found that experimental variables accounted for over 90% of the variance in quantity and quality of output. The variables of managerial discipline, economic adversity, time set aside for rest, and quality of raw materials explained most of the experimental variance.

Schlaifer (1980) provided an alternative statistical analysis which used models in which the average productivity increased gradually over time instead of in

abrupt jumps as productivity did in the model developed by

Franke and Kaul (1978). Schlaifer's models found discipline to be less of a factor than Franke and Kaul's model. This was attributed to Franke and Kaul's model not including a smooth function of time among the predictor variables for their stepwise regression program. Because this predictor variable was not included, their program

used the discipline variable as a proxy for the smooth function of a time variable. The relationship between discipline and the smooth trend variable was found to be a correlation coefficient of .84 (Schlaifer, 1980).

Pitcher (1981) reported statistical evidence for a learning model. By fitting a standard learning equation to the data for each operator, evidence was found to support the conclusion that the increased output at Hawthorne was due to conditions which motivated increased learning.

These conditions were due to better reinforcement situations such as increased status and new economic incentives, and regular performance feedback.

All these alternative explanations for what happened at

the Hawthorne Works point away from the original conclusions made by the Hawthorne researchers, that friendly supervision had a significant impact on the output

of workers. Because the impact of the conclusions made at

Hawthorne is so important to management theory, and given

the many conflicting reanalyses of the Hawthorne research, it is desirable to examine whether or not additional research on the effect of friendly supervision supports or refutes the original Hawthorne conclusions. To do such an examination one must use the studies which have already been conducted and combine them in some understandable manner. Recently, a quantitative method has been developed in which a large amount of previous research can be analyzed as a single database. This quantitative review method has become known as meta-analysis.

Meta-Analysis

The term meta-analysis was first used by Glass (1976) to describe his method of quantitative review of research literature. This type of review is also called research integration, and quantitative assessment of research domains (Green and Hall, 1984). Meta-analysis is simply one of several methods by which research findings are summarized and reviewed. The difference is that meta-analysis uses numbers and statistical methods for organizing large quantities of data and then extracts information from the data. It allows researchers to generalize without doing "violence to a more useful contingent or interactive conclusion" (Glass,

McGraw, and Smith, 1981, p. 55). Instead of adding to a nearly incomprehensible mass of previous research, one can

examine a research area scientifically, discover what has occurred significantly in the past, locate any connections between the experimental effect and methodological conditions, and then generalize to the appropriate domain and suggest areas of future research. Meta-analysis was developed to provide reviewers with an objective method of review in comparison to traditional methods.

The literature supports the idea of a quantitative review method. Sawyer (1966) found that the statistical modes of both data collection and combination were superior to the clinical methods which relied on subjective judgements. A more comprehensive analysis was performed by

Jackson and his conclusions were cited by Glass et al.

(1981) as follows:

(a) Reviewers frequently fail to examine critically the evidence, methods and conclusions of previous reviews on the same or similar topics. [Although 75 percent of the reviewers cited previous reviews, only 6 percent examined them critically.] (b) Reviewers, often focus their discussion and analysis on only a part of the full set of studies they find, and the subset examined is seldom a representative sample and it is seldom clear how it (the subset) was chosen. [Only 3 percent of the reviewers appeared to have used existing indexes—e.g., ERIC—in their search; only 22 percent selected a fair sample of studies, in the judgement of Jackson's coders; and only 3 percent analyzed the full set of studies found.] (c) Reviewers frequently use crude and misleading representations of the findings of the studies. [About 15 percent of the reviewers classified studies according to whether their findings were 'statistically significant'; ...frequently.

reviewers report test statistics (t, F, etc.) for one or more studies.] (d) Reviewers sometimes fail to recognize that random sampling error can play a part in creating variable findings among studies. (e) Reviewers frequently fail systematically to assess possible relationships between the characteristics of the studies and the study findings. [Fewer than 10 percent of the reviewers studied whether the findings of the research were mediated by characteristics of the persons studied, the study context, the nature of the experimental intervention or the characteristics of the research design.] The lack of systematic examination of these relationships is important because reviewers frequently eliminate studies from consideration because of a priori judgements that their findings are flawed by one or another study characteristic. (f) Reviewers usually report so little about their methods of reviewing that the reader cannot judge the validity of the conclusions. (p. 13)

Different Quantitative Review Methods

One of the first quantitative review methods developed was vote-counting. This method required the reviewer to classify the literature as supporting, conflicting, or neither supporting nor conflicting. The category which contained the most "votes," or number of studies, was taken as the explanation for the phenomenon (Light and Smith,

1971). This technique did not take into account either the experiments' sample sizes or their effect sizes, and this reduced its effectiveness. When one does not take effect size into account, the degree to which the examined phenomenon is present in the population of interest is not known nor understood.

The importance of effect size should not be ignored.

Cohen (1977, pp. 9-10) described effect size as "the degree to which the phenomenon is present in the population or the degree to which the null hypothesis is false." The larger the effect size, the greater is the manifestation of the experimental effect and, intuitively, the greater the experiment's power. In other words, if there are differences in the experimental population, they will be found. Cohen (1979) states:

The larger the ES [effect size] posited, other things (significance criterion, sample size) being equal, the greater the power of the test. Similarly, the relationship between ES and necessary sample size: the larger the ES posited, other things (significance criterion, desired power) being equal, the smaller the sample size necessary to detect it. (p. 11)

Effect size is calculated using different equations for different statistical tests (ts, Fs, rs, etc.). Glass et al. (1981) state that the most informative and straightforward measure is the mean difference between experimental and control groups divided by the within-group standard deviation: d = (X̄_E - X̄_C) / s_w
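As an illustration only (not drawn from the thesis or its BASIC program appendix), the effect size computation described above can be sketched in Python; the group scores below are hypothetical.

import math

def standardized_mean_difference(experimental, control):
    # d = (mean_E - mean_C) / pooled within-group standard deviation
    n_e, n_c = len(experimental), len(control)
    mean_e = sum(experimental) / n_e
    mean_c = sum(control) / n_c
    ss_e = sum((x - mean_e) ** 2 for x in experimental)
    ss_c = sum((x - mean_c) ** 2 for x in control)
    s_within = math.sqrt((ss_e + ss_c) / (n_e + n_c - 2))
    return (mean_e - mean_c) / s_within

# Hypothetical output scores for an experimental and a control group.
print(round(standardized_mean_difference([52, 55, 58, 61, 64], [48, 50, 53, 55, 57]), 2))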

Whatever the method used, the need for studies to publish effect sizes is emphasized in one instance by Rosenthal

(1978, p. 192), "We owe it to our readers to give...an estimate of the probable size of the effect in terms of a

sigma unit, a correlation coefficient, or some other

estimate." Providing the readers with effect size measurements

allows the readers to make judgements about the

experimental effect and the relationship of the dependent

and independent variable(s). It also allows one experiment

to be compared with similar experiments because the effect

size can be transformed into a common metric, such as r or d, which

allows it to be compared across different statistical

tests. Getting back to quantitative reviews, another method

involves the cumulation of significance levels across

studies. The probabilities obtained from two or more studies are combined by using methods which include adding logs, adding probabilities, adding ts, adding Zs, adding weighted Zs, testing the mean p, testing the mean Z,

counting, and blocking (Rosenthal, 1978). But again, these

methods do not include any way of measuring effect sizes

which hinders their use.
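For illustration, one of the combination methods named above, adding Zs (the Stouffer procedure), can be sketched in Python. This sketch is not part of the thesis, and the one-tailed p values used are hypothetical.

from statistics import NormalDist

def stouffer_combined_p(p_values):
    # Adding Zs: convert each one-tailed p to Z, sum, divide by sqrt(k),
    # and convert the combined Z back to a one-tailed p.
    nd = NormalDist()
    zs = [nd.inv_cdf(1.0 - p) for p in p_values]
    z_combined = sum(zs) / len(zs) ** 0.5
    return 1.0 - nd.cdf(z_combined)

# Three hypothetical one-tailed p values from independent studies.
print(round(stouffer_combined_p([0.06, 0.10, 0.04]), 4))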

Meta-analysis is the most sophisticated literature review method available (Tannenbaum and Jones,

1983). Through the use of a systematic, comprehensive

review of all related literature, both published and

unpublished, the reviewers collect and convert the effect

sizes. The conversion of the effect sizes to a common

metric, such as a correlation coefficient, r, allows for

different experiments, using different statistical

measures, to be compared. The effect sizes are coded, that is, recorded onto a coding sheet which lists other variables of interest as well. The coding sheets are devised by the person conducting the meta-analysis and classify the characteristics of the study into two broad areas, substantive and methodological. Substantive variables of interest are those characteristics of the study that are specific to the problem being studied. In a study which examines a drug treatment, for example, substantive characteristics would include the type of drug, the size of the dose, and the age of the subject, to name a few. Methodological variables of interest are those characteristics of a study which deal with the research methods of the studies (Glass et al., 1981). Examples of methodological characteristics include sample size, test reliability, randomization versus matched versus nonequivalent groups, etc. This coding is standardized and applied to each study. When the studies have been coded, the effect sizes are regressed against the variables of interest. This allows the reviewer to determine which variables contribute to the variance of results across studies. Methods other than regression may be used but, as with any statistical technique, the methods should be fully supported by the literature (Tannenbaum and Jones, 1983).
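The regression step described above can be illustrated with a minimal Python sketch (not from the thesis). The moderator coding and effect sizes below are hypothetical, and a simple ordinary least squares fit stands in for whatever analysis a given meta-analysis would justify from the literature.

def ols_slope_intercept(x, y):
    # Ordinary least squares fit of effect sizes (y) on one coded moderator (x).
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical coding: 0 = subjective criterion, 1 = objective criterion.
moderator = [0, 0, 1, 1, 1, 0]
effect_r = [0.15, 0.22, 0.35, 0.41, 0.30, 0.18]
b, a = ols_slope_intercept(moderator, effect_r)
print(f"intercept = {a:.3f}, slope for moderator = {b:.3f}")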

An example of a meta-analysis is given by Glass et al.

(1981, pp. 26-31). Twelve studies were found to have tested the effects of psychotherapy on asthma. Eleven of the studies involved the use of treatment and control groups; two others used pretest-posttest designs. The effect size was calculated by subtracting the control mean from the treatment mean and dividing the difference by the control group standard deviation. The results of the

meta-analysis showed that the mean effect size was

0.86, which means that the average subject who received psychotherapy was 0.86 standard deviations above the mean of the control groups.

The relationship between the effects of psychotherapy and some of the features of the therapy and the patients was examined. It was found that there was no significant relationship between the type of psychotherapy and the effect sizes. There was a significant relationship between the age of the patient and the mean effect size, with a linear correlation of .40 being found. Hours of therapy was found not to be significant, nor was the number of weeks of post-therapy follow-up. This study showed that psychotherapy, primarily behavioral therapies and hypnotherapy, has a large ameliorating effect on asthma sufferers, with the age of the patient contributing to the effectiveness of the treatment.

Meta-analysis is therefore an effective means of

combining a large number of studies into a single database which allows a reviewer to perform an in-depth analysis on research in a specific area of interest. There are, however, many who feel that there are no valid reasons to support the aggregation of studies into a single database.

In addition, the researcher conducting a meta-analysis is using studies which possibly define variables differently.

The arguments against meta-analysis should therefore be examined.

Criticisms of Meta-Analysis

Arguments against meta-analysis were given by Glass et al. (1981, p. 218) as:

(1) The Apples and Oranges Problem. It is illogical to compare 'different' studies, that is, studies done with different measuring techniques, different types of persons, and the like. (2) Use of Data From 'Poor' Studies. Meta-analysis advocates low standards of quality for research. It accepts uncritically the findings from studies that are poorly designed or are otherwise of low quality. Aggregated conclusions should only be based on the findings of 'good' studies. (3) Bias in Reported Research. Meta-analysis is dependent on the findings that researchers report. Its findings will be biased if, as is surely true, there are systematic differences among the results of research that appear in journals versus books versus theses versus unpublished papers. (4) Lumpy (Nonindependent) Data. Meta-analyses are conducted on large data sets in which multiple results are derived from the same study; this renders the data nonindependent and gives one a mistaken impression of the reliability of the results.

These criticisms are countered by the following arguments: (1) Glass et al. (1981) argue that comparing studies in a meta-analysis is similar to the way in which different subjects are compared in traditional research.

That is, a researcher who criticizes the pooling of the results of one, five, or ten studies should explain why there is nothing objectionable in the pooling of the results from one, five, ten, or one hundred subjects.

Another counter to this criticism is that there is no need to compare studies that are the same because the results should be the same as well, within statistical error. It must be remembered that the formulation of a review is contingent on the nature and scope of the question or hypothesis being examined. If the hypothesis being examined requires a global analysis, one can always later stratify the sample into smaller, more homogeneous groups for a more conceptually refined analysis.

(2) Although meta-analysis uses studies of "poor" quality, it does not advocate poor research design. The methodological strength of each study can be coded and taken into consideration in the analysis. However, Hunter and his colleagues (Hunter et al., 1982) have developed a method which has the added advantage of recognizing and correcting for some of the artifactual and methodological problems affecting the results of the studies to be

combined.

A Schmidt and Hunter type of meta-analysis is based on the idea that much of the variation in results across samples or studies is due to statistical artifacts and methodological problems rather than to truly substantive differences in underlying population correlations. There are three types of error variance which can be corrected by the literature reviewer: (a) sampling error due to differences in sample size, (b) measurement error due to imperfect instruments, and (c) range variation, which occurs when the independent variable varies more or less in the population being studied than in the reference population. One other type of error is reporting error.

This includes incorrect computations, typographical errors, and the like, but this type of error is uncorrectable without examining the original data for each study.
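A bare-bones sketch, in Python, of the first two corrections described above (sampling error and measurement error); this is an illustration under assumed values, not the specific procedure used later in this thesis.

def bare_bones_meta_analysis(rs, ns):
    # Sample-size-weighted mean r and observed variance, with the variance
    # expected from sampling error alone removed (Hunter et al., 1982).
    total_n = sum(ns)
    k = len(rs)
    mean_r = sum(n * r for r, n in zip(rs, ns)) / total_n
    var_obs = sum(n * (r - mean_r) ** 2 for r, n in zip(rs, ns)) / total_n
    var_sampling = ((1 - mean_r ** 2) ** 2) / (total_n / k - 1)
    return mean_r, var_obs, max(var_obs - var_sampling, 0.0)

def correct_for_attenuation(r, rxx, ryy):
    # Correct an observed correlation for unreliability in both measures.
    return r / (rxx * ryy) ** 0.5

# Hypothetical observed correlations, sample sizes, and reliabilities.
rs = [0.20, 0.31, 0.12, 0.26, 0.18]
ns = [60, 120, 45, 200, 80]
mean_r, var_obs, var_residual = bare_bones_meta_analysis(rs, ns)
print(mean_r, var_obs, var_residual)
print(correct_for_attenuation(mean_r, rxx=0.80, ryy=0.70))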

(3) Although research results do differ according to the medium in which a study is published, this cannot be considered a cogent criticism of meta-analysis, which was able to demonstrate the existence of such biases.

Selective publication can be dealt with using meta-analytic procedures by collecting all of the research and analyzing it separately by mode of publication. In addition, when published data sets are compared to large unpublished data sets (U.S. Government studies), they are very similar in

terms of means and variance (Schmidt, Hunter, Pearlman, and Hirsh, 1985). Rosenthal's (1979) fail-safe N calculations to determine the number of unlocated studies averaging a mean Z of .00 that would be required to change the conclusions of a quantitative research review show that the required number of missing studies is usually so large as to have little possibility of existing (e.g., 200 to 10,000+). Unpublished studies are not usually the result of nonsignificant results but rather methodological weaknesses

(Schmidt and Hunter, 1977). Hunter et al. (1982, p. 30) state, "It seems likely that most of the difference between the average effect size of the published and unpublished studies is due to differences in the methodological quality. If attenuation effects were properly corrected for, differences might disappear."
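Rosenthal's (1979) fail-safe N referred to above can be sketched as follows; the Z values below are hypothetical and the sketch is illustrative only.

def fail_safe_n(z_values, z_crit=1.645):
    # Number of unlocated studies averaging Z = 0 that would be needed to
    # raise the combined one-tailed p above .05 (Rosenthal, 1979).
    k = len(z_values)
    return (sum(z_values) ** 2) / (z_crit ** 2) - k

# Hypothetical Z values from the located studies.
print(round(fail_safe_n([2.1, 1.8, 2.5, 1.2, 2.9]), 1))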

(4) When multiple results are derived from the same study, this causes the data to become nonindependent. This criticism is acknowledged as being quite cogent.

Nonindependence reduces the reliability of estimation of averages or of regression equations. One solution to the problem would be to average all findings within a study up to the level of the study and proceed with a meta-analysis with "studies" as the unit of analysis, instead of the number of effects. In some cases, however, this technique will hide many important questions from analysis. An

alternative solution is to utilize analytic procedures that take the problem of nonindependence into account, but the cost is a more complex analysis and greater conceptual distance from the original constructs of interest. A further alternative is to specify a priori a particular measurement instrument or type of instrument and to select only that type when multiple measures are used. For example, one would select a particular measure which is common in a particular domain of research or a particular type of measure common to the research area. Only data from these measures would be used even if several other types were reported in the study.

The most conservative approach is to record the effect size for each measure and then pool the estimates of

multiple effect sizes in each study in the initial analysis

(Rosenthal, 1984). A recent method to combine the multiple effect sizes of a study has been developed by Rosenthal and

Rubin (1986). This method incorporates the degrees of freedom and the intercorrelation among the dependent

variables, which provides a more accurate and useful summary effect size.
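As a simple illustration of the pooling step described above, the sketch below (not from the thesis) averages multiple effect sizes within each study so that the study becomes the unit of analysis. It does not implement Rosenthal and Rubin's (1986) composite, which additionally uses the intercorrelation among the dependent variables; the study labels and values are hypothetical.

def pool_within_study(effect_sizes_by_study):
    # Simple within-study average of multiple effect sizes.
    return {study: sum(es) / len(es) for study, es in effect_sizes_by_study.items()}

# Hypothetical studies reporting one to three effect sizes each.
print(pool_within_study({"Study A": [0.30, 0.42],
                         "Study B": [0.18],
                         "Study C": [0.25, 0.33, 0.29]}))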

Tannenbaum and Jones (1983, p. 13) feel that "none of these reasons seem severe enough to prevent meta-analysis from becoming a more prevalent research tool." By

examining experiments in an objective, scientific manner,

the researcher can make statements supported by the aggregation of data. This is preferable to statements which describe previous research only in general terms yet draw specific and definitive conclusions from a general survey of the literature.

Meta-analysis not only offers a way to cumulate the results of many studies but it also allows an examination of those methodological factors of experiments which affect the results, something that no other method of literature review offers.

Methods of Meta-Analysis

Although all methods of meta-analysis share some common attributes, such as standardization, comprehensiveness, documentation, and quantification, there are differences in the two main approaches, one which was developed by Glass, and the other by Hunter.

Glass, who first used the term meta-analysis to describe his quantitative technique for research review, emphasizes a need for the computation of effect sizes when integrating studies. This allows the researcher to examine how study characteristics affect summary findings. It also provides an estimate of the overall mean and variance of the effect sizes (Glass et al., 1981). Glass' approach encourages a complete search through existing sources for studies.

Coding is then conducted on all factors which might

influence the experimental effect. Interrater reliability is calculated for the coders to ensure uniformity in the procedure. The experimental effect sizes for each study are then converted into a common metric. Methods for these conversions are listed by both Glass and Hunter (Glass et al., 1981; Hunter, Schmidt, and Jackson, 1982). Briefly, however, some of the conversions are listed below:

r = √[ t² / (t² + df_error) ]

r = √[ F(1, df_error) / (F(1, df_error) + df_error) ]

r = √[ χ²(1) / N ]

Schmidt and Hunter developed their meta-analytic techniques as an extension of their work in the area of validity generalization procedures for employment tests.

Hunter et al. (1982) follow the same procedures as Glass except for the inclusion of procedures to correct for artifactual variance before the coding of the moderator

variables. These potential sources of artifactual variance include: sampling error, differences across studies in the reliabilities of predictors and criterion measures, and differences between studies in the degree of range restriction. Other sources of artifactual variance exist, but estimates of these sources are considered unobtainable. The corrections are made to remove as much error variance as possible from the observed findings. When the researcher has corrected for sources of artifactual variance, he/she then selects those moderator variables which have logical, statistical, or psychometric justifications for inclusion in the analysis. Hunter et al. have criticized Glass' approach of coding from 50 to 100 study characteristics as capitalizing on chance factors because of the large number of characteristics coded. However, Hunter et al. do concede that if the estimated variance of effect sizes across studies is substantially greater than zero after corrections for artifacts have been made, then Glass' approach would be a supplemental step to the Hunter et al. procedures.
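As a concrete illustration of the Hunter et al. procedure, the sketch below (Python, for illustration only; the thesis' own calculations were carried out in the BASIC program listed in Appendix D) computes the sample-size-weighted mean correlation, the observed variance, the variance expected from sampling error alone, and a simple correction of the mean for unreliability in the measures. It is a bare-bones sketch assuming the reliabilities are known; the full artifact-distribution procedures in Hunter et al. (1982) are more elaborate.

    import math

    def bare_bones(rs, ns):
        # Sample-size-weighted mean correlation and variance decomposition.
        total_n = sum(ns)
        r_bar = sum(n * r for r, n in zip(rs, ns)) / total_n
        var_observed = sum(n * (r - r_bar) ** 2 for r, n in zip(rs, ns)) / total_n
        # Commonly used approximation for the variance expected from sampling error alone.
        var_sampling = (1 - r_bar ** 2) ** 2 * len(rs) / total_n
        var_unexplained = var_observed - var_sampling
        return r_bar, var_observed, var_sampling, var_unexplained

    def disattenuate(r_bar, rel_x, rel_y):
        # Corrects the mean correlation for measurement error in both variables.
        return r_bar / math.sqrt(rel_x * rel_y)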

These two meta-analysis procedures approach literature

reviews from different philosophical perspectives according

to Mathieu and Tannenbaum (1983):

One might regard the Hunter approach as a confirmatory procedure aimed at specifying "what is known", while the Glassian method seems closer to exploratory techniques which permit unexpected relationships to surface (p. 7).

These differences in approach should not limit the meta-analyst to only one way of analysis, and Mathieu and Tannenbaum (1983) recommend a combination of the two approaches which allows for an a priori specification of the primary variables as well as the listing of secondary or exploratory variables. A meta-analysis done in this way will result in confirmation or denial of existing data, as well as point to areas of future interest.

Statistical models have been developed for effect size analysis (Strube, Gardner, & Hartmann, 1985). A meta-analysis of effect sizes assumes that the data are derived from a population of studies which can be accurately described by a statistical model. It is assumed that the meta-analyst wishes to treat the studies under review as a sample of observations concerning the true effectiveness of a treatment. The evidence provided by the sample is then used to estimate the true values of the parameters characteristic of the population. Hedges (1981, 1982a, 1982b, 1982c, 1983) has distinguished two general types of effect size model, a fixed effects model and a random effects model. The difference between the two is analogous to that made in the analysis of variance. Strube et al.

(1985) have summarized the models in the following manner:

In the fixed effects model, the studies can be viewed as random samples from a population characterized by a single, fixed effect size. Under the assumption of valid measurement without error (a rather untenable assumption in practice) sample effect sizes will deviate from this fixed population value as a function of sampling error. Furthermore, it is possible for more than one fixed population value to exist and be estimated by the sample. In this latter instance, variability in sample estimates reflects an additional component (i.e., that there are two or more parent populations). For example, diverse treatments may be represented by discrete population effect sizes. An alternative random effects model proposes that the population effect size is randomly distributed with its own mean and variance, rather than having a fixed value. Thus variability in sample effect sizes reflects not only sampling error, but variability in the parent population as well (i.e., the sample values are estimating different population values)(pp. 71-72).
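Stated compactly (a restatement of the distinction, not a formula taken from Strube et al.), the two models can be written as

    fixed effects:   r_i = rho + e_i,     with a single population value rho and sampling error e_i
    random effects:  r_i = rho_i + e_i,   with rho_i varying across studies, having mean mu and variance tau^2

so that the variance of the sample effect sizes reflects only sampling error under the fixed model, but sampling error plus the between-population variance tau^2 under the random model.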

Criteria to evaluate/conduct a Meta-Analysis

To perform an evaluation of the literature as involved as a meta-analysis, one must follow some guidelines and suggestions which prevent common mistakes. In an attempt to replicate a meta-analysis, Bullock and Svyantek (1985) identified some potential problems they encountered. These problems included the lack of a publicly available list of the studies used, lack of adequate coding documentation and decision rules, a narrow domain of generalization (i.e., a meta-analysis which used only studies found in one specific journal), inadequate coding of study characteristics, selective reporting of results, and overinterpretation of the results.

Therefore, one should be able to conduct a meta-analysis which is informative and helps describe a research area by holding the coding of study characteristics to a minimum, to prevent overinterpretation and capitalization on chance factors, and by correcting for sampling error and measurement error. By holding to these criteria, it is felt that a meta-analysis of studies which have researched the relationship between "friendly supervision" and production or some other measure of supervisor effectiveness will help in clarifying the conclusions of the Hawthorne researchers regarding friendly supervision. The "friendly supervision" hypothesis put forth by the Hawthorne researchers underlies much of management theory and deserves a thorough review, which can only be provided by a quantitative literature review.

This meta-analysis will examine only studies which were conducted on supervisors in the workplace. This group of studies has settings which are the most similar to the original Hawthorne environment. Since the purpose of the study is either to support or to dispute the original Hawthorne conclusions, this study will not examine any research which was conducted in a laboratory setting or which examined leadership styles as opposed to supervisory styles. Although this limits the size of the database, it is felt that, given the purpose of this study, the limitations are appropriate.

METHOD

Selection of Studies

An attempt was made to locate, summarize, and analyze

the results of all published studies reporting the effects of different styles of supervision — in particular, those

judged to be based on human relations methods — on

production levels or some measure of supervisor

effectiveness. Some of the possible characteristics of a

human relations manager are discussed by Gordon (1958):

(1) He permits all members to discuss policy formation. Encourages the group to make necessary decisions. (2) He permits discussions of future as well as present activity. Does not try to keep members in the dark about future plans. (3) He permits members to define their own job situation as much as possible. For example, the defining of the way to accomplish tasks and the division of tasks is left up to the group. (4) He focuses on obtaining objective facts on human problems. Tries to base any necessary praise or discipline upon these objective facts and not upon his personal needs, (p. 420)

A limitation on the sample of studies to be selected was that they be conducted in an actual work setting rather than as a laboratory experiment. This limitation was used because the Hawthorne findings were based in a work setting, and this meta-analysis is an attempt to examine the strength of the Hawthorne findings using results from similar environments. Another requirement was that the studies had to have usable statistical analyses from which to calculate an effect size. A usable statistic was considered any measure which describes the relationship between a variable X and a variable Y. Just about any test statistic can be converted into a usable effect size measure. Even studies which list only a significance level can have an effect size approximated given the size of the sample. Most studies which were not included in the final sample were discarded because of their failure to meet the setting or general topic requirement rather than the statistical requirement.
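For studies reporting only a significance level, the approximation works by converting the reported p value to a standard normal deviate and dividing by the square root of the sample size, in the manner of Rosenthal (1984). A minimal sketch (Python, for illustration; the example numbers are hypothetical):

    import math
    from statistics import NormalDist

    def r_from_p(p_two_tailed, n):
        # Convert a two-tailed p value to a standard normal deviate Z,
        # then approximate the effect size as r = Z / sqrt(N).
        z = NormalDist().inv_cdf(1 - p_two_tailed / 2)
        return z / math.sqrt(n)

    # A hypothetical study of 120 workers reporting only "p < .05"
    # would yield an approximate r of about .18.
    approx_r = r_from_p(0.05, 120)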

A literature search was performed manually on Psychological Abstracts through the years 1945 to 1969. It was felt that since the Hawthorne findings were not made public until 1939 (Management and the Worker, Roethlisberger and Dickson, 1939), and with the advent of World War II, few, if any, studies would be conducted and published before 1945 (it was found that the earliest usable study was published in 1952, giving some credence to this sampling criterion). The manual literature search ended with 1969 journal articles because the computer literature search that was conducted covered the years after 1969.

The computer literature search used the DIALOG Information Retrieval Service on the PSYCINFO, ABI/INFORM, MANAGEMENT CONTENTS, and DISSERTATION ABSTRACTS ONLINE databases. PSYCINFO covers the world's literature in psychology and related disciplines in the behavioral sciences from 1967 to the present, and scans over 900 periodicals and 1500 books, technical reports, and monographs each year. ABI/INFORM covers the literature from 1971 to the present and examines more than 500 publications in business and management. MANAGEMENT CONTENTS covers the years 1974 to the present and specializes in current information from approximately 500 journals, proceedings, and transactions involving the areas of accounting, finance, industrial relations, and organizational behavior. DISSERTATION ABSTRACTS ONLINE covers virtually every American dissertation accepted at accredited institutions since 1861.

A total of 604 abstract entries dealing with supervision were found through the manual literature search. Of these, 139 were found to deal with an applied work setting and to evaluate supervisory styles in some manner; these were judged to be studies of possible use in the meta-analysis. The computer literature search produced an additional 70 listings, of which 40 were judged to be worthy of further examination. These 179 studies are listed in Appendix B.

An effort was made to locate the desired studies to examine whether or not they could be included in the meta-analysis. Twenty-four studies were dropped because they were dissertations and it was deemed financially unfeasible to try to obtain them. Upon further examination of the remaining studies it was found that only 20 articles were set in a work setting, examined human relations styles of management, and contained the necessary statistics for calculating an effect size.

Coding of Study Characteristics

The collected articles were coded for certain characteristics which were believed to have possible

moderating effects on the findings. These characteristics can be classified into two broad areas, substantive and

methodological. The substantive characteristics which were coded for were: (1) whether the independent criterion was

productivity of the work group or effectiveness of the supervisor; (2) whether the study was situated in a

manufacturing, blue collar-service, or white collar-service

setting; (3) whether the study was published in the 1950s

or whether it was published after the 1950s; and (4)

whether the average work group in the study had more than

ten workers or whether the work group had ten or fewer workers. The substantive characteristic 1950s vs post-1950s was included because the human relations school of thought became popular during the 1950s, and it was thought that there might be a potential backlash among researchers away from human relations management styles after the 1950s. The substantive characteristic dealing with work group size (also known as "tallness" of an organization) was examined to see if the size of a work group could affect the effectiveness of a "friendly style" of supervision, with the smaller work groups expected to be more positively influenced than the larger work groups.

The methodological characteristics examined were: (1)

whether the sample size was under 100 or whether it was

over 100; (2) whether the independent criterion was a subjective measure or an objective measure; and (3) whether

there were multiple effect sizes in the study or whether

there was only one effect size per study. A listing of the sample studies with their coded characteristics appears in

Table 1.

Calculation of Effect Sizes

An effect size was calculated for each study based on r. The majority of the studies used r, which is why this statistic, and not d (the statistic preferred by Glass), was used. Some of the studies conducted in the 1950s used epsilon as their statistical measure. Since this statistic is relatively rare today, a transformation to an r value was not listed by any of the sources of

Table 1

Studies used in Meta-analysis

Author(s)                                   Sample   Effect   Independent    Job       Group     Obj/Subj    Combined
                                            Size     Size     Criterion(a)   Site(b)   Size(c)   Measure(d)  Effect Sizes(e)

Patchen (1962)                                700     .33      P              M          >        S           W/O
Indik, Georgopoulos, & Seashore (1961)        975     .15      P              BC-S       >        O           W/
Comrey, Pfiffner, & Beem (1952)                34     .60      SE             WC-S       <        S           W/
Carp, Vitola, & McLanathan (1963)              41     .65      P              WC-S                O           W/O
Nagle (1954)                                  208     .82      P              WC-S       >        S           W/O
Gekoski (1952)                                200     .07      P              WC-S       <        S           W/O
Wilson, High, Beem, & Comrey (1954)           163     .30      P              BC-S       <        O           W/
Argyle, Gardner, & Cioffi (1957)               90     .18      SE             M                   S           W/
Dunteman & Bass (1963)                         27     .42      SE             WC-S                S           W/O
Bass (1958)                                    42     .32      SE             M                   S           W/O
Comrey, Pfiffner, & High (1954)                93     .19      SE             M          <        S           W/
Parker (1963)                                1716     .13      P              BC-S       >        O           W/O
Comrey, High, & Wilson (1955)                  29     .37      P              M                   O           W/
Tjosvold, Andrews, & Jones (1983)             310     .56      P              WC-S                S           W/O
Russel & Farrar (1978)                        507     .51      SE             WC-S                S           W/
Student (1968)                                486     .32      P              M          >        O           W/O
Taylor, Parker, Martens, & Ford (1959)         65     .13      SE             WC-S                S           W/O
Wilson, Beem, & Comrey (1953)                1022     .33      P              BC-S       >        S           W/
Mowry (1957)                                   18     .72      SE             WC-S                S           W/
Argyle, Gardner, & Cioffi (1958)               87     .24      P              M          <        S           W/O

(a) P = productivity used as independent criterion; SE = supervisor effectiveness used as independent criterion.
(b) M = manufacturing job site; BC-S = blue collar - service job site; WC-S = white collar - service job site.
(c) < = group size less than or equal to 10; > = group size greater than 10; blank = could not be calculated with the provided information.
(d) S = subjective measure used in obtaining the independent variable; O = objective measure used in obtaining the independent variable.
(e) W/ = study which had combined multiple effect sizes into one summary effect size; W/O = study which had only one effect size.

meta-analytic techniques (Hunter et al., 1982; Rosenthal, 1984; Glass et al., 1981). Therefore, the original statistical book, Statistical Procedures and their Mathematical Bases (Peters and Van Voorhis, 1940), was obtained and the conversion of the epsilon statistic to an F ratio was found. This F ratio was then transformed into an r value. This two-step transformation is described in Appendix C, equations 1a and 1b.
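The actual transformation equations are given in Appendix C and are not reproduced here. As a rough sketch of the two-step logic, and assuming Kelley's definition of epsilon-squared together with a one-degree-of-freedom (two-group) comparison (assumptions that may not match the Peters and Van Voorhis equations exactly), the conversion could proceed as follows (Python, for illustration; the example values are hypothetical):

    import math

    def epsilon_sq_to_F(epsilon_sq, n, k):
        # Assumes Kelley's epsilon-squared,
        #   epsilon^2 = (SS_between - (k - 1) * MS_within) / SS_total,
        # which rearranges to F = 1 + epsilon^2 * (n - 1) / ((k - 1) * (1 - epsilon^2)).
        return 1.0 + (epsilon_sq * (n - 1)) / ((k - 1) * (1.0 - epsilon_sq))

    def F_to_r(F, df_error):
        # Valid when the F statistic has one numerator degree of freedom.
        return math.sqrt(F / (F + df_error))

    # Hypothetical two-group study of 100 workers reporting epsilon-squared = .09.
    F = epsilon_sq_to_F(0.09, n=100, k=2)
    r = F_to_r(F, df_error=100 - 2)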

When multiple effect sizes occurred in a study, the conservative approach of combining the multiple effect sizes was taken. The method developed by Rosenthal and Rubin (1986) was used to calculate a combined effect size and protect the independence of the rs, as well as to prevent overestimates of the significance of effects (Guzzo, Jette, & Katzell, 1985). The equations used and the values of the constants are described in Appendix C, equations 2a, 2b, and 2c.

Cumulation of Effect Sizes

The Hunter et al. (1982) method for cumulating effect sizes was used. This method allows for correction of the error in sampling, the error of measurement, and restriction of range. Range restriction was not corrected for in this thesis project because the necessary information was absent. In addition, a ninety-five percent confidence interval was calculated for the mean effect sizes of the total sample and all the subsamples. The equations used are presented in Appendix C, equations 3a through 3h. A chi-square test for heterogeneity of variance (Marascuilo, 1971) was conducted when unexplained variance remained after corrections. This is the same test used by other meta-analysis researchers (Fisher and Gitelson, 1983; Scott and Taylor, 1985). As a reviewer noted in the Fisher and Gitelson (1983, p. 325) article, "this test yields a more accurate approximation to chi-square than the similar test given by Hunter, Schmidt, and Jackson (1982)." In fact, Hunter and his colleagues (1982) admit that their test is so powerful that it may identify even trivial amounts of unexplained variance as significant.

When the amount of unexplained variance was found to be significant, the studies were separated into subgroups based on potential moderator variables. A chi-square test for heterogeneity of variance was again conducted, as well as a test of contrasts between the relevant subgroups. The equations for these tests are listed in Appendix C, equations 4a and 4b.
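The exact equations are those of Appendix C. A common form of the homogeneity and contrast tests, based on Fisher's z transformation, is sketched below purely as an illustration (Python; the thesis' own equations may differ in detail, and its calculations were largely performed in the BASIC program of Appendix D):

    import math

    def fisher_z(r):
        return 0.5 * math.log((1 + r) / (1 - r))

    def homogeneity_chi_square(rs, ns):
        # Tests whether k correlations can be regarded as estimates of a single
        # population value; returns the chi-square statistic and its degrees of freedom.
        zs = [fisher_z(r) for r in rs]
        weights = [n - 3 for n in ns]
        z_bar = sum(w * z for w, z in zip(weights, zs)) / sum(weights)
        chi_sq = sum(w * (z - z_bar) ** 2 for w, z in zip(weights, zs))
        return chi_sq, len(rs) - 1

    def subgroup_contrast(r1, n1, r2, n2):
        # Normal-deviate contrast between two independent subgroup correlations.
        se = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
        return (fisher_z(r1) - fisher_z(r2)) / se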

A BASIC program was written by the researcher which performs the majority of the calculations described in this section. Only the corrections for measurement error and the contrast test between relevant subgroups were not

included in the program. The program is listed in Appendix D.

RESULTS

The analysis of the data was originally conducted on the full set of 20 correlation coefficients. Table 2 lists the results of the initial analysis. Column 1 shows the particular variable of interest. The total sample size is given in column 2, and column 3 provides the number of effect sizes included in the analysis. The mean correlation weighted by sample size is provided in column

4, and column 5 shows the total observed variance in the sample correlations. The sampling error variance is contained in column 6, and the measurement variance is found in column 7. The value in column 8 is the

unexplained variance, which is the sample variance minus the sampling error variance and the measurement variance. The final column gives the results of the chi-square approximation test used to determine the significance of the unexplained variance. The lower portion of Table 2 provides the corrected mean effect size across studies, the corrected variance across studies, the corrected standard deviation across studies, and the 95% confidence limits for the corrected mean effect size.

It can be seen from the overall analysis of the full dataset that Hawthorne styles of supervision have a positive effect on productivity as well as perceived supervisor effectiveness. However, the 95% confidence limits range from -.06 to .62, which means that there is a possibility that the actual value could be near zero.

Table 2 shows that a large amount of the variance in the scores is still unexplained. This is supported by the significant chi-square value in column 9. These findings would seem to indicate that there are some variables which are moderating the relationship between Hawthorne styles of supervision and the independent criterion. Hunter et al.

(1982) state that a moderator variable will show itself in two ways: (1) the average correlation will vary from subset to subset, and (2) the average unexplained variance will be lower in the subsets than for the data as a whole.
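Read directly from the tables that follow, these two criteria amount to a simple check, sketched here (Python, for illustration only; the function and field names are hypothetical):

    def suggests_moderator(overall_unexplained, subgroups):
        # subgroups: list of (mean_effect_size, unexplained_variance) pairs,
        # as reported in Tables 3 through 9.
        means = [m for m, _ in subgroups]
        means_differ = max(means) != min(means)
        variance_drops = all(v < overall_unexplained for _, v in subgroups)
        return means_differ and variance_drops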

The first moderator variable examined was whether the independent criterion was productivity or supervisor effectiveness. The results of this meta-analysis are provided in Table 3. The two mean effect sizes are different from each other (.26 vs .41) and the amounts of unexplained variance in the subgroups are less than that of the data as a whole. However, supervisor effectiveness seems to be more of a moderator in that its mean effect size is quite different from the mean effect size observed in the complete data set and very little of the variance is left unexplained (0.7%).

Table 2

Results of analysis of all the studies examining Hawthorne styles of supervision

                              Total     Number of   Mean      Sample     Sampling   Measurement   Unexplained   Chi-
                              Sample    Effect      Effect    Variance   Error      Variance      Variance      Square
                              Size      Sizes       Size                 Variance

Hawthorne styles of           6813      20          .28       .02899     .00249     .00913        .01737        327.78
supervision                                                              (8.6%)     (31.5%)       (59.9%)       p < .01

Note: Values in brackets are the percentages of the total variance which is being accounted for.

                                                         Hawthorne Styles of Supervision
Corrected Mean Effect Size Across Studies                .37
Corrected Variance Across Studies                        .02968
Corrected Standard Deviation Across Studies              .17228
95% Confidence Limits                                    -.06 < μ < .62

Table 3

Results of analyses of studies using production as the independent criterion and those using supervisor effectiveness as the independent criterion

                              Total     Number of   Mean      Sample     Sampling   Measurement   Unexplained   Chi-
                              Sample    Effect      Effect    Variance   Error      Variance      Variance      Square
                              Size      Sizes       Size                 Variance

Production as the             5937      12          .26       .02664     .00175     .00771        .01718        276.66
independent criterion                                                    (6.6%)     (28.9%)       (64.5%)       p < .01

Supervisor effectiveness       876       8          .41       .02595     .00632     .01944        .00019        30.47
as the independent                                                       (24.4%)    (74.9%)       (0.7%)        p < .01
criterion

Note: Values in brackets are the percentages of the total variance which is being accounted for.

                                                         Production        Supervisor Effectiveness
Corrected Mean Effect Size Across Studies                .34               .54
Corrected Variance Across Studies                        .03236            .01111
Corrected Standard Deviation Across Studies              .17988            .10543
95% Confidence Limits                                    -.01 < μ < .69    .33 < μ < .75

The next moderator variable examined was the type of job site at which the study was conducted (manufacturing, white collar-service, or blue collar-service). These findings are contained in Table 4. The results show that all the mean effect sizes are different from each other, but only studies conducted at a manufacturing site and those done at a blue collar-service site have unexplained variance less than that of the data as a whole. This suggests that the white collar-service moderator has a submoderator variable or variables influencing its effect on the independent criterion.

Another moderator variable examined was whether studies were conducted during the 1950s or after the 1950s. Table 5 gives the findings of this meta-analysis, and the results show that this factor is a moderator as well. The mean effect sizes are different from one another and the amount of unexplained variance is less than that of the total sample. However, there still remains a significant portion of unexplained variance in both categories, which leads one to conclude that the two categories are not strong moderators.

Table 6 provides the results of the moderator variable of work group size. Only 11 studies provided sufficient information to code for this variable. The median work group size was 10, which is why that particular cutoff point was used. Both groups met the criterion of a moderator variable, with work groups of 10 or fewer having more of a moderating effect on the independent criterion because of their greater accounting of variance.

Table 4

Results of analyses of studies from manufacturing sites, white collar service sites, and blue collar service sites

                              Total     Number of   Mean      Sample     Sampling   Measurement   Unexplained   Chi-
                              Sample    Effect      Effect    Variance   Error      Variance      Variance      Square
                              Size      Sizes       Size                 Variance

Manufacturing site            1527       7          .30       .00241     .00377     .01014        .00           4.14
                                                                                                                 ns

Blue collar                   3876       4          .19       .00765     .00096     .00417        .00252        33.24
service site                                                             (12.5%)    (54.5%)       (32.9%)       p < .01

White collar                  1410       9          .49       .05007     .00365     .02731        .01911        136.75
service site                                                             (7.3%)     (54.5%)       (38.2%)       p < .01

Note: Values in brackets are the percentages of the total variance which is being accounted for.

                                                         Manufacturing   Blue Collar Service   White Collar Service
Corrected Mean Effect Size Across Studies                .39             .25                   .64
Corrected Variance Across Studies                        -.01965         .00431                .03266
Corrected Standard Deviation Across Studies              0.0             .06562                .18073
95% Confidence Limits                                    μ = .39         .12 < μ < .38         .29 < μ < .99

Table 5

Results of analyses of studies conducted in the 1950s and those conducted after the 1950s

                              Total     Number of   Mean      Sample     Sampling   Measurement   Unexplained   Chi-
                              Sample    Effect      Effect    Variance   Error      Variance      Variance      Square
                              Size      Sizes       Size                 Variance

During the 1950s              2051      12          .34       .03702     .00460     .01951        .01291        164.10
                                                                         (12.4%)    (52.7%)       (34.9%)       p < .01

After the 1950s               4762       8          .26       .02362     .00145     .00771        .01446        146.53
                                                                         (6.1%)     (32.6%)       (61.2%)       p < .01

Note: Values in brackets are the percentages of the total variance which is being accounted for.

                                                         During the 1950s   After the 1950s
Corrected Mean Effect Size Across Studies                .44                .34
Corrected Variance Across Studies                        .03334             .02468
Corrected Standard Deviation Across Studies              .18260             .15711
95% Confidence Limits                                    .08 < μ < .80      .03 < μ < .65

Table 6

Results of analyses of studies with an average work group size less than or equal to 10 and those with an average work group size greater than 10

                              Total     Number of   Mean      Sample     Sampling   Measurement   Unexplained   Chi-
                              Sample    Effect      Effect    Variance   Error      Variance      Variance      Square
                              Size      Sizes       Size                 Variance

Group size less than           577       5          .21       .01824     .00791     .00294        .00739        12.77
or equal to 10                                                           (43.4%)    (16.1%)       (40.5%)       p < .05

Group size greater            5107       6          .25       .02260     .00104     .00726        .01430        219.05
than 10                                                                  (4.6%)     (32.1%)       (63.3%)       p < .01

Note: Values in brackets are the percentages of the total variance which is being accounted for.

                                                         Group size <= 10   Group size > 10
Corrected Mean Effect Size Across Studies                .27                .33
Corrected Variance Across Studies                        .01263             .02444
Corrected Standard Deviation Across Studies              .11240             .15633
95% Confidence Limits                                    .05 < μ < .49      .02 < μ < .64

Those moderator variables which are based on the methodological characteristics of the studies are examined next. The effect of the sample size (<100 vs >100) on the independent criterion seems to be minimal (Table 7). The correlations of the two subgroups are almost equal and one of the subgroups has a larger amount of unexplained variance than the population as a whole. An interesting side note is the size of the sampling error variance for the two groups. Those studies with fewer than 100 subjects had about 50% of their variance accounted for by sampling error, while studies with over 100 subjects have only a minimal amount of sampling error variance. Although this is the expected finding, it helps illustrate the need for adequate sampling when conducting experiments.

Another methodological moderator examined was whether an objective or a subjective independent criterion measure was used. Table 8 shows the findings from this meta-analysis. The results support this variable as a moderator of Hawthorne styles of supervision. The two mean correlations are widely separated (.18 vs .38) and the unexplained variance is less than that of the overall population. However, there still remains a significant amount of unexplained variance, which leads one to believe that this variable is not a moderator by itself.

correlations are widely separated (.18 vs .38) and the

62 Table 7

Results of analyses of studies with sample sizes less than 100 and those with sample sizes greater than lOO

Mean Sampling Total Sample Number of Effect Sample Error Measurement Unexplained Chi- Size effect sizes Size Variance Variance Variance Variance Square

Sample size less 526 10 .30 .03121 .01571 .01014 .00536 23.71 than 100 (50.3%) (32.5%) (17.2%) p < .01

u> Sample size greater 6287 10 .28 .02876 .00135 .00913 .01828 303.90 than 100 ( 4.7%) (31.7%) (63.6%) p < .01

Note: Values in brackets are the percentages of the total variance which is being accounted for.

Sample size Sample size < 100 >100 Corrected Mean Effect Size Across Studies .39 .37 Corrected Variance Across Studies .00916 .03124 Corrected Standard Deviation Across Studies .09573 .17676 95% Confidence Limits .20< y <.58 -.07< y <.63 Table 8

Results of analyses of studies with objective independent criteria and those with subjective independent criteria

Mean Sampling Total Sample Number of Effect Sample Error Measurement Unexplained Chi- Size effect sizes Size Variance Variance Variance Variance Square

Objective independent 3410 6 .18 .00796 .00165 .00384 .00247 33.33 criterion (20.7%) (48.2%) (31.1%) p < .01

>1^ Subjective independent 3403 14 .38 .02909 .00299 .01667 .00943 193.16 criterion (10.3%) (57.3%) (32.4%) p < .01

Note: Values in brackets are the percentages of the total variance which is being accounted for.

Objective . Subjective Corrected Mean Effect Size Across Studies .24 .50 Corrected Variance Across Studies .00422 .01612 Corrected Standard Deviation Across Studies .06498 .12696 95% Confidence Limits .IK y <.37 .25< y <.75 unexplained variance is less than that of the overall population. However/ there still remains a significant amount of unexplained variance which leads one to believe that this Variable is not a moderator by itself.

The last moderator examined is contained in Table 9. This analysis checked whether there was a significant difference between studies with a combined effect size and studies without a combined effect size. The findings do not support this variable as a moderator. The mean correlations, although different, are only slightly so (.27 vs .30). In addition, one of the groups had more unexplained variance than the data as a whole. This leads one to reject combined versus uncombined effect sizes as an influence on the conducted meta-analysis.

Contrast tests were conducted to examine any significant differences between the effect sizes of the moderating subgroups. Three comparisons resulted in significant differences. The moderating groups of manufacturing site and white collar-service (Z = 4.61, p < .0001, 2-tailed), blue collar-service and white collar-service (Z = 6.63, p < .0001), and work groups of 10 or fewer and work groups greater than 10 (Z = 1.98, p = .0478, 2-tailed) were found to differ significantly.

The findings of this study are based on the assumption

that the literature review has found all relevant studies.

Table 9

Results of analyses of studies without a combined effect size and those with a combined effect size

                              Total     Number of   Mean      Sample     Sampling   Measurement   Unexplained   Chi-
                              Sample    Effect      Effect    Variance   Error      Variance      Variance      Square
                              Size      Sizes       Size                 Variance

Studies without a             3882      11          .27       .03667     .00244     .00817        .02606        256.93
combined effect size                                                     (6.7%)     (22.3%)       (71.1%)       p < .01

Studies with a                2931       9          .30       .01842     .00255     .01014        .00573        70.54
combined effect size                                                     (13.8%)    (55.0%)       (31.1%)       p < .01

Note: Values in brackets are the percentages of the total variance which is being accounted for.

                                                         Without Combined   With Combined
Corrected Mean Effect Size Across Studies                .35                .39
Corrected Variance Across Studies                        .04454             .00978
Corrected Standard Deviation Across Studies              .21104             .09887
95% Confidence Limits                                    -.06 < μ < .76     .20 < μ < .58

Without With Combined Combined Corrected Mean Effect Size Across Studies .35 •^9 Corrected Variance Across Studies .04454 .00978 Corrected Standard Deviation Across Studies .21104 .09887 95% Confidence Limits -.06< y <.76 .20< y <.58 This does not take into account those studies which could

have had null results and were not published in a journal.

To examine the potential for this problem, Rosenthal's

"file drawer" equation was used (Rosenthal, 1984) to calculate the number of studies averaging null results

which would be required to reduce the findings of this study to nonsignificance. The results showed that 3544 studies with null results would be needed to reverse the findings of the overall meta-analysis.
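Rosenthal's (1984) file drawer statistic estimates how many unlocated studies averaging null results would be needed to bring the combined one-tailed significance level down to .05. A minimal sketch of the usual form of the calculation (Python, for illustration only; the thesis' exact computation may differ):

    def fail_safe_n(z_values, z_critical=1.645):
        # z_values: the standard normal deviates of the k located studies.
        # Returns the number of additional studies with an average Z of zero
        # needed to drop the combined result to the one-tailed .05 criterion.
        total_z = sum(z_values)
        return (total_z ** 2) / (z_critical ** 2) - len(z_values)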

DISCUSSION

The Hawthorne Findings

The conclusions reached by the Hawthorne experimenters were revolutionary when they were first published. The

Hawthorne studies illustrated that workers were not automatons who gave constant levels of output but were affected by human emotions and motivational factors. These findings were the origin of the 'human relations' influence on U.S. management styles. The conclusions of the researchers at Hawthorne were not without their critics.

These criticisms ranged from critiques of the methodology employed to replacement of the original conclusions with new ones based on operant conditioning, learning models, etc. The purpose of this study was to examine those experiments subsequent to the publication of the Hawthorne findings to clarify the effects of friendly supervision on worker production or supervisor effectiveness as perceived by management.

The initial analysis on the 20 studies seems to have confirmed the impact of Hawthorne styles of supervision on productivity/supervisor effectiveness. However, given the

95% confidence interval listed in Table 2, the possibility exists that friendly supervision may have no effect on

production/supervisor effectiveness. In addition, when only those studies using productivity as the independent criterion were examined (these studies most closely resembling the conditions of the Hawthorne studies), it was found that the corrected mean effect size was only slightly smaller than that of the complete set of studies (all 20 studies, mean r = .37; studies which used productivity, mean r = .34). Again, this seems to imply that Hawthorne styles of supervision do positively influence production, which is the "bottom line" for many organizations. However, the 95% confidence interval (shown in Table 2) does suggest the possibility that friendly supervision has no effect on production.

The findings described above can be viewed as a confirmation for both the original Hawthorne studies and the criticisms of the Hawthorne studies. This is because friendly supervision does have a moderately positive effect on production, but there remains the possibility that friendly supervision has no relationship with production.

The findings of this thesis project would then seem to leave the conclusions of the Hawthorne researchers and their critics as muddled as ever. However, this is not the case.

This thesis project has clarified many aspects of the relationship between friendly supervision and

productivity/supervisor effectiveness. One of the first areas of interest is the type of independent criterion used: productivity as opposed to supervisor effectiveness.

It was found that supervisor effectiveness had a larger corrected mean effect size than productivity (supervisor effectiveness, mean r = .54; productivity, mean r = .34). If one assumes that those studies which used supervisor effectiveness as the independent criterion had the same productivity levels as those studies using productivity as the independent criterion, then this suggests that supervisors who use a Hawthorne style of supervision are rated as being more effective by management even though the productivity levels may have had little or no relationship with the supervisors' styles. An implication of this finding is that supervisors are evaluated on factors which are not entirely dependent on production rates. The process model of performance rating developed by Landy and Farr (1980) supports this finding.

Another finding of this study was that the site of the study (manufacturing, blue collar-service, or white collar-service) influenced how effective a style of supervision was, demonstrating that the effectiveness of a style of supervision depends on the type of organization. White collar-service organizations were strongly affected by Hawthorne styles of supervision, while blue collar-service and manufacturing organizations were influenced to a lesser degree. The differences between white collar-service and the other types of organizations were statistically significant as well.

This finding has an implication for future research.

Specifically, one could examine how different styles of supervision, covering the spectrum of supervisory styles

(i.e., autocratic to laissez faire), affect different job types. It could be possible that the optimum style of supervision is not the same for different job types. In addition, the different supervisory styles and job types could be examined while taking additional factors, such as work group size, number of supervisors in the organizational hierarchy, or the level of unionization, into account.

The finding that work groups of 10 or fewer members had smaller corrected mean effect sizes than work groups of more than 10 workers (10 or fewer, mean r = .27; more than 10, mean r = .33) goes against what was expected. Porter and Lawler's (1965) study examining organizational subunit size and job behaviors suggests that the relationship is negative or curvilinear. However, the findings of this thesis project were based on only 11 studies with a median work group size of 10. This is a rather limited sample given the range of the sizes of work groups in organizations, and therefore no definite conclusions concerning work group size can be drawn from the current study.

When the independent criterion was objective, the corrected mean effect size (mean r = .24) was smaller than when the independent criterion was subjective (mean r = .50).

Although these two effect sizes were not significantly different, the effect sizes suggest that objective and subjective measures may not be measuring the same constructs, or if they are, they are doing so differently.

This implies that rating bias is involved in the subjective measures of production and supervisor effectiveness.

The Meta-Analytic Method

Meta-analysis is becoming an established method of reviewing research in the social sciences. There are problems with the technique, but overall, meta-analysis is the most objective and thorough literature review method available. This study used the meta-analytic technique developed by Hunter and his colleagues. This method is known for its corrections of the following possible errors:

(1) sampling error, (2) measurement error, and (3) range restriction. Other meta-analytic techniques exist (cf.

Glass et al., 1981; Hedges and Olkin, 1982; Rosenthal,

1984), but at the time of this study there was no clear consensus on a preferred method. This lack of consensus should not be viewed as evidence of an inherent weakness of meta-analysis but rather as a sign that a new scientific instrument is still being refined.

However, for the researcher attempting to make sense of an area of interest by using meta-analysis, these differences can be a cause of problems and concerns about the validity and future acceptance of the results of a meta-analysis in progress. Studies using Monte Carlo methods are currently being done in an attempt to find the strengths and weaknesses of the meta-analytic techniques.

Hopefully, there will soon be an accepted and proven method with which to conduct a quantitative literature review. Those who criticize meta-analysis on grounds of sample bias, improper coding of data, and biased interpretation do not appreciate some of the finer qualities of the meta-analytic method. Not only does a meta-analysis review the literature with greater precision than is normally accomplished, but it allows scrutiny and criticism of the methods used by the person conducting the meta-analysis.

If something is found to be less-than-perfect about a meta-analysis, it is most likely due to the user of the technique and not the technique itself.

Bangert-Drowns (1986) summarizes the meta-analytic method this way:

Meta-analysis is not a fad. It is rooted in the fundamental values of the scientific enterprise: replicability, quantification, causal and correlational analysis. Valuable information is needlessly scattered in individual studies. The ability of social scientists to

deliver generalizable answers to basic questions of policy is too serious a concern to allow us to treat research integration lightly. The potential benefits of meta-analysis method seem enormous. (p. 276)

Summary

This thesis project was an attempt to confirm or refute the conclusions of the Hawthorne researchers. It involved the use of a quantitative literature review method known as meta-analysis, and was conducted with 20 studies which examined aspects of styles of supervision at an actual job site. The findings indicate that there is a positive relation between friendly styles of supervision and productivity levels or rated supervisor effectiveness.

However, the lower limit of the 95% confidence interval was found to be slightly negative so the possibility exists that friendly styles of supervision have little or no effect on the independent criterion.

APPENDIX A

Earlier Criticisms of Hawthorne Research

Argyle/ M. (1953). The relay assembly test room in retrospect. Occupational Psychology, 27/ 98-103.

Bell, D. (1947). Adjusting men to machines. Commentary, 2' 79-88.

Bendix, R., & Fisher, L. N. (1949). The perspectives of Elton Mayo. Review of Economics and Statistics, 31, 312-321.

Bladen, V. W. (1948). Economics and human relations. Canadian Journal of Economics and Political Science, 14, 301-311.

Blumer, H. (1947). Sociological theory in industrial relations. American Sociological Review, 12, 271-278.

Dunlop, J. T. (1950). Framework for the analysis of industrial relations: Two views. Industrial and Labor Relations Review, 383-393.

Freeman, E. (1936). Social psychology. New York: Henry Holt Co.

Friedmann, G. (1949). Philosophy underlying the Hawthorne investigations. Social Forces, 28, 204-209.

Hart, C. W. M. (1949). Industrial relations research and social theory. Canadian Journal of Economics and Political Science, 53-73.

Kerr, C. (1953). What became of the independent spirit? Fortune, 48, 110-111.

Knox, J. B. (1955). Sociological theory and industrial sociology. Social Forces, 33, 240-244.

Koivisto, W. A. (1953). Value, theory, and fact in industrial sociology. American Journal of Sociology, 58, 564-566.

Mills, C. W. (1948). The contributions of sociology to studies of industrial relations. Proceedings of the First Annual Meeting, Industrial Relations Research Association, 1, 199-222.

Moore, W. E. (1947). Current issues in industrial sociology. American Sociological Review, 12, 651-657.

Moore, W. E. (1948). Industrial sociology: Status and prospects. American Sociological Review, 13, 382-400.

Schneider, E. V. (1950). Limitations on observation in industrial sociology. Social Forces, 28, 279-284.

Schneider, L. (1950). An industrial sociology—For what ends? Antioch Review, 10, 407-417.

Sheppard, H. L. (1949). The treatment of unionism in "managerial sociology." American Sociological Review, 14, 310-313.

Sheppard, H. L. (1950). The social and historical philosophy of Elton Mayo. Antioch Review, 10, 396-406.

Sorenson, R. C. (1951). The concept of conflict in industrial sociology. Social Forces, 29, 263-267.

Stone, R. C. (1952). Conflicting approaches to the study of worker-management relations. Social Forces, 31, 117-124.

APPENDIX B

Reference List of Examined Articles

Al-Jafary, A., & Hollingsworth, A. T. (1983). An exploratory study of managerial practices in the Arabian Gulf region. Journal of International Business Studies, 14, 143-152.

Andrews, F. M., & Parris, G. P. (1967). Supervisory practices and innovation in scientific teams. Personnel Psychology, 20, 497-515.

Argyle, M., Gardner, G., & Cioffi, F. (1957). The measurement of supervisory methods. Human Relations, 10, 295-313.

Argyle, M., Gardner, G., & Cioffi, F. (1958). Supervisory methods related to productivity, absenteeism, and labour turnover. Human Relations, 11, 23-40.

Bacas, H. (1985). Who's in charge here? Nation's Business, 73, 57-62, 64.

Bachman, J. G. (1968). Faculty satisfaction and the dean's influence; An organizational study of 12 liberal arts colleges. Journal of Applied Psychology, 52, 55-61.

Bailey, J. K. (1956). The essential qualities of good supervision: A case study. Personnel, 32, 311-326.

Balk, W. L. (1967). Certain social psychological aspects of supervisory performance quantification in large work organizations. Dissertation Abstracts, 28(2-A), 813-814.

Bass, B. M. (1958). Leadership opinions as forecasts of supervisory success: A replication. Personnel Psychology, 11, 515-518.

Bayles, B. R. (1968). The interaction between certain personality variables and perceived supervisory styles and their relation to performance and satisfaction. Dissertation Abstracts, 28(11-B), 4788-4789.

Beer, M. (1965). Leadership, employee needs, and motivation. Dissertation Abstracts, 25(12, Pt. 1), 7365-7366.

Bolda, R. A. (1959). Employee attitudes related to supervisory and departmental effectiveness. Engineering and Industrial Psychology, 31-39.

Bolda, R. A. (1959). A study of employee attitudes and supervisor sensitivity to attitudes as related to supervisory and departmental effectiveness. Dissertation Abstracts, 19, 2133.

Bose, S. K. (1955). Employee morale and supervision. Indian Journal of Psychology, 30, 117-125.

Bryant, J. H. (1959). Types of supervisors and associated attitudes of subordinates. Dissertation Abstracts, 1086-1087.

Buchanan, P. C. (1958). Factors making for effective supervisory training. Personnel, 34, 46-53.

Burck, C. G. (1981). What happens when workers manage themselves. Fortune, 104, 62-69.

Campbell, H. (1953). Some effects of joint consultation on the status and role of the supervisor. Occupational Psychology, 27, 200-206.

Carp, F. M., Vitola, B. M., & McLanathan, F. L. (1963). Human relations knowledge and social distance set in supervisors. Journal of Applied Psychology, 47, 78-80.

Carron, T. J. (1964). Human relations training and attitude change: A vector analysis. Personnel Psychology, 17, 403-422.

Castle, P. F. C. (1957). The assessment of the effects of training courses for supervisors: A pilot study. Occupational Psychology, 31, 44-46.

Chaterje% A., & James, E. V. (1965). An investigation into the problem of effective supervision. Journal of the Indian Academy of Applied Psychology, 2, 8-12.

Clutterbuck, D. (1980). Management by anarchy. International management, 35, 30-32.

Cohen, L. (1968). Job satisfaction as a function of philosophy of work style among professionals and executives. Dissertation Abstracts, 29(3-B), 1196.

Comrey, A. L., High, W. S., & Wilson, R. C. (1955). Factors influencing organizational effectiveness. VII. A survey of aircraft supervisors. Personnel Psychology, 8, 245-257.

Comrey, A. L., Pfiffner, J. M., & Beem, H. P. (1952). Factors influencing organizational effectiveness. I. The forest survey. Personnel Psychology, 5, 307-328.

Comrey, A. L., Pfiffner, J. M., & High, W. (1954). Factors influencing organizational effectiveness. V. A survey of district rangers. Personnel Psychology, 533-547.

Crouter, A. C. (1982). Participative work and personal life; A case study of their reciprocal effects. Dissertation Abstracts, 42(12-A), 5264.

Davis, L. E. (1955). The supervisor and productivity. Journal of Personnel Administration and Industrial Relations, 56-74.

Davis, L. E., & Valfer, E. S. (1966). Studies in supervisory job design. Human Relations, 19, 339-352.

Day, R. C., & Hamblin, R. L. (1964). Some effects of close and punitive styles of supervision. The American Journal of Sociology, 69, 499-510.

Dent, J. K. (1958). Managerial leadership styles: Some dimensions, determinants and behavioral correlates. Dissertation Abstracts, 19, 563-564.

Digan, M. B. (1979). Hawthorne effect in learning by intact classes. Psychological Reports, 44, 12221

Doig, J. (1977). Where participative management works. Labour Gazette, 77, 168-172.

Dunham, P. (1976). Distribution of practice as a factor affecting learning and/or performance. Journal of Motor Behavior, 305-307.

Dunteman, G., & Bass, B. M. (1963). Supervisory and engineering success associated with self, interaction, and task orientation scores. Personnel Psychology,

79 13-21.

Edgecomb, T. S. (1967). The motivational consequences of task attributes and supervision. Dissertation Abstracts/ 28(3-B), 1159.

Edgerton, H. A., Feinberg, M. R., & Thomson, K. E. (1957). Prediction of the "human relations" effectiveness of industrial supervisors. Personnel Psychology, 10, 421-432.

Ejiogu, A. M. (1983). Participative management in a developing economy; Poison or ? Journal of Applied Behavioral Science, 19, 239-247.

Esser, N. J., & Strother, G. B. (1962). Rule interpretation as an indicator of style of management. Personnel Psychology, 15, 375-386.

Etzioni, A. (1958). Democratic and nondemocratic supervision in industry. Journal of Human Relations, 6, 47-51.

File, Q. W. (1945). The measurement of supervisory quality in industry. Journal of Applied Psychology, 29, 323-337.

Fischl, M. A. (1957). Development of a human relations inventory for industrial supervisors. Dissertation Abstracts, 17, 396-397.

Fleishman, E. A., Harris, E. P., & Burtt, H. E. (1955). Leadership and supervision in industry; an evaluation of a supervisory training program (Bur. Educ. Res. Monogr. No. 33). Ohio: Ohio State University.

Flohu, J. C. (1977). A failure to demonstrate the efficacy of the Hawthorne Effect in an educational setting. Dissertation Abstracts International, 37(11-A), 7036-7037.

Foa, U. G. (1957). Relation of workers expectation to satisfaction with supervisor. Personnel Psychology, 10, 161-168.

Gallegos, R. C. (1977). Effects on productivity and morale of a systems-designed job-enrichment program in Pacific Telephone. Psychological Reports, 40, 283-290.

Gekoski, N. (1952). Predicting group productivity. Personnel Psychology, 281-291.

Ghiselli, E. E. (1954). The forced-choice technique in self-description. Personnel Psychology, 7, 201-208.

Ghiselli, E. E. (1956). Role perceptions of successful and unsuccessful supervisors. Journal of Applied Psychology/ 40/ 241-244.

Ghiselli/ E. E./ & Lobahl, T. M. (1958). Patterns of management traits and group effectiveness. Journal of Abnormal Social Psychology/ 57/ 61-66.

Gilberg/ J. S. (1984). Participative management in comparative perspective: Attitudes and effects on committment among public and private sector managers. Dissertation Abstracts/ 46(04-A)/ 1030.

Good/ T. L., Grouws/ D. A, (1979). The Missouri mathematics effectiveness project: An experimental study in fourth-grade classrooms. Journal of Educational Psychology/ 11, 355-362.

Goodlad, J. B. (1980). 1984: An alternative. An experiment in employee participation. Management Accounting/ 58/ 19-20.

Gordon/ M. (1958). Leadership for production. Journal of Industrial Engineering/ 2' 420-423.

Griggs, I. W. (1975). Harnessing the Hawthorne effect: The use of socio-psychological variables to improve training performance. Dissertation Abstracts International, 36(4-B), 1957.

Grimala/ W. §. (1957). Evaluation of desirable characteristics of industrial supervision as reported by 1/899 hourly-classified workers. Dissertation Abstracts/ 17/ 1265-1266.

Gruenfeld/ L. w./ & Weissenberg, P. (1966). Supervisory characteristics and attitudes toward performance appraisals. Personnel Psychology/ 19/ 143-151.

Guion/ R. M./ & RobinS/ J. E. (1964). A note on the Nagle attitude scale. Journal of Applied Psychology/ 48/ 29-30.

Gutkin, T. B. (1978). Modification of elementary students'

locus of control: An operant approach. Journal of Psychology, 100, 107-115.

Hall, C. (1984). Hawthorne effects—Still a potent supervisory tool. Supervision, 46, 6-7, 26.

Handyside, J. D. (1953). Two studies in supervision: Supervision in a cotton spinning firm (National Institute of Industrial Psychology No. 10), London, England.

Handyside, J. D. (1956). The effectiveness of supervisory training—a survey of recent experimental studies. Personnel Management, 38, 96-107.

Handyside, J. D., & Duncan, D. C. (1954). Four years later: A follow-up of an experiment in selecting supervisors. Occupational Psychology, London, 28, 9-23.

Harris, E. F. (1958). Measuring industrial leadership and its implications for training supervisors. Dissertation Abstracts, 18, 1513-1516.

Harris, E. F., & Fleishman, E. A. (1955). Human relations training and the stability of leadership patterns. Journal of Applied Psychology, 39, 85-91.

Hatch, R. S. (1962). An evaluation of a forced-choice differential accuracy approach to the measurement of supervisory empathy. Dissertation Abstracts, 23, 323-324.

Hausman, H. J. (1955). An investigation of the "halo effect" and over-all evaluation in supervisory ratings. George Washington University Bulletin, 55, 38-45.

Hausman, H. J., & Strupp, H. H. (1955). Non-technical factors in supervisors' ratings of job performance. Personnel Psychology, _8, 201-217.

Heller, F. A., & Xukl, G. (1969). Participation, managerial decision-making, and situational variables. Organizational Behavior and Human Performance, A, 227-241.

Higgins, R, B. (1972). Managerial behavior in upwardly oriented organizations. California Management Review, 14.

Howard, A., Shudo, K., & Umeshima, M. (1983). Motivation and values among Japanese and American managers. Personnel Psychology, 36, 883-898.

Indik, B. P., GeorgopouloS/ B. S., & Seashore, S. E. (1961). Superior-subordinate relationships and performance. Personnel Psychology, 14, 357-374.

Israel, B. R. (1984). A triangulation study of the implementation of participative management in a large midwest school district. Dissertation Abstracts, 45(11-A), 3253.

Jennings, E. E. (1954). The dynamics of forced leadership training. Journal of Personnel Administration and Industrial Relations, 1, 110-118.

Jennings, E. E. (1955). Developing leadership states among supervisors. Personnel Journal, 33, 406-409.

Jerdes/ T. H. (1964). Supervisor perception of work group morale. Journal of Applied Psychology, 48, 259-262.

Johnson, J. M. (1958). Empathy, projection and job performance of plant supervisors. Dissertation Abstracts, 19, 355.

Jones, 0. R., & Smith, K. V. (1951). Measurement of supervisory ability. Journal of Applied Psychology, 15, 146-150.

Kancharla, G. (1969). Differences in managerial styles and motivational factors in government vs. private organizations of India. Dissertation Abstracts, 29(8-B), 2840-2841.

Katz, D. (1949). Employee groups: What motivates them and how they perform. Advanced Management, 14, 119-124.

Katz, D., Maccoby, N., Gurin, G., & Floor, L. G. (1951). Productivity, supervision and morale among railroad workers (Survey Reseach Center No. xii). Michigan: University of Michigan, Institute for Social Research.

Katz, D., Maccoby, N., Morse, N. C. (1950). Productivity, supervision, and morale in an office situation. Part . 1 (Survey Reseach Center No. vii). Michigan: University of Michigan, Institute for Social Research.

Katzell, R. A. (1948). The effectiveness of a training program in industrial psychology for supervisory personnel. American Psychologist, 3, 329-330.

Kelly, J., & Khozan, K. (1980). Participative management; Can it work? Business Horizons, 23, 74-79.

Kennedy, J. E., O'Neill, H. E. (1958). Job content and workers' opinions. Journal of Applied Psychology, 42, 372-375.

Keppler, R. H. (1978). What the supervisor should know about participative management. Supervisory management, 23, 34-40.

Kidd, J. A., & Christy, R. T. (1961). Supervisory procedures and work-team productivity. Journal of Applied Psychology, 45, 388-392.

King, D. C., & Clingenpeel, R. E. (1968). Supervisory effectiveness and agreement among superiors, supervisors, and subordinates regarding the supervisor's job behavior. Proceedings of the 76th Annual Convention of the American Psychological Association, 3, 559-560.

Kirchner, W. K. (1965). Relationships between supervisory and subordinate ratings for technical personnel. Journal of Industrial Psychology, 3^, 57-60.

Kirchner, W. K., & Reisberg, D. J. (1962). Differences between better and less-effective supervisors in appraisal of subordinates. Personnel Psychology, 15, 295-302.

Korman, A. K. (1963). Selective perception among first-line supervisors. Personnel Administration, 26, 31-36.

Kruglanski, A. W. (1969). Some variables affecting interpersonal trust in supervisor-worker relations. Dissertation Abstracts, 29(9-A), 3219.

Lawler, E. E. (1983). Human resource productivity in the 80's. New Management, 46-49.

Lawrie, J. W. (1966). Convergent job expectations and ratings of industrial foreman. Journal of Applied Psychology, 50, 97-107.

Lawshe, C. H., & Nagle, B. F. (1953). Productivity and attitude toward supervisor. Journal of Applied Psychology, 37, 159-162.

Likert/ R. (1956). Developing patterns in management. Personnel Management/ 38/ 146-161.

Lockwood, H. C./ & Parsons, S. 0. (1960). Relationship of personal history information to the performance of production supervisors. Engineering and Industrial Psychology/ 2, 20-26.

Lovrich, N. P. (1985). The dangers of participative management: A test of unexamined assumptions concerning employee involvement. Review of Public Personnel Administration, 5, 9-25.

Lowin, A., & Craig, J. R. (1968). The influence of level of performance on management style: An experimental object-lesson in the ambiguity of correlational data. Organizational Behavior and Human Performance, 3, 440-458.

Maccoby, N. (1949). The relationship of supervisory behavior and attitudes to group productivity in two widely different industrial settings. American Psychologist/ £, 283.

Maier, N. R. (1968). The subordinate*s role in the delegation process. Personnel Psychology/ 21/ 179-191.

Main/ J. (1981). Westinghouse's cultural revolution. Fortune/ 103/ 74-93.

Malone, E. L. (1975). The Non-Linear Systems experiment in participative management. Journal of Business, 48, 52-64.

Malone, E. L. (1975). Non-Linear Systems, Incorporated: An experiment in participative management that failed. Management Review, 64, 36-43.

Mandell, M. M. (1949). Supervisors' attitudes and job performance. Personnel, 26, 182-183.

Mandell, M. M. (1956). Supervisory characteristics and ratings: A summary of recent research. Personnel, 32, 435-440.

Mann, F. C., & Likert, R. (1949). A case study of the process of applying research findings to administrative action. American Psychologist, 4, 283.

Markey, S. C. (1948). Opinions of subordinate enlisted men as measures of supervisor job proficiency. American Psychologist, 3, 260.

McClung, J. A. (1966). Dynamics of interaction patterns among industrial foreman. Dissertation Abstracts, 27(5-A), 1449.

McDaniel, C. O. (1971). Extra-curricular activities as a factor in social acceptance among EMR students. Mental Retardation, 26-28.

McGehee, W., & Gardner, J. E. (1955). Supervisory training and attitude change. Personnel Psychology, 449-460.

Meyer, H. H. (1956). An evaluation of a supervisory selection program. Personnel Psychology, 499-513.

Misshauk, M. J. (1968). An investigation into supervisory skill mix among heterogeneous operative employee groups and the effectiveness in determining satisfaction and productivity of employees. Dissertation Abstracts, 29(3-A), 709.

Misumi, J., & Shirakashi, S. (1966). An experimental study of the effects of supervisory behavior on productivity and morale in a hierarchical organization. Human Relations, 19, 297-307.

Mosel, J. N. (1962). Incentives, supervision and probability. Personnel Administration, 25, 9-14.

Mowry, H. W. (1957). A measure of supervisory quality. Journal of Applied Psychology, 41, 405-408.

Nagle, B. F. (1954). Productivity, employee attitude and supervisor sensitivity. Personnel Psychology, 7, 219-232.

Nealy, S. M., & Blood, M. R. (1968). Leadership performance of nursing supervisors at two organizational levels. Journal of Applied Psychology, 52, 414-422.

Nurick, A. J. (1982). Participation in organizational change: A longitudinal field study. Human Relations, 35, 413-429.

Osterberg, W., & Lindborn, T. (1953). Evaluating human relations training for supervisors. Advanced Management, 18, 26-28.

Parker, T. C. (1963). Relationships among measures of supervisory behavior, group behavior, and situational characteristics. Personnel Psychology, 16, 319-334.

Patchen, M. (1962). Supervisory methods and group performance norms. Administrative Science Quarterly, 7, 275-294.

Patterson, R. G. (1962). Coordinates of "popularity" of institutional work supervisors. American Journal of Mental Deficiency, 67, 29-32.

Pearson, W. W. (1968). Creating change in performance of supervisors in the electrical trades through human relations training. Dissertation Abstracts, 28(11-B), 4749.

Pelz, D. C. (1949). The effect of supervisory attitudes and practices on employee satisfactions. American Psychologist, 4, 283-284.

Penfield, R. V. (1966). The psychological characteristics among industrial foreman. Dissertation Abstracts, 27(5-B), 1610-1611.

Pfiffner, J. M. (1953). Research in organizational effectiveness. Public Personnel Review, 14, 49-54.

Pfiffner, J. M. (1955). The effective supervisor: An organization research study. Personnel, 31, 530-540.

Rainey, G. W., & Wolf, L. (1981). Flex-time: Short-term benefits; long-term...? Public Administration Review, 41, 52-63.

Remmers, H. H., & File, Q. W. (1943). Measurement and evaluation of supervisory quality in industry. Proceedings of the Indiana Academy of Science, 53, 149.

Revans, R. W. (1981). Worker participation as action learning: A note. Economic and Industrial Democracy, 2, 521-541.

Ritti, R. R. (1964). Control of "halo" in factor analysis of a supervisory behavior inventory. Personnel Psychology, 17, 305-318.

Rogers, A. S. (1966). The modern look in motivation. Personnel Journal, 45, 290-293.

Roggema, J., & Smith, M. H. (1983). Organizational change in the shipping industry: Issues in the transformation of basic assumptions. Human Relations, 36, 765-790.

Rubenowitz, S. (1962). Job-oriented and person-oriented leadership. Personnel Psychology, 15, 387-396.

Sales, S. M. (1966). Supervisory style and productivity: A review and theory. Personnel Psychology, 19, 275-286.

Schlegel, R. P. (1977). Some methodological procedures for the evaluation of educational programs for prevention of adolescent alcohol use and abuse. Evaluation Quarterly, 657-672.

Schmitt, D. R. (1969). Punitive supervision and productivity: An experimental analog. Journal of Applied Psychology, 53, 118-123.

Schneiderman, M. H. (1977). Hawthorne effects in a remedial program for low achieving and learning disabled children. Dissertation Abstracts International, 37(8-B), 4121.

Schwatz, M. (1963). Style of rule enforcement and group effectiveness. Dissertation Abstracts, 23, 2609.

Slocombe, C. S. (1945). Teams of workers. Personnel Journal, 23, 284-298.

Smith, R. L. (1968). Communication correlates of interpersonal sensitivity among industrial supervisors. Dissertation Abstracts, 28(11-A), 4743-4744.

Solem, H. R. (1958). An evaluation of two attitudinal approaches to delegation. Journal of Applied Psychology, 42, 36-39.

Sorensen, L. R. (1967). A study of supervisory leadership behavior, technical competence, and group effectiveness. Dissertation Abstracts, 28(3-B), 1252.

Spector, A. J., Clark, R. A., & Glickman, A. S. (1960). Supervisory characteristics and attitudes of subordinates. Personnel Psychology, 13, 301-316.

Speroff, B. J. (1961). Experimental procedures and techniques for evaluating supervisory performance. Personnel Administration, 24, 4-10.

Spillane, C. P. (1962). Evaluating a program in motivation training. Personnel Administration, 25, 50-54.

Srivastva, S., & Salipante, P. F. (1976). Autonomy in work. Organization and Administrative Sciences, 7, 49-60.

Standifer, J. A. (1970). Effects of aesthetic sensitivity of developing perception of musical expressiveness. Journal of Research in Music Education, 18, 112-125.

Stanton, E. S. (1960). Company policies and supervisors' attitudes toward supervision. Journal of Applied Psychology, 44, 22-26.

Stephens, D. B. (1981). Cultural variation in leadership styles: A methodological experiment in comparing managers in the U.S. and Peruvian textile industries. Management International Review, 21, 47-55.

Stogdill, R. M. (1956). Interactions among superiors and subordinates. Sociometry, 18, 552-557.

Stogdill, R. M. (1967). The structure of organizational behavior. Multivariate Behavioral Research, 2, 47-61.

Strickland, L. H. (1958). Surveillance and trust. Journal of Personality, 26, 200-215.

Stude, E. W. (1973). Evaluation of short-term training for rehabilitation counselors: Effectiveness of an institute on epilepsy. Rehabilitation Counseling Bulletin, 16, 146-154.

Student, K. R. (1967). Some organizational correlates of supervisory influence. Dissertation Abstracts, 27(10-B), 3701.

Student, K. R. (1967). Supervisory influence and work-group performance. Journal of Applied Psychology, 52, 188-194.

Tarnopol, L., & Tarnopol, J. (1956). How top-rated supervisors differ from the lower-rated. Personnel Journal, 34, 331-335.

Taylor, E. K., Parker, J. W., Martens, L., & Ford, G. L. (1959). Supervisory climate and performance ratings: An exploratory study. Personnel Psychology, 12, 453-468.

Thumin, F. J., & Page, D. S. (1966). A comparative study of two tests of supervisory knowledge. Psychological Reports, 18, 535-538.

Vollmer, H. M., & Kinney, J. A. (1955). Supervising women is different. Personnel Journal, 34, 260-262.

Waddell, R. L. (1952). Supervisors and productivity. Supervision, 14, 14-15, 27.

Wager, L. W. (1965). Leadership style, hierarchical influence, and supervisory role obligations. Administrative Science Quarterly, 9, 391-420.

Warr, P. B., Bird, M. W., & Hadfield, R. W. (1965). A study of supervisors in the iron and steel industry. Occupational Psychology, 39, 235-245.

Warr, P. B., Bird, M., & Hadfield, R. W. (1966). Some characteristics of foremen in iron and steel industry. Occupational Psychology, 40, 203-214.

Watson, W., & Michaelsen, L. (1984). Task performance information and leader participation behavior: Effect on leader-subordinate interaction, frustration, and future productivity. Group and Organizational Studies, 9, 121-144.

Wilson, T. P. (1968). Patterns of management and adaptations to organizational roles: A study of prison inmates. American Journal of Sociology, 74, 146-157.

Wilson, R. C., Beem, H. P., & Comrey, A. L. (1953). Factors influencing organizational effectiveness. III. A survey of skilled tradesmen. Personnel Psychology, 6, 313-325.

Wilson, R. C., High, W. S., Beem, H. P., & Comrey, A. L. (1954). A factor-analytic study of supervisory and group behavior. Journal of Applied Psychology, 38, 89-92.

Wilson, R. C., High, W. S., Beem, H. P., & Comrey, A. L. (1954). Factors influencing organizational effectiveness. IV. A survey of supervisors and workers. Personnel Psychology, 7, 525-531.

Wilson, R. C., High, W. S., & Comrey, A. L. (1955). An iterative analysis of supervisory and group dimensions. Journal of Applied Psychology, 39, 85-91.

Woodman, R. W., & Sherwood, J. J. (1980). Effects of team development intervention: A field experiment. Journal of Applied Behavioral Science, 16, 211-227.

Wyndham, C. H., & Cooke, H. M. (1964). The influence of the quality of supervision on the production of men engaged on moderately hard physical work. Ergonomics, 7, 139-149.

Zweig, J. P. (1966). The relationships among performance, partisanship in labor-management issues, and leadership style of first line supervisors. Dissertation Abstracts, 26, 4801.

APPENDIX C

Transformation of epsilon to an effect size

Peters and Van Voorhis (1940, p. 353) list the equation for transforming epsilon to an F-ratio as:

1a)  F = [ε²(N − k) + (k − 1)] / [(k − 1)(1 − ε²)]

where N is the size of the sample, and k is the number of groups.

The resulting F was then transformed to an r value using equation 2.17 in Rosenthal (1984):

1b)  r = √[F(1,-) / (F(1,-) + df error)]

where F(1,-) indicates any F with df = 1 in the numerator, and the dash represents the df in the denominator.
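As a purely hypothetical illustration (the numbers are not drawn from any of the studies reviewed), an epsilon-squared of .10 from a study with N = 50 and k = 2 groups would yield F = [.10(48) + 1] / [(1)(.90)] = 6.44, and, with 48 degrees of freedom for error, r = √[6.44 / (6.44 + 48)] = .34.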

Combining multiple effect sizes in a study

The equation for a combined or composite effect size, e_c, is found in Rosenthal and Rubin (1986, p. 402) as:

2a)  e_c = Σλ_i t_i / √{I[ρ(Σλ_i)² + (1 − ρ)Σλ_i²]}

where I is the index of the size of the study; in this study I was defined as (n − 1)/2. λ_i is the weight assigned to the ith dependent variable; equal weights of 1 were used for this study. ρ is the typical intercorrelation among the dependent variables; the mean of those studies listing intercorrelations was .45 in this study. t_i is defined as:

2b)  t_i = r_i√(df) / √(1 − r_i²)

The composite effect size was then expressed as a correlation by the equation:

2c)  r_c = e_c / √(e_c² + df)

where the df was N − 2. When unequal sample sizes were encountered, the harmonic mean was used in the calculation of both the I and df values.

Cumulation of effect sizes

The equation for calculating the mean population correlation, r̄, is from Hunter et al. (1982):

3a)  r̄ = Σ(N_i r_i) / ΣN_i

where N_i and r_i are the sample size and effect size for the ith study. This value is then used to calculate the variance across studies (Hunter et al., p. 41), which is the frequency weighted average squared error:

3b)  σ_r² = Σ[N_i(r_i − r̄)²] / ΣN_i

The error variance across studies is the average within-study variance, given by the "almost perfect approximation" (Hunter et al., 1982):

3c)  σ_e² = K(1 − r̄²)² / N

where K is the number of studies and N = ΣN_i, the total sample size.
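As a purely hypothetical illustration of these three equations (the values are invented for this purpose), suppose three studies of equal size reported r = .10, .45, and .30, each based on N = 100, so that K = 3 and N = 300. The mean correlation would be r̄ = (10 + 45 + 30)/300 = .283; the frequency weighted variance would be σ_r² = [100(.10 − .283)² + 100(.45 − .283)² + 100(.30 − .283)²]/300 = .0206; the sampling error variance would be σ_e² = 3(1 − .283²)²/300 = .0085; and the corrected variance would be .0206 − .0085 = .0121, giving a standard deviation of about .11.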

The variance due to measurement error is calculated by the following equation (Hunter et al., 1982, p. 83):

3d)  σ_me² = ρ̄²(b̄²σ_a² + ā²σ_b²)

where r_xx is the reliability of the independent variables and r_yy is the reliability of the dependent variables; a is the square root of r_xx and b is the square root of r_yy, ā and b̄ are their means across studies, and σ_a² and σ_b² are their variances across studies. The true score mean correlation is found by calculating

(Hunter et al., 1982, p. 80):

3e)  ρ̄_TU = r̄_xy / (āb̄)

The true score variance is found from (Hunter et al.,

1982, p. 80):

3f)  σ²_ρTU = (σ_r² − σ_e² − σ_me²) / (āb̄)²

The standard deviation of the true scores is:

3g)  σ_ρTU = √(σ²_ρTU)

The standard equation for calculating confidence intervals was used:

3h)  ρ̄_TU − 1.96σ_ρTU < ρ_TU < ρ̄_TU + 1.96σ_ρTU
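Continuing the hypothetical illustration above, and assuming for simplicity only that all measures were perfectly reliable (ā = b̄ = 1, so that σ_me² = 0), the true score mean correlation would simply equal r̄ = .283 with a standard deviation of .110, and the 95 percent confidence interval would be .283 ± 1.96(.110), or roughly .07 to .50.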

Test for heterogeneity of variance

The chi-square test for heterogeneity of variance was calculated by the following equation

(Marascuilo, 1971, p. 452):

4a)  U_0 = Σ(N_k − 3)(z_k − z_0)²

which is approximately distributed as chi-square with k − 1 degrees of freedom,

where k is the number of correlations (transformed into Fisher's z-scores); N_k is the number of subjects in the kth study; z_k is the z value for the kth study; and z_0 is the mean z score given by:

z_0 = Σ(N_k − 3)z_k / Σ(N_k − 3)
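Applied to the three hypothetical correlations used earlier (.10, .45, and .30, each with N = 100), the Fisher z values are .100, .485, and .310, each weighted by N − 3 = 97. The mean z score is z_0 = .298, so U_0 = 97[(.100 − .298)² + (.485 − .298)² + (.310 − .298)²] = 7.2, which exceeds the critical chi-square value of 5.99 for 2 degrees of freedom at the .05 level and would indicate significant heterogeneity.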

Post hoc contrast test

The statistical significance of the contrast, testing any specific hypothesis about the set of effect sizes, can be obtained from a Z computed as follows (Rosenthal, 1984, p. 84):

4b)  Z = Σλ_j z_j / √(Σλ_j²/w_j)

where λ_j is the contrast weight determined from some theory for any one study, such that the sum of the λ_j = 0; z_j is the Fisher's z for any one study; and w_j is the inverse of the variance of the effect size for each study. For Fisher's z transformations of the effect size r, the variance is 1/(N_j − 3), so w_j is N_j − 3.
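For the same three hypothetical studies, a contrast comparing the first and third studies (λ = +1, 0, −1) would give Z = (.100 − .310)/√(1/97 + 1/97) = −.21/.144 = −1.46, which does not reach the .05 level of significance.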

APPENDIX D

BASIC Program

The format of the data file is one value per line. The first two lines are: #1 Number of effect sizes in the data file; #2 Descriptive name of the data set. The values to be analyzed then follow, with one data value per line. The program was written for a Radio Shack TRS-80 64K Color

Computer 2.
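As an illustration, a hypothetical data file for three effect sizes (the values below are invented and purely illustrative) would contain the count, a descriptive name, and then an identification number, effect size, and sample size for each study, one value per line:

3
EXAMPLE DATA
1
.10
100
2
.45
100
3
.30
100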

CLEAR 1500
D=0:E=0:SM=0:SN=0:SV=0:XR=0:SNZ=0:SZ=0
EV=0:V=0:STD=0:XSZ=0:ZV=0:SMM=0
CLS
INPUT "NAME OF DATA FILE";G$
INPUT "NAME OF OUTPUT FILE";H$
CLS
PRINT "PUT IN DATA CASSETTE AND"
PRINT "POSITION DATA FILE"
PRINT "PRESS"
INPUT "PRESS TO CONTINUE";T$
OPEN "I",#-1,G$
IF EOF(-1) THEN END
INPUT #-1,A,B$
DIM I(A):DIM R(A):DIM N(A):DIM Z(A):DIM NZ(A)
DIM I$(A):DIM R$(A):DIM N$(A):DIM Z$(A):DIM NZ$(A)
DIM SMM(A)
FOR X=1 TO A
IF EOF(-1) THEN END
INPUT #-1,I(X),R(X),N(X)
I$(X)=STR$(I(X)):R$(X)=STR$(R(X)):N$(X)=STR$(N(X))
NEXT X
CLOSE #-1
PRINT "PUT IN OUTPUT CASSETTE AND"
PRINT "POSITION TAPE"
PRINT "PRESS"
INPUT "PRESS TO CONTINUE";S$
OPEN "O",#-1,H$
PRINT #-1:PRINT #-1:PRINT #-1
A$="     "
PRINT #-1,A$+B$
PRINT #-1:PRINT #-1:PRINT #-1
PRINT #-1,"INPUTTED DATA":PRINT #-1
FOR X=1 TO A
E$=A$+"ID NUMBER"+I$(X)+A$+"EFFECT SIZE"+R$(X)+A$+"SAMPLE SIZE"+N$(X)
PRINT #-1,E$:PRINT #-1
NEXT X
PRINT #-1:PRINT #-1
PRINT #-1,"PRODUCTS OF EFFECT SIZE*N"
FOR X=1 TO A
SMM(X)=N(X)*R(X)
SM=SMM(X)+SM
SN=N(X)+SN
SM$=STR$(SMM(X))
PRINT #-1,A$+I$(X)+A$+SM$
NEXT X
SM$=STR$(SM)
SN$=STR$(SN)
XR=SM/SN
XR$=STR$(XR)
PRINT #-1,A$+SM$+"/"+SN$
FOR X=1 TO A
D=R(X)-XR
E=D^2
SV=(N(X)*E)+SV
NEXT X
RV=SV/SN
RV$=STR$(RV)
EV=(A*((1-XR^2)^2))/SN
EV$=STR$(EV)
V=RV-EV
V$=STR$(V)
STD=SQR(V)
STD$=STR$(STD)
PRINT #-1
E$=A$+"MEAN EFFECT SIZE"+XR$
PRINT #-1,E$
PRINT #-1
E$=A$+"VARIANCE IN EFFECT SIZE"+RV$
PRINT #-1,E$
PRINT #-1
E$=A$+"VARIANCE DUE TO SAMPLING ERROR"+EV$
PRINT #-1,E$
PRINT #-1
E$=A$+"CORRECTED VARIANCE IN EFFECT SIZE"+V$
PRINT #-1,E$
PRINT #-1
E$=A$+"STANDARD DEVIATION IN EFFECT SIZE"+STD$
PRINT #-1,E$
SNZ=0
FOR X=1 TO A
Z(X)=.5*LOG((1+R(X))/(1-R(X)))
Z$(X)=STR$(Z(X))
NZ(X)=N(X)-3
NZ$(X)=STR$(NZ(X))
SNZ=NZ(X)+SNZ
SNZ$=STR$(SNZ)
SZ=(Z(X)*NZ(X))+SZ
SZ$=STR$(SZ)
PRINT #-1
E$=A$+"ID NUMBER"+I$(X)+A$+"FISHER Z VALUE"+Z$(X)
PRINT #-1,E$+A$+"VAR 1/(N-3) "+NZ$(X)
NEXT X
XSZ=SZ/SNZ
XSZ$=STR$(XSZ)
PRINT #-1:PRINT #-1
PRINT #-1,A$+"SQUARED DIFFERENCES OF Z(X) AND MEAN Z"
FOR X=1 TO A
PZX=(Z(X)-XSZ)^2
E$=STR$(PZX)
PRINT #-1
PRINT #-1,A$+I$(X)+NZ$(X)+E$
ZV=NZ(X)*((Z(X)-XSZ)^2)+ZV
NEXT X
ZV$=STR$(ZV)
SZ$=STR$(SZ)
SNZ$=STR$(SNZ)
PRINT #-1
PRINT #-1,A$+"SUM OF DIFFERENCES "+SZ$+A$+"SUM OF N "+SNZ$
PRINT #-1
E$=A$+"MEAN FISHER Z"+XSZ$
PRINT #-1,E$
PRINT #-1
E$=A$+"CHI-SQUARE SCORE"+ZV$
PRINT #-1,E$
DF$=STR$(A-1)
Y$=STRING$(3,95)
PRINT #-1
E$=A$+"FOR ALPHA = .05, CHI-SQUARE("+DF$+") ="
PRINT #-1,E$+Y$
CLOSE #-1
PRINT "DO YOU WISH TO RUN ANOTHER DATA"
PRINT "FILE? Y/N"
INPUT C$
IF C$="Y" THEN 100 ELSE END

END

REFERENCES

Balling, S.S., Weiss, R.P., & Steigleder, M.K. (1985). Motivation at Hawthorne: Attention reconsidered. Unpublished manuscript.

Bramel, D., & Friend, R. (1981). Hawthorne, the myth of the docile worker, and class bias in psychology. American Psychologist, 36, 867-878.

Bullock, R. J., & Svyantek, D. J. (1985). Analyzing meta-analysis: Potential problems, an unsuccessful replication, and evaluation criteria. Journal of Applied Psychology, 70, 108-115.

Carey, A. (1967). The Hawthorne studies: A radical criticism. American Sociological Review, 32, 403-416.

Cohen, J. (1977). Statistical power analysis for the behavioral sciences. New York: Academic Press.

Feldman, J. (1982). Ideology without data. American Psychologist, 37, 857-858.

Franke, R.H., & Kaul, J.D. (1978). The Hawthorne experiments: First statistical interpretation. American Sociological Review, 43, 623-643.

Glass, G.V. (1976). Primary, secondary, and meta-analysis of research. Educational Researcher, 5, 3-8.

Glass, G.V., McGaw, B., & Smith, M.L. (1981). Meta-analysis in social research. Beverly Hills: Sage.

Green, B.F., & Hall, J.A. (1984). Quantitative methods for literature reviews. Annual Review of Psychology, 35, 37-53.

Hedges, L.V. (1981). Distribution theory for Glass's estimator of effect size and related estimators. Journal of Educational Statistics, 107-128.

Hedges, L.V. (1982a). Estimation of effect size from a series of independent experiments. Psychological Bulletin, 92, 490-499.

Hedges, L.V. (1982b). Fitting categorical models to effect sizes from a series of experiments. Journal of Educational Statistics, 7, 119-137.

Hedges, L.V. (1982c). Fitting continuous models to effect size data. Journal of Educational Statistics, 245-270.

Hedges, L.V. (1983). A random effects model for effect sizes. Psychological Bulletin, 93, 388-395.

Hunter, J.E., Schmidt, F.L., & Jackson, G.B. (1982). Meta-analysis: Cumulating research findings across studies. Beverly Hills: Sage.

Landsberger, H. A. (1958). Hawthorne revisited. Ithaca, NY: Cornell University.

Landy, F. J., & Farr, J. L. (1980). Performance rating. Psychological Bulletin, 87, 72-107.

Light, R.J., & Smith, P.V. (1971). Accumulating evidence: Procedures for resolving contradictions among different research studies. Harvard Educational Review, 42, 1361-1380.

Locke, E.A. (1982). Critique of Bramel and Friend. American Psychologist, 37, 858-859.

Mathieu, J.E., & Tannenbaum, S.I. (1983). Analyzing the analyses today. In S.I. Tannenbaum and R.J. Jones (Chair), Symposium presented at the Fourth Annual National I/O & O.B. Graduate Student Convention, Chicago, Illinois.

Mayo, E. (1933). The human problems of an industrial civilization. New York: Viking.

Mayo, E. (1945). The social problems of an industrial civilization. New York: Viking.

Orwin, R.G., & Cordray, D.S. (1985). Effects of deficient reporting on meta-analysis: A conceptual framework and reanalysis. Psychological Bulletin, 97, 134-147.

Parsons, H.M. (1974). What happened at Hawthorne? Science, 183, 922-932.

Parsons, H.M. (1978). What caused the Hawthorne effect. Administration and Society, 10, 259-283.

Parsons, H.M. (1982). More on the Hawthorne effect. American Psychologist, 37, 856-857.

Pennock, G.A. (1930). Industrial research at Hawthorne: An experimental investigation of rest periods, working conditions and other influences. Personnel Journal, 8, 296-313.

Pitcher, B.L. (1981). The Hawthorne experiments: Statistical evidence for a learning hypothesis. Social Forces, 60, 133-149.

Roethlisberger, F.J. (1941). Management and morale. Cambridge: Harvard University Press.

Roethlisberger, F.J., & Dickson, W.J. (1939). Management and the worker. Cambridge: Harvard University Press.

Rosenthal, R. (1978). Combining results of independent studies. Psychological Bulletin, 85, 185-193.

Rosenthal, R. (1979). The "File drawer problem" and tolerance for null results. Psychological Bulletin, 86, 638-641.

Sawyer, J. (1966). Measurement and prediction, clinical and statistical. Psychological Bulletin, 66, 178-200.

Schlaifer, R. (1980). The relay assembly test room: An alternative statistical interpretation. American Sociological Review, 45, 995-1005.

Schmidt, F.L., & Hunter, J.E. (1977). Development of a general solution to the problem of validity generalization. Journal of Applied Psychology, 62, 529-540.

Schmidt, F.L., Hunter, J.E., Pearlman, K., & Hirsh, H.R. (1985). Forty questions about validity generalization and meta-analysis. Personnel Psychology, 38, 697-798.

Shepard, J.M. (1971). On Alex Carey's radical criticism of the Hawthorne studies. Academy of Management Journal, 14, 23-32.

Sonnenfeld, J. (1982). Clarifying critical confusion in the Hawthorne hysteria. American Psychologist, 37, 1397-1399.

Stagner, R. (1982). The importance of historical context.

American Psychologist, 37, 856.

Strube, M.J., Gardner, W., & Hartmann, D.P. (1985). Limitations, liabilities, and obstacles in reviews of the literature: The current status of meta-analysis. Clinical Psychology Review, 5, 63-78.

Tannenbaum, S.I., & Jones, R.J. (1983). Meta-analysis: Overview and point of departure. In S.I. Tannenbaum and R.J. Jones (Chair), Symposium presented at the Fourth Annual National I/O & O.B. Graduate Student Convention, Chicago, Illinois.

Toch, H. (1982). Sed qui dicit, non qui negat? American Psychologist, 37, 855.

Vogel, W. (1982). A Judeo-Christian rejoinder to Bramel and Friend's Marxist critique of the capitalist "Hawthorne effect." American Psychologist, 37, 859-860.

Wardwell, W.I. (1979). Critique of a recent professional "put-down" of the Hawthorne research. American Sociological Review, 44, 858-867.

Whitehead, T.N. (1936). Leadership in a free society. Cambridge: Harvard University Press.

Whitehead, T.N. (1938). The industrial worker (Vols. 1-2). Cambridge: Harvard University Press.
