Sub to AMR

RETHINKING ASSESSMENT AS PRACTICE:

NOTES ON MEASURES, TESTS AND TARGETS

Brigitte Jordan

Palo Alto Research Center

3333 Coyote Hill Road

CA 94304, Palo Alto

(650) 747-0155

[email protected]

Peter Putz

Johannes Kepler University Linz

Altenbergerstr. 69

A-4040 Linz, Austria

+43 732 2468 9447

[email protected]

Last changes: June 14, 2002

Acknowledgments There are many individuals who read earlier drafts of this paper and provided consequential critique and support. We owe particular thanks to Jim Greeno, Robert

Bauer, Victoria Belotti, Carole Browner, Danielle Fafchamps, Bob Irwin, Brigitta Nöbauer,

Mike Perdue, Ron Simons, Judy Tanur, Roxana Wales and Marilyn Whalen. They are, of course, somewhat responsible for the ideas we present here.

1 Abstract Formal, “documentary” assessments are ubiquitous in modern organizations.

Yet, their frequently occurring, unintended negative consequences for work practices, for corporate decision-making, and for the structure and culture of an organization remain largely unexamined. Distinguishing inherent, discursive and documentary assessment types, we identify a range of suboptimal outcomes associated with inappropriate reliance on documentary assessment methods and propose a novel three-part framework that provides recommendations for managers and researchers for the improvement of assessment practices.

2 There are two equally important observations that emanate from our work on learning and assessment at the Xerox Palo Alto Research Center and the (now defunct) Institute for

Research on Learning:1

 assessment is a normal, ubiquitous part of all social interaction;2

 formal assessment methods as used in organizations frequently lead to undesirable

behavioral results.

These observations are not of a kind. They are like dinosaurs and humans, both bipedal but otherwise occupying rather distinct life spheres. Nevertheless, we might learn something by considering them together. It may just be the case that part of the problem with assessment is precisely that the first observation has not been taken seriously. In this paper we will talk about both and attempt to draw out their importance for rethinking assessment in workplaces and schools. The central proposition here is that, in addition to the standard tests and performance measurements that are routinely administered in our institutions, there are other important forms of assessment that are not usually recognized. These can give us important insights and provide leverage for restructuring the way assessment systems are designed.

What emerges out of our work is a framework that complements and amplifies recent thinking around measurement practice, puts formal assessment in perspective and recognizes it as only one piece (albeit an important one) of the formal and informal judgments about performance that play a crucial role in schools, work places, and everyday life.

We start our discussion with a characterization of two kinds of assessments that are produced “on-the-fly” as natural parts of mundane social activities by individuals and groups: inherent assessments and discursive assessments. We then contrast these with formal, standardized measurements used in organizations, which we call documentary assessments. A comparison of all three kinds of assessments summarizes their respective features and allows us

3 to develop a fresh perspective on the practice of assessments in organizations.

INHERENT ASSESSMENT

We note that some form of assessment is a natural part of all socially situated activities.3

Informal assessments occur all around us, whenever human beings get together to accomplish any sort of activity in a collaborative, cooperative way. Whenever we take the time to look and notice, we find naturally occurring, unremarkable and unremarked assessment activities that nevertheless are fundamental in making human collaboration possible. Beyond collaboration, and even more fundamental, is the fact that in-a-glance, on-the-fly mutual assessments underlie all of human sociality and in fact the sociality of all social species. For every creature that depends on interaction with others of its kind, to be able to continuously and unproblematically ascertain the state of others in their environment is a fundamental requirement for survival. Input from these assessments is deeply linked to emotional states, to fight-or-flight responses, nurturing and aggressive behaviors. Some examples of inherent assessments:

• In Conversation: A listener "looks puzzled" -- the speaker rephrases what she just said. Both individuals have made an assessment. Such mutual checks on understanding are continuously carried out when people talk to each other formally, as in speeches and interviews, or informally in casual conversation (C. Goodwin, 1986; Goodwin & Goodwin, 1987; M.

Goodwin, 1980; Pomerantz, 1978, 1984).

• Mothers and Children: Mothers are constantly called upon to assess their children's acquisition of various skills - as are other family members, particularly siblings who are keen judges of younger sisters' and brothers' acceptability in play groups (or work groups in other cultures). A mother makes dozens of such judgments daily, i.e. when she gives the child a spoon rather than a fork with which to eat, when she does or does not let her cross the street

4 alone, let her go to the neighbor's house, or tell her to get her shoes tied (or not).

On a videotape,4 we observe the routine morning activities of a family. The tape shows us mother, four-year old daughter, and baby Amy baking muffins for breakfast. Daughter is standing on a chair filling muffin cups, mother watching, baby crawling up and pulling herself up to the table, looking on. Intense monitoring is going on of everybody by everybody. The mother at one point holds down the paper muffin cup with thumb and finger because, as daughter puts in heavy, sticky batter with a wooden spoon, the batter doesn't loosen from spoon and pulls up the cup. Next time, daughter holds down the cup the same way. Now some batter is on her finger. She licks it off (which could be seen as cleansing it) with a quick sideways look at mother. Mother lightly dips her finger into the muffin cup they've just been working on and licks it appreciatively, sort of showing that it's okay to have a lick for the taste and not just for the cleaning. At that point the daughter sticks her finger into the bowl and licks it and baby Amy who's been watching the proceedings with great interest also gets her fingers into a muffin cup.

What we see here is a constant, ongoing assessment of activities, of what's okay and what isn't, of how to handle spills, of when a cup is “full enough.” The monitoring is mutual, by the mother of the children in order to keep them appropriately integrated in the activity and by the children of the mother in order to acquire the necessary skills for carrying out the tasks at hand.

And the baby is tracking it all.

• Teachers and Students: In a classroom we might see a teacher wandering over to a child she expects to have trouble with an assignment. She takes one look at what the child is doing, and wanders back to her desk, having judged the child to be doing okay. Students as well make judgments about each other's competence, an assessment that becomes apparent in the ways a joint problem is attacked. In contrast to externally imposed tests, these endogenously generated assessments are inherent in the social scene of ongoing classroom activities. They are

5 generated by participants in the course of those very activities.

• Professional Judgment: A Maya village midwife is called to the hut of a woman in labor. As soon as she arrives with her apprentice, a messenger calls her to another birth. She turns to the apprentice and says: "You stay here and take care of Dona Rosa; I'll go and see about

Dona Elbia."5 An assessment has occurred. Without saying it in so many words, the senior midwife has judged her apprentice competent to conduct a birth on her own. Notably, this assessment occurs in a real-life situation under the pressures and requirements of getting real work6 done. There is no abstract claim to expertise here, nor any formal conveyance of authority, but rather, in the urgency of the situation the apprentice is judged to be competent. She is immediately called upon to demonstrate that competence -- not abstractly, in a written examination, but in the conduct of a real birth, in an environment where she will be judged as to her ritual, manipulative, herbal, medical and interactional competence by experts -- other women who have had children and who also attend the woman in labor.

• From Novice to Expert: On a videotape of an airline operations room we observe a supervisor and four operators, one of whom is new on the job. At one point the supervisor propels himself on his wheeled office chair from his own workstation into the midst of the four operators, with his back to the rookie. The supervisor is engaged in looking at a bank of video monitors, which are more easily scanned from that position than from his desk. After a while he asks softly, of no one in particular: "Has 464 landed?" The new operator punches something into her terminal and then answers: "They're about to land." The supervisor nods and continues to make annotations on a sheet of paper.7

Again, an assessment has happened. What the supervisor -- and the rookie's co-workers who overhear the interchange -- find out is that the new operator not only knows how to get the required flight information (which the supervisor could have found out by checking his own

6 computer or asking her directly) but also that she is in command of a much higher level skill: to judge correctly when it is up to her to provide answers to generally posed questions. The assessment was accomplished by the supervisor in the course of, and incidental to, acquiring a useful piece of information, not something contrived for purposes of testing.

We call these embedded-in-life-activity assessments “inherent assessments” because they are a chronic feature of all human conduct in the ongoing flow of activities8. They occur routinely, effortlessly and unavoidably as part of any non-solitary human activity where people rely on a shared sense of purpose. Thus we find inherent assessments as part of routine work activities as well as in the normal home-, family-, sports-, and recreational activities of human beings. As a matter of fact, all of us make assessments of each other all the time, assuming that we each can or can not, will or will not, do certain sorts of things. Furthermore, we are constantly updating our assessments in the light of ongoing activities. Inherent assessments are not usually made explicit in the form of verbal utterances or mental descriptions but tend to remain in the sphere of practical consciousness.

In everyday social life as well as in work situations, inherent assessments are made in the interest and for purposes of the individual attempting to align (or misalign) with the group. As such they constitute one of the fundamental mechanisms by which learning occurs, including the kinds of incidental learning we simply think of as normal parts of human development. Even a baby or toddler continuously assesses approval or disapproval of its actions by family members.

Newcomers adjust their talk and nonverbal interactions to those of a workgroup they are entering. Neophytes become full members of communities of practice by quickly and unobtrusively monitoring responses and reactions of other members (and thereby gaining access to the group norms they need and want to adopt).

Work groups are concerned pragmatically with task accomplishment within an ongoing

7 work- and life space; with newcomers fitting in, with old-timers aligning, adjusting, generating divisions of labor, and the like. Their assessment criteria are based on the affordances of ongoing situated activities rather than on underlying abstract skills. Inherent assessments occur spontaneously because they fulfill a necessary function in people's coordination with each other.

People need them in order to align themselves with the group, to fit into the dance. It is the mutual assessment that tells them they are ready (or not) to take the next step and do the next task. Though tacit and implicit, inherent assessments are absolutely crucial for smooth, interpersonal interaction and for carrying out the work of a community of practice or any other social formation.

DISCURSIVE ASSESSMENT

Sometimes, in the course of an activity, participants may find a reason to make the unspoken, inherent assessment explicit. Parents may talk to their children about what they are already able to accomplish and what they will be able to do soon. A workgroup may begin to talk about how they are doing, how much more remains to be done, that a particular worker is lagging and why, the impact of defective parts on the speed of the assembly line, or the effect of a plane delay on activities in an airport. When assessment becomes explicit and shared within the group, we call it "discursive assessment."9

For example, if we listen in on the talk in an airline’s “ops room” (the control station from which ground operations are monitored and guided), we find ops talk chock full of statements that indicate to insiders where they are in the complicated process of taking airplanes in and out. We might see a pilot leaning over the counter that separates the ops work area from a break room in which mechanics, “ramp rats” and other personnel are taking their breaks. He says to one of the operators (but loud enough so everybody in both areas can hear): “They are

8 changing the oil filter on number one.” The operator, nodding, responds “right” and the pilot goes on: “I asked them if this would involve a delay? They said it's gonna be close. So I thought you might like to know and the gate might like to know also. …” He trails off. The operator nods repeatedly while the pilot talks. Then he says: “Great! Uh, is this a … what’s the problem involved?” To which the pilot responds by explaining the technical reason for the trouble (“The bypass pin has popped out. Meaning the filter is contaminated”) and what is being done about that. This kind of assessment of the state of the work, the implications of which (who else needs to act? who needs to be notified? what happens if the problem can’t be fixed?) are contextually clear to all those present, or become a topic for further talk and negotiations to determine what needs to be done. They are not only exceedingly common but also absolutely essential for a smooth workflow.

Discursive assessments -- like inherent assessments -- are generated within the group to figure out, collaboratively, what state the group is in and what to do about that. The difference is that while inherent assessments rely on individual nonverbal monitoring, resulting in individual behavioral adjustment, discursive assessments make issues public, propose common standards, suggest and enforce divisions of labor and monitor group behavior such that the work will get done. They generate information and documentation not only for internal use but also as a way to justify the group’s actions, document its successes, or account for its failures externally.

Note that discursive assessments are socially mobile. They can be referred to, doubted, agreed with, or revised by people who are part of the group. They have become "social objects" that have a life within the group. Unlike inherent assessments, they have persistence. People can point to them at a later time; they can be passed around, discussed, and modified. They are group property, used by members of the group individually and collectively to advance the group’s enterprise.

9 We saw in a prior section that becoming a competent member of a workgroup requires learning to carry out inherent, tacit assessment procedures. But that is not all. Though unacknowledged, the ability to talk about the ongoing activity in an evaluative way, i.e. to produce a discursive assessment, is equally crucial. By showing competence in group- endogenous assessment activities, members, and particularly new members, demonstrate their proficiency and gain acceptance. Discursive assessments are exceedingly common in work situations where a continuous, real-time evaluation of a particular situation or procedure is imperative in order to update all parties involved about the current state of their world. Routine work is filled with huge numbers of these.10

Discursive assessment, the reflective talk generated by a group of people engaged in a particular activity about that activity is an important vehicle for learning and innovation. Group- internal discursive assessment functions to facilitate reflection and thereby plays a significant role in learning. It creates a shared understanding of individual roles and responsibilities and thereby works out a division of labor. At the same time, these informal assessments create a public verbal representation of the capabilities, resources and issues for a group that enable them to consider implications of the current state, as they understand it, for behaving more effectively.

Like inherent assessments, they are often produced on the fly, effortlessly, without official training, as part of the ongoing activities of schools, workplaces and life spaces. This can happen in the hallway or the coffee shop as readily as during dedicated work activity.11

DOCUMENTARY ASSESSMENT

The third type of assessment is the one we are all most familiar with. It involves externally mandated, stable symbolic representation of an evaluation in the form of tests, surveys, checklists, plans, targets and similar instruments. We call this "documentary

10 assessment."

Documentary assessment occurs when an enduring record of some kind is produced, a set of marks on a piece of paper (or on a computer), documentation that is reflective and evaluative of some activity. In inherent and discursive assessment no independent record gets created; the assessment just gets folded back into the activity, is a part of the activity, molds the activity, but is not externally available. In documentary assessment, on the other hand, we see an extension of the mobility of the assessment beyond the group. It has the properties of "immutable mobiles"

(Latour, 1986), that is to say, once constructed, the content becomes fixed while the assessment itself becomes mobile as a document. Unlike discursive assessments whose life is restricted to the group that spawned them, documentary assessments exist independent of the situation in which they were generated. They become public documents that can move within a larger socio- economic system.

Mobility and immutability are the features that allow distant actors to draw upon and appropriate documentary assessments. In inherent assessment, every participant acts on her or his individual evaluation of the situation. In discursive assessment, it becomes group property. In documentary assessment, the record becomes mobile beyond the group out into the larger system and possibly the world. It is no longer carried out simply and naturally as an integral, necessary part of working or learning or recreational activities but becomes available for other functions. In most cases, a documentary assessment is imposed from the outside, carried out in the interest of some superordinate group, generally to further interests that do not directly overlap with those of the group being assessed.

Documentary assessment occurs primarily to evaluate the extent to which pre-established performance targets have been achieved and to establish cross-group comparability. In schools, a teacher's entries in her grade book or students' scores on standardized tests are less designed to

11 facilitate students' learnings than to satisfy the requirements of ranking this student amongst others (the teacher's job) or ranking this school amongst schools in the district and the state (the school superintendent's job), carried out to respond to the interests of political constituencies and not primarily to the interests of the students engaged in the business of learning.

For distant users documentary assessments are means to grasp what is going on in remote places and to intervene there. Comparison is the most salient operation in this context. For the sake of comparison, the great variety of real-world phenomena has to be reduced. If we take an ongoing, lived experience and if we then produce a written representation of that (fill out a form, answer a questionnaire), certain consequences ensue. We are no longer operating in the lived world of experience but within a symbol system. This is known as a translation of an experiential analog to a symbolic digital system, which always involves loss of contextual information. The loss can be significant. If the teacher gives a student a "C" or the supervisor assigns a

"satisfactory" to a worker's performance, this makes that particular entry comparable to others but shears it of its local context. Did the operator deal with a particularly difficult customer?

Was the workgroup involved in training efforts that took time away from production? How does this compare to what the student or worker is capable of? The assessment provides a number or a symbol. It says "C" or "satisfactory." No more, no less. Whoever receives this information is left ignorant about the why or how, and what this could mean for getting the work done.

But this very loss of contextual information inherent in the production of documentary records buys us two important features: data consistency and expanded distribution. The enormous potency of documentary assessments for the coordination and control of organizations is based on this ability to gather consistent, thus comparable data from distant subunits to any place where decisions are made. Stripping the context is precisely what makes comparison possible without regard to individual circumstances and this is why documentary assessment is

12 always called for when “objectivity” is an issue. As records become public beyond the group of origin, they become available for others, be that a person or an organization or institution, to come to some conclusion about how well things have gone in that particular working/learning system. In documentary assessment, the intention is to move assessment from the hands of interested parties to some sort of formal, objective procedure. We now require instruments that are insensitive to differences in ethnicity, gender, social status, work and learning environment, personal histories and relationships, and all other potential sources of bias. How is that possible?

One distinction between naturally occurring (inherent and discursive) assessments and formal documentary assessments is that the former include an affective component while the latter specifically exclude that, i.e. participants do not share the same Lebenswelt. Expertise, as judged via endogenous assessment is based on co-experience, while documentary assessment tends to be based on externally relevant standards.

Documentary assessments facilitate coordination of different, partially conflicting interests among the parties involved in an organization: employees in various managerial and operational positions, owners, governmental agencies, customers and suppliers. A particular assessment result, e.g. the number of students enrolled in a specific program, carries quite different meanings and calls for different actions whether the number is used by a resource allocating board, a designer of a curriculum, a professor in choosing teaching designs or a potential student deciding whether to join the program or not. Yet it unifies them by having a clear point of reference. By virtue of the shared symbol system documentary assessments employ

(consisting of numbers or standard evaluative statements) they become linking pins between diverse stakeholders involved in corporate and public affairs. In practice, assessments become

“boundary objects”, objects that are plastic enough to adapt to local needs and the constraints of the several parties employing them, yet robust enough to maintain a common identity across sites

13 (Star, 1989: 393).

In a deeper sense, the choice of questions to be asked and variables to be measured via these assessment instruments makes certain aspects of an organization’s activities visible and hides others. The very structure of the assessment instruments and the manner of their administration can be seen as indicative of an applied theory about how a particular organization sees itself working. In this sense documentary assessments are manifestations of a management rationale, providing a complex set of vocabularies and meanings that are linked to the educational, political and economic concepts the organization is devoted to (Miller, 1994).

The work that documentary assessments do for the control and coordination of organizations cannot be sufficiently understood without taking into account the perspective of frontline workers. They usually understand that some degree of comparability in the productivity of local workplaces is necessary for successfully running a company and they have some tolerance for measuring to what extent predefined performance objectives have been reached.

However, they also face the problem that many of the measures they are required to produce do little to help them do a better job or learn more effectively. In the workplace, the mutual adjustments made possible through inherent assessment and the shared understanding generated by discursive assessment are now joined by the requirements of accounting for success or failure externally, i.e. measuring performance as conceptualized by local and corporate managers.

Workers also suspect that management might draw on these kinds of data to justify decisions about resource allocation, plant closings and the fate of projects and individual workers. When performance data get used to punish and reward a strong motivation is generated to manipulate the numbers. The game becomes one of making the numbers look good rather than improving the learning or work process. The fact that documentary assessments are carried out in the interest of external constituencies has effects on the way data are produced, work practices

14 are changed and relationships of collaboration, openness and trust are affected. It is this fragile balance between usually unintended dysfunctional impacts and the beneficial features of documentary assessments that makes them such a tricky yet consequential means in the effort to coordinate organizations.

FROM INHERENT TO DISCURSIVE TO DOCUMENTARY ASSESSMENT:

AN EXTENDED EXAMPLE

Let us take a look at a real-world situation where assessment activities are rampant (as they are in all work situations).

A group of farm workers is harvesting onions.12 Each worker goes down a row, pulls up a bundle of onions by the withered greens, holds the bunch over a burlap bag and cuts the greens with a pair of clippers so that the onions fall into the sack. As the sack gets full, the worker carries it to the end of the row where it will be counted by the foreman and picked up by a truck.

Now, at any time during the work, be it for purposes of coordinating a break with a friend or for seeing who's ahead or who's not keeping up and might need help, a sweeping glance over the field is all that's necessary for an assessment. If they all started at the edge of the field, their very position shows if they are "ahead" or "behind." In addition, the quantity of bags at the end of rows provides an immediate measure (in volume, even if not counted) of such things as how far along they are towards the next pickup. These assessments are inherent to the activity; you couldn't be a team member without making them and they provide valuable resources for the sequencing of subtasks, for energy expenditure, for judgments about a person's strength, state of health, needs for money, and the like.

At some point, assessment becomes discursive. One of the workers suggests that if they each did seven more sacks they would be done with the field. At that point, people begin to count

15 how much they've done already and estimate how many more onions are out there to be picked.

Now assessment becomes a topic for discourse in the community; judgments are made, questioned, negotiated, but note that this is done entirely for purposes of the group.13

Now the foreman comes and writes down numbers for each crewmember. At this point, a set of marks is created which shows characteristics of its own, i.e. the numbers can be added; an average can be computed; who has picked the most and the amount of money due each worker can be determined. At the same time, looking only at the numbers without having available the field and the situation of the workers loses information. For example, anybody looking at the field can see that one border of the field has much smaller onions than the rest. So the fact that the worker who has those rows produces fewer bags is not to be seen as lower productivity. We also lose information on how tightly filled the bags are or who is helping whom.

We call the foreman's record keeping a documentary assessment. Documentary assessments provide a stable rather than evanescent record of activities. They involve the use of some sort of representation, a system of symbols, be they numbers or hatch marks or notes, at any rate a set of marks on a piece of paper, a record of some kind, some documentation that is reflective and evaluative of the activity. In inherent and discursive assessment on the other hand, no persistent record gets created; the assessment molds the activity, but is not externally available.

In the evening, when the foreman compiles the numbers for his crews, he doesn’t care how “hard” or under what circumstances people worked, he only cares about what numbers he has to deliver to the big boss who will need them in order to pay the workers at the end of the week. He could single out particular workers as “bad apples” but in this particular case, since these work groups are comparatively autonomous, held together by ties of blood and fictive kinship, he won’t bother to do that (though that is definitely a dreaded possibility in many work

16 situations).

To summarize: Individuals undertake inherent assessments in in face-to-face situations, that is, when engaged in activities with other people in the immediate proximity of time and space. Discursive assessments, like inherent assessments, are made in situations of physical or virtual co-presence; however the reflections are made explicit mainly in form of verbal utterances. Through language, assessments gain the character of ‘social objects’ that members of a work group or social community can refer to, negotiate about, and point to even at some temporal and spatial distance. They generate a shared understanding of the situation, legitimate a distribution of power and establish shared rules of conduct. Documentary assessments can travel across extended spans of time and space. As symbolic representations, these records are designed to be relatively independent of individual actors who are engaged in the construction or use of the assessment. The abstraction from intricacies of the lived-in world makes possible a variety of operations like the comparison of different. Once produced, the content of a documentary assessment becomes immutable, yet by virtue of the missing context remains open to interpretation14. Table 1 summarizes the distinct features of the three types of assessments:

------

Insert Table 1 about here

------

THE LIMITATIONS OF DOCUMENTARY ASSESSMENTS

Unanticipated, unwelcome consequences of over-reliance on formal documentary measurements are well known to workers and managers alike. In addition, investigators from disciplines such as the anthropology of work and CSCW (Computer Supported Cooperative

17 Work) who have conducted detailed ethnographic studies of workplaces and work practices know that the introduction of formal evaluation schemes often generates unanticipated “work- arounds” that undermine the intent of the assessment. This phenomenon is widespread and it is indeed surprising that organizational and management theory largely ignores these costs.15

We see the potential negative consequences of documentary assessments as falling into three categories: a) manipulating the numbers, b) changing work practices, and c) codifying organizational structure, climate, and culture.

Sometimes only one or two of these occur. At other times, however, an appalling blend of all three can be observed. For example, when call center operators forward a troublesome call to cut it short so they can make their targets, this simultaneously changes call statistics, affects work practices and influences resource allocation and customer satisfaction.

Fixing the Numbers

In our own ethnographic workplace studies (Jordan, 1996, 1997; Tanur & Jordan, 1995) we have observed the pervasive occurrence of a coping strategy we call “fixing the numbers”. By that we mean instances where the numbers as reported and entered into the official record are actually different from those an independent observer would enter. For example:

 When we investigated collaboration in an airlines operations room, we learned about something called “delay coding.” It consists of assigning error codes to plane delays as they occur. This is done on a prominent whiteboard, visible to all ops room workers, where each category of error has an allowable monthly limit. What happens if, towards the end of the month, a particular error category is dangerously close to that limit? The delays are likely to be coded as due to a different error.

 In studies of telephone repair people we came to understand that the number of

18 service calls they are expected to complete often outstrips their working hours, especially when the weather is bad or they have several difficult calls in a row. What do they do? They use a

“customer-not-at-home” code for justifying service calls that were aborted under the pressure of an overwhelming workload.

It is obvious that such strategies produce flawed performance data, which subsequently may lead to wrong managerial decisions. This can produce grave overestimation of productivity and other crucial indicators of organizational well-being. Given the central role numerical data, trends, and projections play in managerial decision-making, this is a serious issue.

Changing Work Practices

Typically organizations introduce novel documentary assessments in order to impact work practice in a desired way: “What gets measured, gets done”. In most cases such seemingly straightforward thinking is too simplistic, though. In reality we observe quite problematic side effects including superficial, cosmetic adaptation to the requirements of documentary instruments and sacrificing overall goals to optimize performance as measured by some limited indicators. This is particularly likely when evaluation requirements have been imposed from the outside, as, for example, in certification and accreditation procedures. In that case, workers and line managers may collaborate to make things look right, with the silent agreement of top decision makers. For example:

 Bueno Castellanos, an industrial anthropologist, investigated changes that occurred in the Mexican auto parts industry when Quality Standard norms were introduced.

Conforming to these norms is required by Chrysler, Ford and General Motors (who to a large degree control the Mexican auto parts export business) in order to homogenize production standards. Under these circumstances, the interest of workers and managers at the Mexican plants is not in quality itself but in accreditation, which is achieved by successfully passing an

19 audit during a plant visit by outside auditors and filling out a variety of forms. When an audit is impending, frantic activities can be observed: All workplaces are cleaned and painted with special attention to machines, hallways, the dining area and dressing rooms. All vestiges of local technologies and practices, especially those that have to do with reworking parts, are hidden. For example, all evidence of pulque, the fermented juice of the agave plant (which is used to counteract rusting of inadequately stored parts) is carefully removed. 16 Workers are coached for the visit. Sometimes they are given a quickie course so they can answer auditors’ questions about quality in a way that paints the plant as conforming to Quality Standards (Bueno Castellanos,

2001).

Here as in similar cases, the threat of the economic consequences of a negative review unites workers and managers in jointly managing performance, practice and appearance during this crucial period.

 In the educational literature, there is lots of evidence that teachers “teach to the test.” Our own experience shows that this phenomenon does not stop at the doors of universities.

In a business school all freshmen were required to attend a particular entry-level course in business administration. The incoming class was taught in groups of 40 students each by different professors, one of whom happened to get interested in collecting the grades of all students, allowing comparisons of the performance of different classes. Over a period of a few years, this database became a prominent topic at faculty meetings. No longer just the product of one professor’s curiosity, faculty began to believe that the scores could be used not only to track student performance but also to assess professors’ excellence. Since grades were based on the results of two exams that were identical for all of the classes, professors developed all kinds of strategies for improving the test results for “their” classes: handing out copies of exams from recent years, devoting a part of their weekly lecture to discuss and solve the corresponding exam

20 questions, offering an extra meeting just before the test to help students get prepared, etc. This practice stood in stark contrast to the espoused goal of the course, helping students to develop their own critical perspective based on a solid knowledge of the subject matter.

In cases of this sort changes occur in the way the work is actually carried out after the introduction of formal measurement systems. But these adaptations make the process as a whole more cumbersome and more expensive since the time and effort spent in making these adjustments do not actually further the work that has to be accomplished. The persons or groups assessed change their practice in such a way that their performance, as it appears on the established indicators, is optimized, to the detriment of other objectives or efficiency criteria outside of the scope of documentary assessments.

Changes in Organizational Structure, Climate, and Culture

Sometimes the introduction of a documentary assessment triggers undesired changes in organizational climate and culture. Some examples:

 In a detailed investigation of the validity of responses to an employee satisfaction measurement instrument (Tanur & Jordan, 1995), we and company managers were initially quite satisfied to see that employee satisfaction rose significantly after the first few years of employment. Detailed investigation revealed that the results of the survey are given to managers together with a “Manager’s Guide to Feedback and Action Planning” which prescribes that supervisors and their direct reports hold meetings in which they deal with questions that show an above-threshold percentage of respondents expressing dissatisfaction. They then need to develop an action plan and will be questioned the following year about the extent to which it was carried out. Our interviews, focus groups and talk-aloud protocols revealed that new employees initially see the survey as a line of communication to management who they expect to remedy the problems the survey identifies. When that is not the case, they begin to see it as a futile exercise

21 that only adds to their workload. At some point, it becomes easier to agree with the vanilla- version of the world, i.e. confirm that everything is all right.

 The Quality Movement was a valiant effort to wed the best characteristics of deliberate assessment with those of documentary assessment, capitalizing on the strength of both.

Baker (1994), writing in the Fifth Discipline Fieldbook, documents how the spirit of the Quality

Movement was undermined by overzealous measurement at Ford. Deming’s original idea was that the workgroup would decide which of the hundreds of possible measurements of their work process was most valuable to them to give them continuous feedback about their performance.

But when management asked their plants for more figures, a number of interesting things began to happen. Indexes were developed and measurements were compared across plants. A whole new crew of bureaucratic intermediaries came into existence who performed operations such as computing averages and comparing highs and lows across plants and across quarters. This led to new levels of accountability – no longer were the assessment data used only to improve the group’s performance but team leaders now had to explain why their figures weren’t up to snuff and what plans they had made to improve them. The game in each plant changed from making improvements to making the plant look good against its neighbors.

 Gittell (2000) found similar effects in studying the airline industry. She focused on coordination mechanisms for optimizing flight departure processes, which is a highly dependent collaboration among gate agents, baggage agents, caterers, fuelers, cabin cleaners, flight attendants, pilots, and customers. In its effort to minimize expensive delay time American

Airlines increased its supervisory span and focused on accountability. It established a report system which forces the manager on duty to determine which function has caused the delay.

There were some unintended side effects. Employees looked out for themselves to avoid recrimination rather than focusing on shared goals of on-time performance, accurate baggage

22 handling and satisfied customers. The most frequent error was communication breakdown, but since the company had no code for that, managers tended to assign such errors to the last group off the plane. Unproductive debates, reports and meetings to determine cause of delays occurred frequently.

In situations of this sort, insidious changes may occur in the long run that, appearing gradually may remain “invisible in plain sight.” For one thing, there is likely to be an increase in personnel that is concerned with collecting and analyzing the relevant statistics. Administration of formal measurement requires a new layer of experts and a new layer of bureaucratic procedures, not so much concerned with manufacturing auto parts, teaching students or providing customer service, but with providing accounting evidence.

Whenever there is a temptation to fix the numbers, to focus on short-term success, and to deny responsibility for failure, one may see mistrust, competitiveness, passing-the-buck behavior, and what is known in the workplace as a “cover-your-ass” attitude. Some organizational theorists (e.g. Gittell, 2000) have argued that quantitative performance measurements inevitably generate some level of dysfunctional behavior. Such systems tend to operate with a relatively low level of trust. Indeed they are virtually guaranteed to produce low trust by setting up conflicts over who is responsible for problems. Stifling creativity and out-of- the-box thinking in favor of conformity and towing the line deprives the organization of otherwise available learning and innovation opportunities.

Drawing attention to the problematic sides of documentary assessments is not to be misunderstood as an argument against the use of documentary assessments per se. As we have been at pains to point out above, documentary assessments are fruitfully and appropriately used when the objective is comparison of large organizational units or subpopulations. In that case, stripping away the context, the why’s and how’s of a situation and the confusion of different

23 circumstance, is precisely what is necessary in order to make cross-unit comparison possible. In some managerial decision-making situations this is desirable. In others, however, the price is too high.

CONCLUSIONS: IMPLICATIONS FOR RESEARCHERS AND MANAGERS

Documentary assessments are ubiquitous in modern life. In the educational arena, the fates of school districts, programs and students are determined by test results. In business organizations, accounting is witnessing a continuous development of ever-new measurement schemes and metrics, improving its technical sophistication and broadening its scope. Yet, the consequences of documentary assessments for the work practice of front-line employees and managers, their impact on corporate decision-makers and their effect on the structure and culture of an organization remain largely unexamined (Caplan, 1992). The role of inherent and discursive assessments in organizational life has not been investigated at all.

Distinguishing inherent, discursive and documentary assessments and acknowledging their significance for the coordination and control of modern organizations will -- we hope -- help both researchers and managers to broaden their perspective beyond the dominant focus on traditional, yet often dysfunctional documentary assessments. In the following two sections we will outline how this three-way assessment framework could generate a) a novel research agenda and b) an approach for managers to create an environment supportive of all forms of assessments, meeting the different needs of individuals, groups and the entire organization alike.

Implications for Research

Our framework of inherent, discursive and documentary assessments has its roots in detailed observations of work practices in situ. Future research needs to extend such

24 investigations systematically and carefully to deepen our understanding of these analytically distinct yet practically often intermingled evaluative activities. Only by gaining a more solid awareness of how these forms of assessment function in different settings can we begin to harness their true potential.

Given their impact on productivity and worker morale, the detailed study of the work- arounds, short-cuts and exceptions that occur in all work and learning environments would be particularly fruitful. We need to understand these phenomena much better in order to accurately describe the impact documentary assessments have on work and teaching practice and to understand in what ways they affect the validity of the data on which management base their decisions.

The premier research methods for this kind of investigation are ethnographic methods, in particular participant observation aided by videotape analysis, mixed with real-time question- asking and documentary analysis (Blomberg, Giacomi, Mosher, & Swenton-Wall, 1993;

Eisenhardt, 1989; Jordan, 1996, 1997; Jordan & Henderson, 1995). Given these methodological choices, however, it is often not obvious where to begin if one wants to investigate assessment practices in situ. The following paragraphs provide some guidance as to where to study inherent, discursive and documentary assessments.

Research on inherent assessments must be grounded in observing individuals pursuing their everyday work. One might want to pay special attention to interactions where instances of adjustments and alignments with coworkers and customers occur. “Hitches” in interactions and how people fix them are good opportunities to watch inherent assessments in action. Following newcomers to find out how they manage to fit into their position, how they learn to align their own activities with the activities of others in order to fulfill the requirements of their tasks, will also prove fruitful.

25 A whole new research agenda opens up in the investigation of inherent assessment in geographically distributed teams. At this point we know very little about the ways in which the new communication technologies affect inherent assessment in distance collaboration.

Preliminary studies have shown that even video-supported interactions between remote teams suffer from the effects of technology-generated delays in transmission, affecting the establishment of trust and judgments of competence (Ruhleder & Jordan, 2001). To what extent these technologies undermine inherent assessment (and, for that matter, discursive assessment) is unknown at this time.

Discursive assessments can be found in all kinds of formal and informal get-togethers.

Work-groups with a high level of internal control, such as research teams, groups of consultants and semi-autonomous workgroups, might already dedicate parts of their regular meetings to internal reflection and evaluation. These are good places to look. Whenever we have kept track of employees’ talk during lunchtime, we found that often a substantial part of lunchroom conversations has to do with work issues. Here is where assessment becomes discursive, and where it is possible to learn how the common views get built that provide the basis for coordinated efforts later in the workday. In general, places where people can get together casually such as hallways, water coolers, coffee machines and so on are a good venue for studying group-endogenous assessment practices.

It is also worth considering that different kinds of work provide different kinds of homes for discursive assessments depending on the temporal structure of the work. For example, in an airlines operations room we studied, the work revolved around nine “complexes” per day.

During each complex, a group of planes comes in, exchanges passengers, crews and baggage, gets serviced and moves out again. The sequence of complexes establishes the rhythmicity of the work. We found that discursive assessments tended to occur with increased frequency during the

26 down time between the complexes. We have also observed such rhythmicity in software development teams producing a new program version, construction crews moving from site to site, and teachers meeting in breaks between their classes and at certain times throughout a semester. Most of these cycles include slow-down periods where joint group reflection takes place with increased frequency. In work that is relatively uniform and linear, like the flow on classic assembly lines or the steady stream of incoming calls in call centers, discursive assessments are more likely to be conducted in the form of conversations and chats in locker rooms and cafeterias, parking lots, the smokers’ gathering outside the back door, and the like.

For documentary assessments it is beneficial to look at “accounting talk” in all its forms, how it unfolds in informal and formal meetings and how it affects a wide range of decisions, perceptions and objectives.17 “Accounting talk” is concerned with data produced by documentary methods, that is to say, with numbers, statistics and trends. It takes place in public debates within school districts, parent-teacher discussions, and school board meetings. In business organizations you may find it in quarterly ops reviews, budget meetings, planning sessions, employee performance appraisals and so on where it guides and controls everyday work processes of frontline-workers and managers alike. However, Feldman and March made a convincing argument that much of the data collected by documentary methods is not used for the corporate decision-making purposes for which it was ostensibly collected. They suggest that in a society that places high value on intelligent choice based on rational, explicit decision criteria the gathering of quantitative information “provides a ritualistic assurance that appropriate attitudes about decision making exist. Within such a scenario of performance information is not simply a basis for action. It is a representation of competence and a reaffirmation of social virtue”

(Feldman & March, 1981: 177).

It may well be the case that the dynamics of decision-making are best understood by

27 introducing the notion of “Authoritative Knowledge” or AK, proposed initially in the medical field where it is often the case that two parallel forms of knowledge exist: the knowledge of the physician and the knowledge of the patient, or biomedical knowledge paralleled by other medical knowledge systems, e.g. acupuncture, Ayurvedic medicine and the like. In certain situations, one form of knowledge acquires a privileged status at the expense of the others, which become invisible and go underground. The authoritative form then becomes the basis for official, legitimate decision-making (Davis-Floyd & Sargent, 1997; Irwin & Jordan, 1987; Jordan 1989,

1990, 1992; Suchman & Jordan, 1990).

Paul Starr gives a compelling account of the historical transformation of authoritative knowledge in medicine in 19th century America, describing the transition from a of multi- stranded, pluralistic medical system of barber surgeons, homeopaths, midwives, and other empirically based practitioners into a system based on allopathic medicine as the dominant form

(Starr, 1982).18 It may be productive to investigate how and why documentary assessment has also attained the status of AK, the knowledge on the basis of which decisions are made. Such an approach would encourage the simultaneous consideration of all three assessment types.

Further on the conceptual/theoretical side, it would also be of interest to further develop

Giddens (1984) work on structuration and its relevance to accounting practices, a direction that has already been taken by Roberts and Scarpens (1985), Scarpens and Macintosh (1996) and

Boland (1996). Beyond that the approaches espoused by such fields as Computer Supported

Cooperative Work (CSCW), Work Practice Analysis (WPA) and corporate anthropology might be explored.

To summarize: We see two particularly urgent directions for research emerging: the first would focus on detailed ethnographic work that could elucidate the “talking and doing” of assessment wherever and whenever it occurs; the second would focus on broadening and

28 restructuring the theoretical underpinnings of current thinking about evaluations and measurements.

Implications for Management

The three-part framework of inherent, discursive and documentary assessments we are proposing delivers two main messages to managers: a) don’t focus on documentary assessments alone; nurture and support inherent and discursive assessments; b) if documentary assessments are carried out, be critical and reflective about their purpose, their methods and their effects. This will require some measure of expertise to identify positive as well as negative work-arounds and other consequences of documentary assessments by managers themselves or through hiring social science experts who can undertake such investigations. Interestingly, a number of large corporations, including Xerox, Motorola and Intel, now have applied anthropologists, sociologists and psychologists on their staff who specialize in such analysis.

One might think that inherent assessments are beyond management’s sphere of influence since they are accomplished by individuals on the fly, seamlessly integrated in their work activities and therefore often unrecognized. This is certainly true. However, organizations can create physical and social environments that make it easier for individuals to acquire the tacit knowledge that allows them to function as competent members of their workgroup and of the entire organization. Given how important inherent assessments are for facilitating change processes and for socializing novices into functioning communities of practice, companies might want to think about the design of work environments that facilitate unproblematic alignment. For example, a study conducted in a big customer call center revealed that lower cubicle walls supported individuals in their inherent assessments by allowing easy monitoring of and adjustment to experienced workers’ verbal and nonverbal practices (Whalen & Whalen, 1996).

29 In general, it seems advisable to create work processes that allow easy exchange among peers and to nurture informal exchange between coworkers by providing places where people can get together. Isolating workers, and particularly novices, in solitary cubicles is clearly not conducive to either inherent or discursive assessment.

Discursive assessments are a valuable source for supervisors and senior managers to learn

“what really goes on” within a work group, a type of information that is not likely to be captured in any kind of documentary assessment. Managers need to develop a sense of where and when group-endogenous evaluations happen and how they can listen to them. We mean this not in the sense of hidden overhearing or spying, but in the sense of building an open communication framework that encourages employees to make managers part of their communicating circle.

Discursive assessments can be supported actively by providing space and occasions for all kinds of informal meetings -- often the most creative ideas and solutions for nagging problems are initiated by “chats” in cafeterias, hallways, and coffee booths. Another way is to create work processes and organizational structures that allow and encourage work groups and communities of practice to engage in evaluations. The delegation of responsibility for the design of work processes and for the internal division of labor to work groups is one way to encourage discursive, group-endogenous assessments structurally.

Documentary assessments, as we have shown, often do not deliver what their creators had in mind. What can managers do to avoid invalid data collection and dysfunctional work- arounds? In the first place, managers need to be aware that reports, surveys and questionnaires are quite powerful instruments for shaping perceptions and behavior. The crucial point is that whenever a new measurement procedure gets designed and applied, it is critical to find out how documentary assessments actually influence local work processes and accounting practices.

A serious set of issues for managers revolves around the possibility that the data they use

30 for making decisions do not indicate what they think they indicate.19 For example, if the number of calls completed in a call center reaches target, this may not indicate a desirable level of productivity but rather the fact that call takers have forwarded or otherwise terminated a number of calls in order to reach target, a situation that would negatively affect customer satisfaction. If errors are tabulated in the wrong category, decisions regarding error management will be falsely informed. If teachers teach to the test, the test measures not what students know about physics, but how well they are prepared to take tests in physics. Managers often know that the data they get are seriously flawed but see no easy way to attack the problem.

Paying attention to the work-arounds generated by documentary assessments may be one way to move towards a solution. In many cases they can be turned to be a source of creative improvements fixing shortcomings of a current assessment in a way that eventually increases the efficiency of a production process. Actually, within the field of knowledge management, workers’ ability to engage in appropriate work-arounds is seen as part of the stock of tacit knowledge that communities of practice develop in order to get their work done. Almost by definition, tacit knowledge as embedded in work practices is hard to grasp for managers, especially for higher-level managers since their view on the vicissitudes of empirical workplaces remains hidden behind the authoritative de-contextualized knowledge available to them.

Managers worried about these insufficiencies frequently react with a call for even more and more refined data which obviously does not transcend this tunneled view. The only way out is to look beyond the scope of pre-defined indicators directly at the way work practice and documentary assessments mutually affect each other.

The following example is illustrative of the typical complexity of relationships between negative and positive effects of the imposition of a documentary assessment:

Before the introduction of a new workflow management system in a printing company in

31 the UK, workers routinely monitored each other’s work (Bowers, Button & Sharrock, 1995).

Inherent and deliberate assessments were made easy through the layout of the shop floor, which provided lines of sight between workers, allowing them to engage in ad hoc collaboration. For example, one operator might replenish the paper on a job whose initiator was busy with something else. If a job turned out to be more labor-intensive than originally assumed, they would often split it into two. They might also start weekly or monthly repeat orders ahead of time even without an assigned order number in hand, and juggled jobs within the production line to maximize efficient workflow.

At one point, a formal workflow management system was introduced with the goal of providing more accountability for billing. The new system allowed only one operator to be assigned to each job, and required that jobs be carried out strictly in the order received. The workflow that was formerly organized informally became straight-jacketed, more cumbersome, and less efficient. Different work groups, wanting to maintain responsiveness to customer needs, developed new work-arounds to soften the undesired impacts of the workflow system. For example, to get around the requirement to record detailed workflow data even for small print jobs, workers often bundled similar jobs under a single order number (which undermined billing accuracy). They invented special cost categories for situations where one worker was handling multiple jobs simultaneously. In the extreme, workers did not record the ongoing process in the computer at all but made hand-written notes that they entered into the system at the end of the day. These innovative work-arounds increased the compatibility of the workflow system with a time-efficient mode of production, but eventually increased cost and partially undermined the accountability the new system was supposed to improve.

In this situation the workers who violate the ruling order as given by headquarters at the same time prevent their local terrain from chaos by establishing a new order on their own.

32 Though this order tends to be messy and its creation and maintenance is likely to produce serious trade-offs, it helps to alleviate negative impacts of the new system of accountability. Clearly, situations of this sort require exquisite alertness on the part of managers and a careful weighing of the negative and positive effects of the changes. Carefully observing the ways in which frontline employees connect a documentary assessment to their specific work processes and how the modify them, not only helps to improve the assessment instrument and it’s application but also provides a refined insight into the mechanisms and contingencies of the underlying work process itself.

33 REFERENCES

Ahrens, T. 1996. Styles of accountability. Accounting, Organizations and Society, 21: 139-173.

Ahrens, T. 1997. Talking accounting: an ethnography of management knowledge in British and

German brewers. Accounting, Organizations and Society, 22: 617-637.

Baker, E. 1994. Springing ourselves from the measurement trap. In P. M. Senge, A. Kleiner, C.

Roberts, R. Ross, & B. Smith (Eds.), The fifth discipline fieldbook: strategies and tools

for building a learning organization: 454-457. New York: Currency Doubleday.

Barley, S. R., & Kunda, G. 2001. Bringing work back in. Organization Science, 12: 76-95.

Blomberg, J., Giacomi, J., Mosher, A., & Swenton-Wall, P. 1993. Ethnographic field methods

and their relation to design. In D. Schuler, & A. Namioka (Eds.), Participatory design:

principles and practices: 123-154. Hillsdale, N.J.: L. Erlbaum Associates.

Boland Jr, R. J. 1996. Why shared meanings have no place in structuration theory: a reply to

Scapens and Macintosh. Accounting, Organizations and Society, 21: 691-687.

Bowers, J., Button, G., & Sharrock, W. 1995. Workflow from within and without: technology

and cooperative work on the print industry shop floor. In H. Marmolin, Y. Sundblad, &

K. Schmidt (Eds.), Proceedings of the Forth European Conference on Computer-

Supported Cooperative Work: 51-66. Dordrecht: Kluwer.

Brun-Cottan, F., Forbes, K., Goodwin, C., Goodwin, M. H., Jordan, B., Suchman, L. A., &

Trigg, R. 1991. The workplace project: designing for diversity and change (videotape):

Xerox Palo Alto Research Center, 3333 Coyote Hill Road, Palo Alto, CA 94304.

Bueno Castellanos, C. 2001. The "Glocalization" of Global Quality. Practicing Anthropology,

23: 14-17.

Caplan, E. H. 1992. The behavioral implications of management accounting. Management

International Review, 32(Special Issue): 92-102.

34 Davis-Floyd, R., & Sargent, C. F. (Eds.). 1997. Childbirth and authoritative knowledge: cross-

cultural perspectives. Berkeley: University of California Press.

Eisenhardt, K. M. 1989. Building theories from case study research. Academy of Management

Review, 14: 532-550.

Feldman, M. S., & March, J. G. 1981. Information in organizations as signal and symbol.

Administrative Science Quarterly, 26: 171-186.

Giddens, A. 1984. The constitution of society: outline of the theory of structuration.

Cambridge Cambridgeshire: Polity Press.

Gittell, J. H. 2000. Paradox of coordination and control. California Management Review, 42:

101-117.

Goffman, E. 1963. Behavior in public places: notes on the social organization of gatherings.

New York: Free Press.

Goodwin, C. 1986. Between and within: alternative sequential treatments of continuers and

assessment. Human Studies, 9: 205-217.

Goodwin, C., & Goodwin, M. H. 1987. Concurrent operations on talk: notes on the interactive

organization of assessment. IPRA Papers in Pragmatics, 1: 1-54.

Goodwin, M. H. 1980. Processes of mutual monitoring implicated in the production of

description sequences. Sociological Inquiry, 50: 303-317.

Irwin, S., & Jordan, B. 1987. Knowledge, practice and power: court-ordered Cesarean sections.

Medical Anthropology Quarterly, 1: 319-334.

Jordan, B. 1989. Cosmopolitical obstetrics: some insights form the training of traditional

midwives. Social Science and Medicine, 28: 925-944.

35 Jordan, B. 1990. Technology and the social distribution of knowledge: issues of primary health

care in developing countries. In J. Coreil, & D. Mull (Eds.), Anthropology and primary

health care: 98-120. Boulder: Westview Press.

Jordan, B. 1992. Technology and social interaction: notes on the achievement of authoriative

knowledge in complex settings. IRL Technical Report No. IRL92-0027. Palo Alto, CA:

Institute for Research on Learning.

Jordan, B. 1993. Birth in four cultures: a crosscultural investigation of childbirth in Yucatan,

Holland, Sweden, and the United States (4th expand. ed., rev. by Robbie Davis-Floyd

ed.). Prospect Heights, Ill.: Waveland Press.

Jordan, B. 1996. Ethnographic workplace studies and CSCW. In D. S. Shapiro, M. J. Tauber, &

R. Traunmüller (Eds.), The design of computer supported cooperative work and

groupware systems: 17-42. Amsterdam ; New York: Elsevier.

Jordan, B. 1997. Transforming ethnography - reinventing research. Cultural Anthropology

Methods, 9: 12-17.

Jordan, B., & Henderson, A. 1995. Interaction analysis: foundations and practice. The Journal

of the Learning Sciences, 4: 39-103.

Latour, B. 1986. Visualization and cognition: thinking with eyes and hands. Knowledge and

Society: Studies in the Sociology of Cultures Past and Present, 6: 1-40.

Miller, P. 1994. Accounting as social and institutional practice: an introduction. In A. G.

Hopwood, & P. Miller (Eds.), Accounting as Social and Institutional Practice: 1-39.

Cambridge: Cambridge University Press.

Pomerantz, A. 1978. Compliment responses: notes on the co-operation of multiple constraints. In

J. Schenkein (Ed.), Studies in the organization of conversational interaction: 79-112.

New York: Academic Press.

36 Pomerantz, A. 1984. Agreeing and disagreeing with assessments: some features of

preferred/dispreferred turn shapes. In J. M. Atkinson, & J. Heritage (Eds.), Structures of

social action: studies in conversation analysis: 57-101. Cambridge: Cambridge

University Press.

Roberts, J., & Scapens, R. 1985. Accounting systems and systems of accountability -

understanding accounting practices in their organizational contexts. Accounting,

Organizations and Society, 10: 443-456.

Ruhleder, K., & Jordan, B. 2001. Co-constructing non-mutual realities: delay-generated trouble

in distributed interaction. Computer Supported Cooperative Work, 10: 113-138.

Scapens, R., & Macintosh, N. B. 1996. Structure and agency in management accounting

research: a response to Boland's interpretive act. Accounting, Organizations and Society,

21: 675-690.

Schön, D. A. 1983. The reflective practitioner: how professionals think in action. New York:

Basic Books.

Star, S. L., & Griesemer, J. R. 1989. Institutional ecology, "translations" and boundary objects:

amateurs and professionals in Berkeley's Museum of Vertebrate Zoology, 1907-39.

Social Studies of Science, 19: 387-420.

Starr, P. 1982. The social transformation of American medicine. New York: Basic Books.

Suchman, L. A., & Jordan, B. 1990. Interactional troubles in face-to-face survey methods.

Journal of the American Statistical Association, 85: 232-241.

Tanur, J., & Jordan, B. 1995. Measuring employee satisfaction: corporate surveys as practice. In

American Statistical Association. Survey Research Methods Section (Ed.), Proceedings

of the Section on Survey Research Methods: 426-431. Washington, D.C.: American

Statistical Association.

37 Vaughan, D. 1999. The dark side of organizations: mistake, misconduct, and disaster. Annual

Review of Sociology, 25: 271-205.

Whalen, J., & Whalen, M. 1996. Social aspects of learning on the shop floor: peer coaching

and work organization. Paper presented at the Annual Meeting of the American

Sociological Association, New York.

38 TABLE 1

Characteristics of Different Assessment Types

Inherent Discursive Documentary

Assessment Assessment Assessment primary individual team/social group corporate entities stakeholder mode of coordinated talk, written symbols coordination activities coordinated reasoning

nonverbal how conversation record production monitoring

to adjust individual to coordinate group to coordinate and why conduct activities control organizations knowledge tacit explicit explicit produced time and space physical and virtual co- physical co-presence distance presence duration fleeting short lived enduring

emotional tone affective affective objective

39 1Endnotes

1 Much of Jordan’s thinking on these issues developed in conversations with colleagues at

IRL and PARC, beginning with a series of seminars headed by Jim Greeno in the early 90s when many of these ideas first surfaced. Putz has a long-standing interest in the practice of evaluation and accounting in organizations.

2 We use the general term "assessment" to comprehensively refer to the totality of informal and formal judgments, evaluations, measurements, tests, surveys and metrics that play a role in productive social interaction.

3 In addition, assessment also occurs on the individual level. Even when we are alone we evaluate our thoughts, intentions and actions but one could argue that, too, occurs in relation to a social world that provides the relevant criteria and standards.

4 Some of the examples in this paper come from work carried out in the Interaction Analysis

Lab (IAL) at PARC and IRL (Jordan & Henderson, 1995) where groups of researchers engaged in the microanalysis of videotapes of ordinary events and activities in workplaces, schools, and elsewhere. This particular example is from a tape from the research of Maureen Callanan and Jeff

Schrager (IAL of 02-19-91).

5 This example is from fieldwork in Yucatan, Mexico (Jordan, 1993).

6 We are using the terms "work" and "business" not in the labor or market place sense but simply as a way to designate serious attention and serious intent by participants. In this usage, children at play are engaged in the "business" of constructing a castle and a mother and children baking muffins for fun and breakfast are doing "work."

7 This example is from ethnographic research in an airlines operations room carried out by the Work Practice and Technology group at Xerox PARC in the early 90’s (Brun-Cottan et al.,

1991; Jordan, 1992).

8 Goffman (1963) pointed out that continuous reflexive monitoring is a defining characteristic of all situations of co-presence, that is, instances of focused and unfocused interaction among human agents. Similarly, Giddens claims that “reflective capacities of the human actor are characteristically involved in a continuous manner with the flow of day-to-day conduct in the contexts of social activities” (1984: xxii).

9 Donald Schön’s (1983) “reflection in action” is equivalent to our discursive assessment.

10 What is interesting is to speculate to what extent remote work systematically cuts out these kinds of interactions and what damage that does. The impossibility of discursive assessments in remote work may account for some of the difficulties people experience in those situations.

11 What we mean to characterize here is the phenomenon of discursive assessments as they occur among co-workers who share a joint orientation based on their daily work practice. We are aware of the fact that in some work situations there may exist a shifting boundary between endogenously produced discursive assessments and more or less formal post-event “review sessions”. Externally mandated, the latter may tend to reflect the interests of larger organizational units rather than furthering the in-the-moment activities of the group.

12 This example is from Jordan‘s fieldwork with migrant farm laborers in California in the early 70’s.

13 The function of this type of assessment is similar to a discussion about a student's work in a classroom, when the intent is simply for the student/teacher pair to better understand how effectively they are teaching/learning. The assessment here is made explicit; it becomes remarkable and “talk-about-able”.

14 One commentator observed in an insightful remark that one difference between discursive and documentary assessments is that discursive assessments have a strong future orientation (in the sense that they tend to affect immediate next action) while documentary assessments draw their future implications from a concern with documenting the past.

15 Since potentially negative consequences of documentary assessments are of tremendous importance for business, it is rather remarkable that there are no publications in the organizational literature that specifically focus on this key phenomenon. Why? It appears that detailed, on-the- ground workplace studies, quite common in the first part of the last century (e.g. the famous

Hawthorne study and many others), became unfashionable in the latter part of the century, a situation well documented by Barley and Kunda (2001).

Diane Vaughan, casting the net rather wider than we have with work-arounds, provides a comprehensive survey of the sociological literature writing about “The Dark Side of Organizations” in the Annual Review of Sociology. She locates unanticipated suboptimal outcomes in “routine nonconformity and the systematic production of organizational deviance” (Vaughan, 1999).

16 Pulque is better known as the basis for producing guaro, strong liquor.

17 We borrow the term “accounting talk” from Ahrens (1996, 1997) who investigated accounting talk between accountants and sales representatives in German and British breweries.

18 Though he doesn’t use the term “authoritative knowledge”, Starr introduces the idea of

"cultural authority," which refers to the probability that particular definitions of reality and judgments of meaning and value will prevail as valid and true. He argues that the acquisition of cultural authority had the consequence that doctors came to be in charge of "the facts," that is, to have the authority to define when somebody is dead or alive, sick or well, competent or not (Starr

1982).

19This is actually a question of data validity if by validity we mean the extent to which a measuring instrument actually measures what its user thinks it measures.