Jointly published by Akadémiai Kiadó, Budapest, and Springer, Dordrecht Vol. 63, No. 1 (2005) 87–120

Exploring size and agglomeration effects on public productivity

ANDREA BONACCORSI,a CINZIA DARAIOb

a University of Pisa, Pisa (Italy) b IIT-CNR and Scuola Superiore S. Anna, Pisa (Italy)

The paper assesses the empirical foundation of two widely held assumptions in policy making, namely scale and agglomeration effects. According to the former, scientific production may be subject to increasing returns to scale, defined at the level of administrative units such as institutes or departments. A rationale for concentrating resources in larger units clearly follows from this argument. According to the latter, scientific production may be positively affected by external economies at the geographical level, so that concentrating institutes in the same area may improve scientific spillovers, linkages and collaborations. Taken together, these arguments have implicitly or explicitly legitimated policies aimed at consolidating institutes in public sector research and at creating large physical facilities in a small number of cities. The paper is based on the analysis of two large databases, built by the authors from data on the activity of the Italian National Research Council in all scientific fields and of the French INSERM in biomedical research. The evidence from the two institutions is that the two effects do not receive empirical support. The implications for policy making and for the theory of scientific production are discussed.

Introduction

In recent years policy making in the field of science and public research has been influenced by the attempt to apply economic concepts. The pressure on public budgets in almost all industrialised countries has led governments to pursue (or at least to declare that they pursue) efficiency in the allocation and management of resources in the public research sector. The increasing societal demand for accountability and transparency of science also makes it important to demonstrate that public funding follows clear rules. A clear manifestation of this trend is the effort to apply to public scientific research two fundamental concepts drawn from economic analysis, namely increasing returns to scale (or economies of scale) and external economies (or economies of agglomeration).

Received November 2, 2004

Address for correspondence: CINZIA DARAIO, Institute for Informatics and Telematics (IIT), Consiglio Nazionale delle Ricerche (CNR), Area della Ricerca di Pisa, Via G. Moruzzi, 1; I-56127 Pisa, Italy. E-mail: [email protected], [email protected]

0138–9130/US $ 20.00 Copyright © 2005 Akadémiai Kiadó, Budapest All rights reserved A. BONACCORSI, C. DARAIO: Size and agglomeration effects

If these two forces were at play in scientific research, then a sound policy implication would be that, in order to improve the efficiency of public research, resources should be concentrated into larger institutions and/or into geographically agglomerated areas. This paper explores scale and agglomeration effects in scientific research with reference to two large European public research institutions: the Italian National Research Council (CNR) in several research areas, and the French INSERM in the biomedical field.

Scale and agglomeration economies in scientific research

In the attempt to apply economic concepts to science by means of analogy, it is assumed that institutes and departments are analogous to firms, using production factors or inputs in order to obtain scientific output. This analogy raises several problems. First of all, there is an important identification problem: what is the unit of production in scientific research? On the one hand, it has been argued that the appropriate unit of analysis for production is the laboratory or team (LAREDO & MUSTAR, 2001). Researchers are members of several projects that cut across the administrative boundaries of institutes. At the same time, it is still true that all researchers are generally members of an institute or department defined by discipline or thematic field. While direct production takes place in laboratories and within teams, the institutional level of institutes and departments still makes sense. In general, it must be recognized that organizational arrangements may differ across scientific disciplines (SHINN, 1979; WHITLEY, 1984) and that empirical research should try to take these differences into account. As an example, in this paper we provide data for several disciplines in the Italian case of CNR; furthermore, within a single large and diversified field for which data are available (i.e. biomedicine) we also provide comparative results between a set of institutes in two national institutions (INSERM in France and CNR in Italy).

Second, there are several measurement problems for both inputs and outputs. Among inputs to scientific production the following are considered: (i) number of researchers, possibly classified by category (i.e. directors, senior researchers, junior researchers, post-docs and Ph.D. students), age, seniority (i.e. number of years in the field), disciplinary background, and quality (i.e. cumulated number of publications, or citations, or impact factor); (ii) stock of capital equipment; (iii) research funds; (iv) stock of past knowledge (as measured for example by cumulated number of publications at the level of institute).

88 Scientometrics 63 (2005) A. BONACCORSI, C. DARAIO: Size and agglomeration effects

A number of severe measurement and practical problems make a complete analysis almost impossible. In practice, it is enormously difficult to collect data on all these items for a sufficiently long period of time. Within relatively homogeneous research areas it is considered acceptable to utilise a subset of inputs such as number and category of researchers, or number of researchers and research funds. Data on the stock of capital equipment are not easily available.

On the side of research outputs, other problems are at play. For most purposes, especially within relatively homogeneous research areas, a simple count of publications is considered acceptable. A more complete treatment, however, should distinguish between quantity of output, its quality and impact (as measured by citations received) and its relevance (as measured by subjective evaluations of experts in the field). In addition, relevant outputs of scientific production also include teaching, applied research and consultancy for industry and third parties, patenting, and the like. Consequently, not only is scientific production inherently multi-input multi-output, but all inputs and outputs are heterogeneous and cannot be easily measured using commensurable variables.

Finally, the specification of the relation between inputs and outputs is another difficult conceptual problem. This relation is likely to be non-deterministic, to have a lagged structure, and to have a time sequence which is variable over time and across sectors. In the light of these characteristics, any meaningful measure of productivity should be generated by a model of multi-input multi-output production without a fixed functional specification. Despite these severe identification, measurement and specification problems and the resulting difficulties in testing specific predictions, the idea that scientific production must exhibit some relation between the resources employed and the output produced is generally accepted.
For practical and policy objectives, simple measures of the ratio of output to input are considered an indicator of scientific productivity. As an example, the crude number of papers per researcher, within relatively homogeneous fields, is considered an acceptable indicator of productivity across large numbers.∗

∗ The use of simple ratios within a context of multi-input multi-output production, of course, can be criticized. See LINK (1996) for a discussion of limitations of any production function approach in science.

Having established the analogy between scientific research and production, and apart from the methodological problems discussed above, two questions can legitimately arise. Let us state them as follows: (a) does the concentration of resources in large institutions or institutes improve scientific productivity? In other words, does science exhibit the same phenomenon that is called economies of scale in production? (b) does the territorial concentration of scientists improve scientific productivity? In some countries a policy of locating laboratories and research institutes in the same territorial area has been actively pursued, with a view to creating so-called economies of agglomeration. Does this policy improve the production of scientific publications?
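As a rough illustration of the crude indicator mentioned above (papers per researcher), the following Python sketch computes the ratio for a handful of institutes and averages it within two size classes, which is one simple way to look at question (a). All institute labels, figures, the size threshold and the function names are invented for illustration; they are not data or methods from this paper. As the footnote warns, such simple ratios ignore the multi-input multi-output nature of research.

```python
from statistics import mean

# Hypothetical records: (institute label, number of researchers, number of papers).
# The figures are invented and carry no empirical meaning.
INSTITUTES = [
    ("A", 10, 25),
    ("B", 40, 80),
    ("C", 5, 15),
    ("D", 60, 150),
]


def papers_per_researcher(papers, researchers):
    """Crude productivity ratio: output (papers) divided by input (researchers)."""
    if researchers <= 0:
        raise ValueError("an institute needs at least one researcher")
    return papers / researchers


def mean_productivity_by_size(records, size_threshold):
    """Average the ratio separately for small and large institutes.

    Institutes with fewer researchers than size_threshold count as small.
    Returns (mean_small, mean_large); a class with no institutes yields None.
    """
    small, large = [], []
    for _label, researchers, papers in records:
        ratio = papers_per_researcher(papers, researchers)
        (small if researchers < size_threshold else large).append(ratio)
    return (mean(small) if small else None, mean(large) if large else None)


if __name__ == "__main__":
    small_avg, large_avg = mean_productivity_by_size(INSTITUTES, size_threshold=20)
    print("small institutes:", small_avg)
    print("large institutes:", large_avg)
```

A difference between the two averages would be suggestive at most, since the comparison controls neither for discipline nor for the other inputs listed earlier.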

Economies of scale in scientific production

In the context of manufacturing production, economies of scale refer to the fact that an increase of k times in all factors of production determines an increase in output of more than k times. Therefore the larger the scale of production (i.e. productive capacity of plants), the lower the unit or average cost in the long run. To claim that increasing returns to scale are at play one must increase simultaneously all factors of production, not only the variable ones (i.e. labour). It is useful to distinguish between economies of scale at the level of plant and at the level of firm. The latter may be limited to manufacturing costs for several plants or include also managerial costs for non-manufacturing activities.

The counterparts of the plant or the firm in scientific production are not uniquely determined. In principle, one should consider the smallest unit of production at which fixed factors of production such as physical equipment are utilised, i.e. the research laboratory. However, because some resources (e.g. facilities, instrumentation, technical personnel) are shared across laboratories, the institute or department is also a meaningful level of observation. Finally, one could consider also the overall university or the public research institution as an appropriate level of observation, given that several decisions about the allocation of resources (e.g. funds, personnel) are taken at this level. Depending on the data utilised, the first level (laboratory) may be considered the counterpart of the plant, while the institute and the university or public institution levels are similar to the firm level or the multidivisional company, respectively. In empirical work, owing to the difficulty of analysing laboratories, most studies focus on the institute or the department or the university (RAMSDEN, 1994; JOHNSTON, 1994; ADAMS & GRILICHES, 2000).
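The definition of returns to scale given above can be made concrete with a small numerical check: scale every input by k and compare the new output with k times the original output. The Python sketch below does this for a hypothetical two-input production function. The Cobb-Douglas form, the exponents, the input names and the function names are assumptions chosen for illustration only and do not come from the text. The check is local (one input bundle, one value of k), so it illustrates the definition and is not an estimation method.

```python
from typing import Callable, Sequence

ProductionFunction = Callable[[Sequence[float]], float]


def classify_returns_to_scale(
    production: ProductionFunction,
    inputs: Sequence[float],
    k: float = 2.0,
    rel_tol: float = 1e-9,
) -> str:
    """Compare f(k * x) with k * f(x) at one input bundle x, for one k > 1."""
    if k <= 1:
        raise ValueError("k must be greater than 1")
    base_output = production(inputs)
    # All inputs are scaled simultaneously, as the definition requires.
    scaled_output = production([k * x for x in inputs])
    benchmark = k * base_output
    margin = rel_tol * max(1.0, abs(benchmark))  # guards against rounding noise
    if scaled_output > benchmark + margin:
        return "increasing"
    if scaled_output < benchmark - margin:
        return "decreasing"
    return "constant"


def cobb_douglas(alpha: float, beta: float) -> ProductionFunction:
    """Hypothetical production function: output = researchers**alpha * funds**beta."""
    def production(inputs: Sequence[float]) -> float:
        researchers, funds = inputs
        return researchers ** alpha * funds ** beta
    return production


if __name__ == "__main__":
    bundle = [4.0, 9.0]  # illustrative input levels (researchers, funds)
    for alpha, beta in [(0.7, 0.5), (0.5, 0.5), (0.3, 0.4)]:
        label = classify_returns_to_scale(cobb_douglas(alpha, beta), bundle)
        print(alpha, beta, label)
```

With this functional form the outcome depends only on whether the exponents sum to more than, exactly, or less than one. That is a property of the assumed form and says nothing about scientific production itself.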
This notion, applied to science, means that research units should be of large size, in order to optimise the use of productive resources and increase productivity: the larger the units, the higher their scientific productivity. This notion is often invoked to support policies of concentration of resources in larger institutes, forcing small institutes to merge or disappear, or policies of merger and consolidation of scientific institutions. The keyword for these policies is critical mass. As has been noted, “a prominent feature of research support policy in many, though not all countries, over the last twenty years has been the espousal and implementation of resource allocation processes that provide ‘selectivity and concentration’. Implicit in these policies has been the assumption that ‘bigger is better’; in other words, that scientific research benefits from economies of scale. This approach has been most pronounced in the UK and to some extent other Anglo-Saxon derivative countries, but it has been the subject of consideration and experiment in many other countries as well” (JOHNSTON, 1994, p. 25–26).

Public policies based on critical mass and large institutes induce levels of concentration of resources that go beyond the usual level. Concentration is a robust structural property of institutional systems that allocate research funds in proportion to publishing output. Since publication activity follows a strongly asymmetric distribution, it is not surprising that research funds are not allocated on a uniform basis. In a sample of Australian researchers in 18 universities, RAMSDEN (1994) found that 14% of researchers were responsible for 50% of all publications in the 1985–1989 period, while 40% published 80% of the total. Approximately the same ratio was found by COLE, COLE & SIMON (1981) and RESKIN (1977) for US universities (15% of researchers published 50% of the total), while HALSEY (1980) found comparable concentration ratios for British universities and polytechnics (23% of researchers published 68% of the total). As a consequence, a small number of universities that follow a consistent policy of hiring scientists with a strong publication record absorb a large share of funds: in the US more than 50% of the national budget for universities is concentrated in the top 33 universities. In the UK the top 6 universities absorb around 50% of the total.

Policies aimed at concentration do not simply follow the structural asymmetry of the distribution of publication activity, but aim to actively improve productivity. In Italy, for example, the recent legislative reform of the National Research Council (Reorganization Decree no. 19 of 1999) has induced a profound change in the administrative structure. The number of institutes has been reduced from 314 in 1999 to 108 in 2001.
Many of the smallest institutes were, in effect, the result of fragmentation processes, created around a few researchers and crystallised over time. Given that the administrative burden is, at least to a certain extent, a fixed cost associated with service indivisibility, the existence of a minimum efficient scale for administrative costs is plausible. It should be noted, however, that policy decision makers are often driven by a more general notion that research activity itself, and not merely its administrative side, is subject to increasing returns to scale. In other words, policy decision makers implicitly apply notions from economics to the research activity, drawing analogies between manufacturing and the production of knowledge. The analogy is based on the idea that research, like manufacturing, is subject to (a) division of labour; (b) indivisibility in the use of a minimum number of diverse competencies; (c) utilisation of large physical infrastructure. These factors are sufficient conditions for the emergence of increasing returns to scale in several industries in the manufacturing sector (SCHERER, 1980; MILGROM & ROBERTS, 1992; MARTIN, 2002). This analogy may be severely misleading, however, for several reasons.

Division of labour. As is well known, the larger the size of production units, the better the subdivision of production into specialised tasks that maximise efficiency. However, there are fundamental differences between productive division of labour and cognitive division of labour. In science, the output of any individual is made public via publication, so that any other scientist may benefit from that contribution and add to it. Knowledge stored in publications allows division of cognitive labour to take place in different places and periods of time. Publication is therefore the most important mechanism for promoting division of cognitive labour. This means that placing scientists within the same organisational boundaries is neither a necessary nor a sufficient condition for benefiting from improved division of labour. There may be a form of division of labour that requires the establishment of formal collaboration and coordination of tasks between scientists. It is useful to draw a distinction between division of labour among peers, and division of labour along the research career, or among scientists with different seniority. The former type takes the form of personal links, based on mutual recognition and professional esteem. Since the most important personal assets that established scientists bring into collaborations are competence and reputation, the boundaries of personal links tend to follow spontaneously the actual distribution of these elements, often on a world basis. Only occasionally can one find the entire web of personal peer relationships included within the boundaries of a single organisation. A different type of division of labour takes place between scientists at various stages of careers, and between scientists and technicians or assistants.
In the former case the pattern of personal relations is based on apprenticeship and scientific leadership and requires long periods of joint work and supervision, normally (but not necessarily) within the same institution. In the latter case a chief scientist organises the work of a number of people having different roles, taking the scientific responsibility of projects. Because both types of division of labour require personal in-depth supervision, the size of resulting units is limited by the ability of research directors to monitor closely the work of their research students and collaborators and to contribute to their training. In most scientific fields this amounts to saying that the maximum size is quite small, in the order of a few units or one or two dozen. Again, this argument must be made domain-dependent, since the size that may favour the cognitive division of labour may be very different across disciplines (e.g. LATOUR & WOOLGAR, 1979; SHINN, 1982). Summing up, it is unlikely that division of labour per se is a source of increasing returns to scale at the level of institutes across all disciplines.

Indivisibility. Indivisibility is a serious argument. In many areas the production of scientifically meaningful output requires the combination and coordination of many scientists from different fields, bringing competencies in the substantive field and in complementary areas such as measurement techniques, statistical analysis, scientific computing, software development, data and image processing and analysis and the like. In some areas the very substantive field requires the integration of different disciplinary backgrounds. As a result the notion of minimum size of a research unit is economically sensible (see for example COHEN, 1991; KRETSCHMER, 1985; QURASHI, 1991; 1993; SEGLEN & AKSNES, 2000). However, we should be careful in defining the level of observation. First of all, indivisibility is more important at the level of team or laboratory than at the level of institute or department. Second, while the notion of indivisibility is clear in abstract terms, its empirical relevance may be highly variable. In other words, the minimum size of a team or laboratory may be extremely variable across specific areas within the same fields. In general, this means that economies of scale may be important up to a threshold level, then become irrelevant. If the threshold level is quite small (in the order of a few units or a few dozen), the practical implication is that even small institutes may be highly efficient, provided that their teams or labs meet the minimum requirement (KYVIK, 1995).

At the same time it must be recognized that size may have strong benefits in terms of a broader notion of organizational support. This does not only include direct resources employed in scientific production, such as research assistants, technicians, or equipment, but also shared resources such as libraries and facilities, and more importantly, indirect resources such as competent colleagues. Here the argument is that a larger availability of these resources may facilitate discovery or scientific productivity. In particular, it is easier to find top level scientists in large than in small laboratories or universities. In a study of scientific discovery by 16 Nobel laureates, HURLEY (1997, p. 76) has found that “in terms of budget, library resources, technical support and the availability of exceptional colleagues, these laboratories (where Nobel laureates worked) are organisationally very rich indeed”. While there is merit in this argument, the causal assumption must be made clear. It is true that talented scientists are attracted by places where resources are abundant, but this is unlikely to be the most important factor. In a long run perspective, it is talent that creates resources (and then organizational size), rather than size that produces talent.

Physical infrastructure. Access to physical infrastructure is another argument commonly associated with the call for critical mass and concentration of resources in large institutions. Here the empirical counterpart is the so-called big science, in which the cost of research instrumentation is very high. No one denies the importance of this phenomenon. However, it cannot be invoked as a general argument in favour of large institutes. More subtly, in big science the use of large experimental facilities is almost exclusive, so that institutions must guarantee their ownership. This is not so, for example, in fields such as genomics and proteomics. Here large research facilities such as databases are also necessary, but their use is not exclusive and can be made available also to small institutions on a contractual basis. The link between size of infrastructure and size of institution may be broken.

Empirical evidence. Summing up, there are many arguments for assuming that increasing returns to scale are at play at the level of institutes. However, these arguments cannot be deduced analogically from similar factors in manufacturing production, because the economic structure of the two fields is profoundly different. Similarities are somewhat superficial, while differences in complementarities and coordination patterns are deep. On the other hand, even in industrial economics the existence and relevance of economies of scale is ultimately an empirical matter (PRATTEN, 1971). Several studies have examined the relation between size and research productivity or higher education productivity (BRINKMAN, 1981; BRINKMAN & LESLIE, 1986; COHN et al., 1989; DE GROOT et al., 1991; LLOYD et al., 1993; NELSON & HEVERT, 1992; GETZ et al., 1991). The evidence on returns to scale in scientific production is ambiguous. ADAMS & GRILICHES (2000) find constant returns to scale at the level of university in the case of the United States, while NARIN & HAMILTON (1996, p. 297) review several studies and conclude “we have never found that the size of an institution is of any significance”. In the conclusion of the review commissioned by the UK Office of Science and Technology, VON TUNZELMANN et al. (2003) state: “there seems to be little if any convincing evidence to justify a government policy explicitly aimed at further concentration of research resources on large departments or large universities in the UK on the grounds of superior economic efficiency”.
On the other hand, other studies find increasing returns to scale until a threshold level, after which constant or even decreasing returns better describe the situation. As JOHNSTON (1994) summarises the literature, “the results of this body of work can best be characterized as ambiguous and contradictory. The majority verdict is that research output is linearly related to size with no significant economies of scale apparent. Others have argued that the relationship between output and size is more complicated – for example, that there are economies of scale up to a certain group size after which diseconomies set in” (JOHNSTON, 1994, p. 32). Despite the fact that empirical evidence is not conclusive, the notion that economies of scale matter in scientific production is firmly held in many political circles and inspires important political and administrative decisions. For this reason it is useful to add further evidence. Also, while there is some evidence on universities, much less is known with respect to institutes of large public research organisations, such as CNRS, CNR or Max Planck. This paper contributes by examining this issue with respect to non-university public research institutions.

Agglomeration economies

The notion of scientific districts, clusters, poles of excellence or science areas has been prominent at the national and regional level in the last twenty years. The fascinating examples of Silicon Valley and Route 128 (SAXENIAN, 1996) and the emergence of technopoles and regional clusters (CASTELLS & HALL, 1994; COOKE & MORGAN, 1998) have catalysed the attention of analysts and policy makers in all advanced countries. At a regional level the notion of cluster identifies the co-presence and interaction of diverse subjects such as research and educational institutions, firms, innovative public administrations, financial services, and other intermediary organisations (ACS, 2000; SCOTT, 2001). At this level the emphasis is not on clustering of research activities per se, but on clustering of complementary innovative activities in the same area. This general notion, however, has also inspired policies of location of research activities by some large public research institutions. In several countries large public research institutions have pursued a policy of creating geographical concentrations of institutes in the same area. For example in Italy CNR promoted the creation of Research Areas, large agglomerations of institutes in different fields within the same physical infrastructure. In France most research institutes at CNRS and INSERM are located close to one another. Behind these policies there is the idea that proximity favours scientific productivity, insofar as it maximises personal interaction, face-to-face communication, on-site demonstrations and transmission of tacit knowledge, and facilitates identification of complementary competencies, unintentional exchange of ideas, café phenomena, and other serendipitous effects. The focus of our discussion is therefore the notion that concentrating research activities in the same area may bring benefits to scientific productivity.
We do not enter into a discussion on more general policies for clustering and agglomeration of innovative activities. Underlying these policies there are some well grounded economic ideas. As sometimes happens, the original idea is an old one, but it was rediscovered and enlarged more recently. The implicit economic analogy is with the concept of external economies, or Marshallian agglomeration economies (MARSHALL, 1920; KRUGMAN, 1991; PYKE et al., 1986). Alfred Marshall observed that the concentration of a large number of manufacturing firms in the same area (industrial district) is not due to chance, but reflects the presence of local externalities in the form of availability of specialised suppliers, a highly trained workforce, and sources of innovative ideas. Costs of production are therefore lower in an agglomerated area than outside it. More importantly, firms in a district enjoy a particular industrial atmosphere and benefit from processes of collective invention.

The literature on agglomeration economies has been carefully reviewed by ROSENTHAL & STRANGE (2004). The available evidence suggests that all three sources of agglomeration economies suggested by MARSHALL (1920) are present, namely labor market pooling, input sharing, and knowledge spillovers. The literature that has examined the impact of knowledge spillovers has tried to explain agglomeration processes as the result of intrinsic limits to the geographic mobility of technological and scientific knowledge. Here the main emphasis is on the fact that the diffusion of knowledge may take place via codification and distance transmission, but in most cases requires also personal acquaintance and face-to-face interaction. This is made easier and cheaper by physical proximity. Since there is complementarity between codified and tacit knowledge, even in science physical proximity may be important. The idea is therefore that knowledge flows have an embedded nature and require physical proximity which facilitates exchange of experience and interpersonal communication. In a path-breaking work, JAFFE et al. (1993) studied the structure of citations to patents and found that the number of citations sharply declines with distance from the site of inventors. Citations are 5 to 10 times more frequent in the same area. Similar results have been found by JAFFE (1989), ACS et al. (1992), ZUCKER et al. (1998), ALMEIDA & KOGUT (1999) and AUTANT-BERNARD (2001). AUDRETSCH & FELDMAN (1996) found a positive relation between geographic concentration of industries and the R&D/sales ratio and proportion of skilled labor, consistent with the notion that knowledge spillovers influence agglomeration. BOTTAZZI & PERI (2003) measured the decay of knowledge spillover at regional level and found that the effect goes down to zero at approximately 300 km from the source.
On the other hand, this literature has been somewhat vague on the specific mechanisms that link knowledge spillovers and agglomeration, failing to provide compelling evidence of a general effect. A more specific effect has been suggested by HUSSLER & RONDE (2004): in epistemic communities there is the need to negotiate meanings because they do not share the same cognitive frame ex ante (COWAN et al., 2000) but have to build it through interaction. Therefore physical proximity may be a factor, while in communities of practice coordination is ensured by shared or equipment, making proximity less important. They find evidence of this pattern on data about French inventors. Besides, BRESCHI & LISSONI (2004) found that the pattern of citations from patents to patents follows interpersonal relations, as evidenced by networks of co-invention, and not necessarily geographic proximity. In sum, while the existence of knowledge spillovers is generally accepted, the channels through which they are diffused are much less clear. Therefore the impact of agglomeration has still to be demonstrated. No one denies that concentrating many research institutions in the same area may have benefits in terms of administrative activity, logistics, emergence of specialised

services and the like. Some facilities (libraries, technical services) require a minimum size to function efficiently and cannot be replicated across many small units scattered around a country. In addition, facilitating personal interaction may indeed spur creative activity. In many respects, a policy of intentional agglomeration of research units in the same geographic area makes economic sense. The problem is one of causality assumptions. Underlying most agglomeration policies is the idea that geographic closeness may increase productivity. The assumed causality direction goes from agglomeration to productivity. However, this causal mechanism should not be taken for granted. On the one hand, the mere existence of agglomeration does not imply the existence of agglomeration economies. Another mechanism may be at play, going in the reverse direction, from scientific productivity to agglomeration. As an example, one can claim that scientific excellence creates its own agglomeration effects. When a laboratory or a scientist in a given place opens promising lines of research, PhD students and post-docs move from other universities and choose to invest their initial career in that place, visiting scholars spend periods of training, visiting professors are eager to deliver seminars and suppliers of scientific instrumentation visit the location periodically. If the institutional scientific system is sufficiently flexible, the scientist will receive support for infrastructure and the laboratory will grow and attract further people. The choice of the initial location may happen by chance or historical contingency, rather than being planned rationally. Because physical facilities must follow the constraints placed by administrators, laboratories are often located close to each other.
When we observe the phenomena over time we are tempted to conclude that scientists working in agglomerated areas are more productive, but the reverse is true: productive scientists dynamically create their own agglomeration effects. If this is true, a policy of agglomeration should not confound the causes with the effects. Agglomeration per se does not have any meaning for scientific productivity. On the other hand, the question of the empirical relevance of agglomeration economies should not be overlooked. How severe is the disadvantage for a laboratory of working in a relatively isolated area? Is it the concentration of research activities in the same area, or rather the quality of life, that attracts talented scientists to a given location? Empirical evidence on localized knowledge spillovers should not be interpreted in the sense that proximity is a necessary condition for transmission of knowledge. For this to be true, one must show cases in which physical distance has actually precluded the transmission of knowledge. We are not aware of such cases. Summing up, there are many good reasons for a policy of agglomeration of research activities in the same geographic area. At the same time the importance of agglomeration is an inherently empirical matter and should be evaluated case by case. More importantly, policies should not implicitly assume that a causal mechanism is in place.

Our paper contributes to this debate by testing for the existence of economies of scale and agglomeration in non-university research institutions. This empirical setting is particularly interesting, for two reasons. First, unlike universities that enjoy a large degree of autonomy, public research institutions receive most of their budget from the government and allocate it to institutes, following a centralised procedure. So they are in a better position to influence the size of institutes, if they believe this is better for productivity reasons. Second, unlike universities that have a long historical tradition, institutes of public research institutions may be located in many different places. So they have the choice to promote policies of agglomeration of institutes in closely related areas or to scatter institutes throughout the country. For these reasons the evidence presented in this paper should be of interest not only to scholars of science but also to policy decision makers.

Data description

National Research Council – CNR (Italy)

We constructed an original dataset by integrating three official documents produced by CNR in recent years:
• Report on the CNR scientific activity in 1997 (published in 1998);
• Report on the CNR Personnel in 1997 (internal documentation);
• Report on the CNR European research funding.

The integration of these data was not a trivial task. The documentation on personnel gives biographical data on individual researchers, technicians and administrators, together with the CNR affiliation in 1997. We assigned all reported individuals to institutes and integrated these data (input data) with those reported in the official Report, which include both input data and output data. Input data include, for example, research funds, funds from external sources or total costs while output data include total number of publications and number of international publications. Interestingly, the Report does not include data on personnel by institute. In practice, until now there was no official document that gave the opportunity to merge the information on scientific production with information on the structure of research units. The research areas considered in the analysis are listed in Table 1.
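
The integration step described above can be illustrated with a minimal sketch. It is not the procedure actually used to build the dataset: the record layout, the institute codes (INST_A, INST_B), the role labels and all numbers are hypothetical, and total personnel is assumed to be the sum of researchers, technicians and administrators.

```python
from collections import Counter

# Hypothetical personnel records (one row per person, with 1997 affiliation).
PERSONNEL = [
    {"id": 1, "role": "researcher", "institute": "INST_A"},
    {"id": 2, "role": "researcher", "institute": "INST_A"},
    {"id": 3, "role": "technician", "institute": "INST_A"},
    {"id": 4, "role": "researcher", "institute": "INST_B"},
    {"id": 5, "role": "administrator", "institute": "INST_B"},
]

# Hypothetical institute-level rows taken from the activity Report.
REPORT = {
    "INST_A": {"T_PUB": 30, "INTPUB": 18},
    "INST_B": {"T_PUB": 12, "INTPUB": 9},
}

# Assumed mapping from role labels to the personnel variables of the dataset.
ROLE_TO_VARIABLE = {
    "researcher": "T_RES",
    "technician": "TECH",
    "administrator": "ADM",
}


def merge_personnel_with_report(personnel, report):
    """Count staff per institute and role, then attach the counts to each Report row."""
    headcount = Counter((p["institute"], p["role"]) for p in personnel)
    merged = {}
    for institute, outputs in report.items():
        row = dict(outputs)  # copy, so the Report rows are left untouched
        for role, variable in ROLE_TO_VARIABLE.items():
            row[variable] = headcount[(institute, role)]
        # Assumption: total personnel is the sum of the three categories.
        row["T_PERS"] = row["T_RES"] + row["TECH"] + row["ADM"]
        merged[institute] = row
    return merged


MERGED = merge_personnel_with_report(PERSONNEL, REPORT)
```

Each merged row then carries both output data (publications) and input data (headcounts) for the same institute, which is the unit of analysis used throughout.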

Table 1. Research areas

Code   Research area
A1     Agriculture
A2     Environment and habitat
A3     Biotechnologies and molecular biology
A4     Chemistry
A5     Economics, and statistics
A6     Physics
A7     Geology and mining
A8     Engineering and architecture
A9     Innovation and technology
A10    Mathematics
A11    Medicine and biology
A12    Law and Politics
A13    History, philosophy and philology

In order to conduct the analysis by areas with a sufficient number of observations we carried out the following consolidation, taking into account broad disciplinary fields from the academic tradition (see Table 2):
• Environment and habitat together with Geology and mining;
• Biotechnologies and molecular biology together with Medicine and biology;
• Engineering and architecture with Innovation and technology.

These aggregations follow the Italian academic tradition, in which these disciplines are taught together in the same schools or polytechnics. In recent years (2003–2004), CNR has started a major internal restructuring, leading to the creation of large research areas, comprising several institutes. Interestingly, the aggregation adopted by CNR corresponds to the one adopted here, with two minor differences: we separate Chemistry and Physics (which in the restructuring of CNR are considered part of the Basic Science area), and we keep Agriculture separated from other Life Sciences. Institutes in Mathematics (A10), Law and Politics (A12) and History, philosophy and philology (A13) have been excluded from the analysis.

Table 2. Aggregation of research areas

Aggregation   Corresponding research area                                        No. of obs.
MA1           Agriculture                                                        24
MA2           Environment and habitat and Geology and mining                     26
MA3           Biotechnologies and molecular biology and Medicine and biology     27
MA4           Chemistry                                                          26
MA5           Physics                                                            28
MA6           Engineering and architecture and Innovation and technology         31
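
For readers who wish to reproduce the consolidation, the mapping from the area codes of Table 1 to the aggregations of Table 2 can be written down as a small lookup. This is only a sketch: the names AGGREGATION, EXCLUDED, OBSERVATIONS and macro_area are illustrative. Area A5 does not appear in Table 2 and is not listed among the excluded areas, so the sketch simply leaves it unmapped.

```python
# Aggregation of Table 1 area codes into the macro-areas of Table 2.
AGGREGATION = {
    "MA1": ["A1"],         # Agriculture
    "MA2": ["A2", "A7"],   # Environment and habitat + Geology and mining
    "MA3": ["A3", "A11"],  # Biotechnologies and molecular biology + Medicine and biology
    "MA4": ["A4"],         # Chemistry
    "MA5": ["A6"],         # Physics
    "MA6": ["A8", "A9"],   # Engineering and architecture + Innovation and technology
}

# Areas explicitly excluded from the analysis.
EXCLUDED = {"A10", "A12", "A13"}

# Number of observations per macro-area, as reported in Table 2.
OBSERVATIONS = {"MA1": 24, "MA2": 26, "MA3": 27, "MA4": 26, "MA5": 28, "MA6": 31}

# Reverse lookup: area code -> macro-area.
AREA_TO_MACRO = {
    area: macro for macro, areas in AGGREGATION.items() for area in areas
}


def macro_area(code):
    """Return the macro-area of a Table 1 code, or None if the area is not analysed."""
    return AREA_TO_MACRO.get(code)
```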

Table 3. Variables in the dataset (all variables refer to CNR institutes)

a) Size indicators
Variable   Definition
T_PERS     Total number of personnel
RESFUN     Total research funds
T_COS      Total costs
LABCOS     Labour costs

b) Personnel indicators
Variable   Definition
T_RES      Total number of Researchers
TECH       Number of Technicians
ADM        Number of Administrative Staff
ORD_RES    Number of Researchers
SEN_RES    Number of Senior Researchers
DIR_RES    Number of Research Directors

c) Scientific Productivity indicators
Variable   Definition
T_PUB      Total number of publications
P_INTPUB   Percent international publications
INTPUB     Number of International Publications
PUBPERS    Publications per capita
IPUPERS    International Publications per capita
PUBRES     Publications per researcher
IPURES     International Publications per researcher

d) Other indicators
Variable   Definition
P_MARFUN   Percent of funds raised from the market
P_INV      Percent of Total costs allocated to investment
COPUB      Cost per publication
COPUBINT   Cost per international publication
AVIM       Average Impact factor
GAI        Geographical Agglomeration Index

Source: CNR Report (1998) and our elaboration

The list of variables considered in the analysis is reported in Table 3. These variables have been selected with the goal of allowing a careful test of hypotheses regarding the sign and magnitude of the impact of size and agglomeration effects on various measures of scientific productivity (see the analysis discussed in the next sections). All variables refer to individual institutes. We strictly follow the definition of variables described in the CNR Report but omit some variables not used in this paper (e.g. age structure of researchers). Monetary variables are left in Italian lire (1 euro = 1936.27 lire). Manipulations of variables are described explicitly.
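
As an illustration of how the ratio indicators of Table 3 relate to the raw counts, the sketch below derives them for a single institute. The formulas are read off the variable definitions and are assumptions, not taken from the CNR Report: per capita ratios divide by T_PERS, per researcher ratios divide by T_RES, and cost per publication is taken as total costs over publications. The function names and the example figures are hypothetical.

```python
LIRE_PER_EURO = 1936.27  # conversion rate quoted in the text


def ratio(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def derived_indicators(inst):
    """Compute per capita, per researcher and cost indicators for one institute."""
    share = ratio(inst["INTPUB"], inst["T_PUB"])
    return {
        "PUBPERS": ratio(inst["T_PUB"], inst["T_PERS"]),
        "IPUPERS": ratio(inst["INTPUB"], inst["T_PERS"]),
        "PUBRES": ratio(inst["T_PUB"], inst["T_RES"]),
        "IPURES": ratio(inst["INTPUB"], inst["T_RES"]),
        "P_INTPUB": None if share is None else 100.0 * share,
        # Assumption: costs per publication use total costs (T_COS).
        "COPUB": ratio(inst["T_COS"], inst["T_PUB"]),
        "COPUBINT": ratio(inst["T_COS"], inst["INTPUB"]),
    }


def lire_to_euro(amount_lire):
    """Convert an amount in Italian lire to euro at the fixed rate."""
    return amount_lire / LIRE_PER_EURO


# Hypothetical institute, for illustration only.
EXAMPLE = {"T_PUB": 40, "INTPUB": 30, "T_PERS": 20, "T_RES": 10, "T_COS": 2_000_000}
INDICATORS = derived_indicators(EXAMPLE)
```

Returning None for a zero denominator keeps institutes without publications or staff from producing division errors; how such cases were actually handled in the dataset is not stated.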

INSERM (France)

The INSERM database collects data on the number of researchers and publications of the INSERM institutes in 1997. The sample is based on 213 observations, which is almost the entire universe of institutes. We were able to access data on institutes by systematically visiting their websites and by addressing a mail survey to directors in 1999. Although data refer to one year only, they offer a comprehensive view of the activity of a large part of the French biomedical research system. The number of researchers is divided into three categories (INSERM researchers, researchers from hospital and university, other researchers); in addition, post-doc students (boursier) and technical-administrative personnel are included. For all institutes we define a geographical classification (see later). Although the INSERM dataset is less rich than the CNR dataset, the two share a subset of variables. The definition of variables is reported in Table 4.

Table 4. Variables in the dataset (all variables refer to INSERM institutes)

a) Size indicators
Variable   Definition
T_RES      Total number of researchers
T_PERS     Total number of personnel
TA_PERS    Technical and administrative personnel
INS_RES    INSERM researchers
OTH_RES    Other researchers
HU_RES     Hospital/university researchers
BORS       Doc and post-doc students or scholarship holders (boursier)

b) Scientific productivity and agglomeration indicators
Variable   Definition
INTPUB     Number of International Publications
IPUPERS    International Publications per capita
IPURES     International Publications per researcher
GAI        Geographical agglomeration index

Source: our elaboration on websites and electronic survey

Limitations of data

The limitations of the two datasets should not be underestimated. First of all, data refer to just one year for both CNR and INSERM. In the literature on and the economics of science it is well known that data on scientific publications should be averaged over some years, in order to take into account the inherent variability of the phenomenon over time.

All in all, the size of the two samples is so large and the aggregation by institute so fine that a picture over one year can still be considered reliable, at least with regard to broad patterns. Second, to make a meaningful analysis one must have data or proxies for all influencing variables. Now, in the analysis of scientific productivity most studies, and also this paper, do not use any proxy for capital equipment, and this constitutes a strong limit of the analysis. Third, we take as a definition of scientific production the number of total and international publications. For this research we had no access to data on individual publications, nor could we control for citations of CNR and INSERM publications.∗ In addition, we recognise that the output of activity is not limited to scientific publications but also includes patents, consulting, technology transfer to industry, hospitals and public administration in general, and, to a limited extent, teaching and the creation of spin-off companies. We do not have data on these joint outputs and are forced to stick to a view of output as represented by publications. However, we believe that the view that the main institutional output of CNR and INSERM should be scientific publications is fundamentally correct. Finally, all variables refer to individual institutes. No evidence is available on research teams and laboratories within institutes. This limitation should be clearly taken into account in examining the results.

Size effects: Does scientific productivity depend on size of institutes?

Evidence from CNR

We want to test the hypothesis that the average scientific productivity of researchers is positively influenced by the size of the institute to which they are affiliated. We computed Pearson correlation coefficients between pairs of variables.∗∗ Because we have to test a clear H0, a very simple correlation analysis is sufficient. Our aim is not to build a model of scientific productivity, for which data on all inputs should be included. More modestly (but more correctly from a methodological point of view), we work on the pars destruens, trying to demonstrate that assumed effects in scientific research quite simply fail to meet even the weakest statistical test.
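
A minimal sketch of this kind of test is given below. It computes the Pearson coefficient and an approximate two-tailed p-value for the null hypothesis of zero correlation. The p-value uses the statistic t = r·sqrt((n − 2)/(1 − r²)) together with a normal approximation to the t distribution, which is an assumption made here to stay within the Python standard library and is reasonable only for large samples. The function names are illustrative.

```python
import math
from statistics import NormalDist, mean


def pearson_r(x, y):
    """Pearson product-moment correlation between two equally long sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def two_tailed_p(r, n):
    """Approximate two-tailed p-value for H0: no correlation, given r and sample size n.

    Assumption: the t statistic is compared with a standard normal
    distribution, which is adequate only when n is large and |r| < 1.
    """
    t = r * math.sqrt((n - 2) / (1.0 - r * r))
    return 2.0 * (1.0 - NormalDist().cdf(abs(t)))
```

As a usage example, taking n = 187 (the number of CNR institutes given in the caption of Figure 1, assumed here to be the sample size behind each coefficient), a coefficient of magnitude 0.144 falls just inside the 0.05 threshold and one of magnitude 0.142 falls just outside it.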

∗ Further research is currently under way with the objective of using measures of individual productivity of scientists and relating them to productivity at the level of institutes.
∗∗ In a related paper with an explicit comparative approach (BONACCORSI & DARAIO, 2003a) we use Data Envelopment Analysis (DEA), Free Disposal Hull (FDH) and robust nonparametric techniques (order-m frontiers), which do not require a functional specification; see also BONACCORSI & DARAIO (2003b).

Table 5. Correlation between size of institutes and indicators of scientific output and productivity at CNR institutes

Variable   T_PUB     P_INTPUB   INTPUB     IPURES     IPUPERS    PUBRES
T_RES      0.759**   0.081      0.743**    –0.191**   –0.236**   –0.255**
T_PERS     0.722**   0.020      0.684**    –0.193**   –0.286**   –0.230**
ADM        0.581**   –0.136     0.457**    –0.180*    –0.269**   –0.122
TECH       0.620**   –0.003     0.586**    –0.171*    –0.289**   –0.197**
ORD_RES    0.640**   0.117      0.635**    –0.193**   –0.218**   –0.272**
SEN_RES    0.634**   –0.003     0.602**    –0.144*    –0.182*    –0.168*
DIR_RES    0.587**   0.100      –0.597**   –0.092     –0.163*    –0.142
LABCOS     0.734**   0.035      0.705**    –0.177*    –0.263**   –0.216**
T_COS      0.707**   0.041      0.702**    –0.111     –0.197**   –0.147*
RESFUN     0.539**   0.039      0.560**    –0.021     –0.089     –0.048

** Pearson Correlation is significant at the 0.01 level (2-tailed).
* Pearson Correlation is significant at the 0.05 level (2-tailed).

Table 6. Correlation between size of institutes and indicators of cost, impact factor, and market funds at CNR institutes

Variable   COPUB     COPUBINT   AVIM      P_MARFUN
T_RES      0.116     0.022      0.073     0.043
T_PERS     0.180*    0.091      0.024     0.079
ADM        0.122     0.102      –0.035    –0.079
TECH       0.217**   0.131      –0.006    0.126
ORD_RES    0.121     –0.002     0.061     0.189**
SEN_RES    0.075     0.055      0.043     –0.104
DIR_RES    0.075     –0.017     0.106     –0.064
LABCOS     0.168*    0.077      0.046     0.063
T_COS      0.180*    0.081      0.031     0.281**
RESFUN     0.156*    0.069      0.010     0.448**

** Pearson Correlation is significant at the 0.01 level (2-tailed).
* Pearson Correlation is significant at the 0.05 level (2-tailed).

The results are reported in Tables 5 and 6 for correlations on the whole CNR, and in Appendix B for correlations by Research Area. They are quite clear:
• in no scientific area is size positively correlated to productivity;
• in 3 out of 6 large scientific areas (chemistry, environment, physics) size as measured by total number of researchers is negatively and significantly correlated to productivity (number of international publications per researcher);
• in 4 out of 6 areas (agriculture, environment, chemistry, physics) size as measured by total number of personnel is negatively and significantly correlated to productivity (number of international publications per unit of personnel);
• in two areas in which indivisibility and large infrastructures may be at stake (i.e. medicine and engineering) the relation is not statistically significant, although it has a negative sign;
• contrary to the common wisdom, in almost all areas the most productive institutes are not found in the largest size classes, but in the small ones.

In agriculture, environment, chemistry and physics, the most productive institutes have 5–6 researchers. In medicine and biology one of the stars has around 10 researchers, but highly productive institutes can also be found in the range 5–10 researchers. In general, there is no positive relation between size and productivity. Although the most productive institutes are likely to be found in small size classes, the least productive are spread across all sizes.

Interestingly, the distributions of cost per publication and cost per international publication are again highly skewed. We are interested in checking whether the highly productive institutes are also those that spend more per publication. Clearly, if such a relation held, then a possible explanation for higher productivity would not lie in organizational factors or in the quality of the scientific environment, but rather in greater access to funds, complementary personnel, or external resources. The opposite holds true. Highly productive institutes spend fewer resources than less productive ones (see Appendix B). Higher scientific productivity does not originate from a greater consumption of resources.

Clearly, Pearson coefficients give only a rough global measure of association. They are all that is needed to reject the notion of global economies of scale in science. We are also interested, however, in exploring local effects that may hold only within a region of the range of the relevant independent variable. Rather than applying standard regression tools we use a nonparametric technique. The methodological choice is consistent with the notion that production functions, and hence the standard parametric regression-based econometric toolbox, suffer from severe conceptual problems and cannot be accepted for the economic analysis of science (BONACCORSI & DARAIO, 2004). Therefore we apply a locally weighted least-squares (Loess) technique (see CLEVELAND, 1993; 1994).
This is a local nonparametric regression method based on a generalization of running means. The technique gets a predicted value at each point by fitting a weighted linear regression, in which the weights decrease with the Euclidean distance from the point of interest. Connecting these predicted values produces a smooth curve. This method is interesting because it shows the existence of local effects in the causal relation between variables that would be overlooked by an average pattern in a standard parametric regression framework and clearly cannot be detected by simply using Pearson coefficients. In addition, Loess techniques provide a useful graphical representation, which facilitates a visual identification of local patterns. A locally weighted least-squares regression is hence used to obtain smoothed values of y, given the values of x, on a scatter plot of the associated points (see Figure 1 and Appendix A). In Figures 1a and b the x axis shows the size of the institute in terms of researchers and total personnel, respectively, and the y axis the productivity of researchers and of total personnel, respectively, in terms of international publications. A visual inspection of the plots shows that the initial interval is characterized by slightly decreasing returns to scale, while the rest of the size distribution is characterized by constant returns almost everywhere. In no region can we see segments of the plot indicating increasing returns to scale.
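
The locally weighted regression described above can be sketched in a few lines. This is an illustrative implementation, not the one used to produce the figures: the tricube weight function and the span parameter (the share of observations entering each local fit) are common choices assumed here, and no robustness iterations are performed.

```python
import math


def tricube(u):
    """Tricube kernel: weight 1 at distance 0, falling to 0 at |u| >= 1."""
    u = abs(u)
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0


def loess(x, y, span=0.5):
    """Fit a weighted straight line around each x[i] and return the fitted values."""
    n = len(x)
    k = min(n, max(3, math.ceil(span * n)))  # neighbours used in each local fit
    fitted = []
    for x0 in x:
        # Bandwidth: distance from x0 to its k-th nearest observation.
        h = sorted(abs(xi - x0) for xi in x)[k - 1]
        if h == 0:
            weights = [1.0 if xi == x0 else 0.0 for xi in x]
        else:
            weights = [tricube((xi - x0) / h) for xi in x]
        sw = sum(weights)
        mx = sum(w * xi for w, xi in zip(weights, x)) / sw
        my = sum(w * yi for w, yi in zip(weights, y)) / sw
        sxx = sum(w * (xi - mx) ** 2 for w, xi in zip(weights, x))
        sxy = sum(w * (xi - mx) * (yi - my) for w, xi, yi in zip(weights, x, y))
        slope = sxy / sxx if sxx > 0 else 0.0  # flat fit if all weight sits on one x
        fitted.append(my + slope * (x0 - mx))
    return fitted
```

With x set to a size variable such as T_RES and y to a productivity indicator such as IPURES, plotting the fitted values against x gives a smooth curve of the kind inspected in the text. A curve that slopes downward over an interval corresponds to locally decreasing returns, and a flat curve to constant returns.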

Figure 1. Loess plots of size vs. productivity indicators – whole CNR (187 Institutes) a) Size (T_RES) vs. productivity indicators (IPURES) b) Size (T_PERS) vs. productivity indicators (IPUPERS)

Evidence from INSERM

The same methodological approach was followed for the French INSERM.

Table 7 shows simple correlation coefficients between several indicators of size and productivity indicators based on categories of personnel. The results show that size effects are weakly negative for all researchers (T_RES) and all units of personnel (T_PERS). Total output grows linearly with all categories of personnel. Figure 2 shows the Locally weighted least-squares (Loess) curve fitting of productivity indicators versus size variables. Again, the visual inspection of Figures 2a and b shows that there is no region in the interval of size variables (number of researchers, or number of personnel) in which increasing returns emerge.

Figure 2. Loess plots of size vs. productivity indicators – INSERM a) Size (T_RES) vs. productivity indicators (IPURES) b) Size (T_PERS) vs. productivity indicators (IPUPERS)

Table 7. Correlation between size of institutes and indicators of scientific output and productivity at INSERM institutes

Variable   INTPUB    IPUPERS   IPURES
T_RES      0.547**   0.056     –0.172*
T_PERS     0.585**   –0.021    0.012
TA_PERS    0.499**   0.006     0.119
INS_RES    0.214**   –0.015    0.007
OTH_RES    0.387**   0.046     –0.162*
HU_RES     0.385**   0.054     –0.123
BORS       0.339**   –0.126    0.073

** Pearson Correlation is significant at the 0.01 level (2-tailed).
* Pearson Correlation is significant at the 0.05 level (2-tailed).

Discussion of results

These results go directly against much of the received wisdom in science policy making. To put it simply, there is no evidence on the existence and importance of increasing returns to scale in scientific research at the level of the institute. On the contrary, there is evidence of weak decreasing returns. Policies aimed at consolidating institutes or policies of concentration of funds on large institutes might be justified on the grounds of cost savings in administrative staff, but could have no justification with respect to the impact on scientific production. More precisely, we propose that the level at which increasing returns apply is not the institute, but the research team. At this level factors such as the access to physical capital, the number of complementary scientific competencies and the extent of division of labour significantly influence scientific productivity. Although there is only preliminary evidence on this effect,∗ we draw attention to the possibility that most policy discussions on critical mass and concentration of resources may be directed to the wrong target. It is not the administrative unit that matters, but the team and the laboratory. While in a few scientific fields research teams are defined around large physical infrastructures, so that administrative units and teams largely overlap, in most fields this is not the case. Pursuing a policy of concentration into larger institutes may miss the point, unless institutes adopt an internal policy of rewarding scientific excellence of teams by selectively allocating the internal resources.

∗ A formal test of the effect of team size on productivity would require micro-data that are extremely difficult to collect. We have very preliminary evidence, based on a subset of INSERM institutes for which we have data on the number and size of teams (n=72). Having checked that this subset is not significantly different from the rest of the sample, we ran a correlation analysis between productivity (PUB_RES) and, respectively, the size of the institute (T_RES) and the size of the team. Interestingly, the Pearson coefficient is positive and significant for the size of the team.

Even worse, there is the possibility that the concentration of institutes reduces productivity. If there are regions in the size interval where decreasing returns to scale apply, it is possible that consolidation moves ‘efficient’ institutes into regions of lower efficiency. This possibility would be overlooked by considering only average relations between size and productivity. In any case, a policy of consolidation of institutes makes sense if and only if it is associated with a policy for promoting adequate size of research teams and laboratories. However, promoting the adequate size of teams and laboratories requires a policy of recognition of scientific talent whenever and wherever it is demonstrated, which almost invariably means without formal central planning. Policy makers and administrators of large public research institutions feel more confident with discretionary planning than with recognition of quality. Building large institutes is politically easier than allowing promising teams to grow whatever their institute.

Agglomeration effects: Does scientific productivity depend on geographical concentration of institutes?

Evidence from CNR

To account for the influence of proximity between research institutes we constructed the Geographical Agglomeration Index (GAI) as follows. To each institute we assigned one point for each other CNR institute located in the same city that is not of the same research aggregation, and two points for each other CNR institute located in the same city that is also of the same research aggregation as the institute considered. The resulting GAI ranges from 1 to 39, varying between 39 and 33 for the institutes located in Rome, from 23 to 20 for the institutes located in Naples, from 16 to 14 for the institutes located in Pisa and so on. An institute has a GAI of 1 if it is the only CNR institute in its own town. Then we tested the existence of a relation between GAI and several measures of scientific productivity. Results are shown in Table 8. See Appendix B for results per research area. As is clear from Table 8, there is no evidence that institutes that benefit from a strong agglomeration effect have higher productivity.
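
The scoring rule can be sketched as follows. The code is illustrative only. It assumes a baseline score of 1 for each institute, which is inferred from the statement that an isolated institute has a GAI of 1 and is not spelled out in the construction above. The institute names, cities and aggregations in the example are hypothetical.

```python
def geographical_agglomeration_index(institutes):
    """Compute the GAI for each institute.

    institutes maps an institute name to a (city, aggregation) pair.
    """
    gai = {}
    for name, (city, aggregation) in institutes.items():
        score = 1  # assumed baseline, so that an isolated institute scores 1
        for other, (other_city, other_aggregation) in institutes.items():
            if other == name or other_city != city:
                continue  # skip the institute itself and institutes elsewhere
            # Two points for a neighbour in the same research aggregation,
            # one point for a neighbour in a different aggregation.
            score += 2 if other_aggregation == aggregation else 1
        gai[name] = score
    return gai


# Hypothetical example: three institutes in one city, one isolated institute.
EXAMPLE_INSTITUTES = {
    "I1": ("Rome", "MA4"),
    "I2": ("Rome", "MA4"),
    "I3": ("Rome", "MA5"),
    "I4": ("Trento", "MA1"),
}
EXAMPLE_GAI = geographical_agglomeration_index(EXAMPLE_INSTITUTES)
```

In the example, I1 and I2 each score 4 (baseline 1, plus 2 for the neighbour in the same aggregation, plus 1 for I3), I3 scores 3 and the isolated I4 scores 1.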

Table 8. Correlation between GAI and indicators of scientific productivity – whole CNR

Variable   IPURES   IPUPERS   PUB_PERS   PUB_RES   INTPUB
GAI        0.051    –0.012    –0.005     0.068     0.151*

** Pearson Correlation is significant at the 0.01 level (2-tailed).
* Pearson Correlation is significant at the 0.05 level (2-tailed).

Evidence from INSERM

In the case of INSERM we construct the Geographical Agglomeration Index (GAI) in the same way as for CNR, assigning a score of 2 for institutes in the same category within biomedical research. The absolute level of GAI is clearly not comparable between INSERM and CNR, but this does not affect the results. Correlations between GAI and several productivity indicators are shown in Table 9.

Table 9. Correlation between GAI and indicators of scientific productivity – INSERM

Variable   IPURES   IPUPERS   INTPUB
GAI        0.150*   0.217**   0.161*

** Pearson Correlation is significant at the 0.01 level (2-tailed).
* Pearson Correlation is significant at the 0.05 level (2-tailed).

In this case we find evidence of a positive effect even if it is not so strong. It seems that institutes located in the same area are more productive, while isolated institutes suffer.

Discussion of results

The combined evidence on the impact of agglomeration on scientific productivity is mixed. The most productive institutes at CNR are not necessarily located close to other institutes. At the same time, isolated institutes at INSERM are penalised in their productivity. A possible explanation of the observed effect lies in the difference in the institutional linkage with universities. In the French system, large public research organisations such as CNRS, INSERM or INRA were put in systematic relation with universities only during the 1990s, through the creation of joint institutes, exchange of researchers and the like. In the Italian CNR, on the contrary, the linkage with universities has historically been very strong. This means that an institute located outside a CNR Research Area but close to a good university may benefit from positive effects, while this is more difficult for INSERM institutes. Given that CNR data cover many scientific sectors, we tend to give them more weight in balancing the evidence. Summing up, the evidence does not support the received wisdom that agglomeration per se is positive. It reveals a conceptual flaw in the argument of agglomeration: it is not agglomeration that induces scientific productivity, but rather the quality of research that attracts other scientists and induces agglomeration effects.

Conclusions

We found no support at all for size effects and no strong support for agglomeration effects. Although in some scientific fields local effects due to scale and agglomeration have been identified (see Appendix A for the graphical inspection of these local effects in the CNR case, at disaggregated level), they are clearly restricted to small regions in the size interval. As is clearly shown by the plots, no general pattern emerges from the data supporting the scale and agglomeration effects. The argument that scientific productivity is favoured by concentration of resources into larger institutes, and by geographical agglomeration of institutes in the same area, does not receive empirical support. If this is the case, why are policies aimed at critical mass, concentration and agglomeration so widespread? A possible interpretation is that decisions about the size and the location of institutes are among the few in which a full discretionary power of politicians, government officials and public research central bureaucracies can be exercised. Deciding where to locate new institutes and how large they must be is a source of significant power that can be shared among interested parties (politicians, administrators, scientists). A more benevolent interpretation is that the top management of large public research organisations faces strong pressures for bringing research activities into new regions, particularly less developed regions. Having a strong argument in favour of concentration and agglomeration may help to resist fragmentation tendencies. It must be stressed again that policies aimed at concentration and agglomeration may have (and indeed often have) strong merits from the point of view of administrative and organisational efficiency. Unfortunately, the causal mechanism implicitly assumed in these policies does not hold from an empirical point of view.
These policies should always be pursued with a clear view to the need to promote scientific excellence, whatever the size and location involved. Evidence-based science policy should consider carefully these points.

*

Part of the evidence of this paper has been presented at the conference Rethinking Science Policy, held at the SPRU (Brighton, 21–23 March, 2002), at the 7th International Science and Technology Indicators Conference (Karlsruhe, 25–28 September 2002), at seminars at ISPRI-CNR (Rome) and INSERM (Marseille) and further developed within the AQuaMethPSR (Advanced Quantitative Methods for the evaluation of Public Sector Research) project under the PRIME Network of Excellence, 6th Framework Programme. We thank participants for stimulating comments. We would like to thank Marco Brancher for assistance in building the database. Work partially supported by the Italian Registry of ccTLD.it. We gratefully acknowledge the helpful suggestions of two anonymous referees. The usual disclaimers apply.

References

ACS, Z. J., AUDRETSCH, D. B., FELDMAN, M. P. (1992) Real effects of academic research. Comment, American Economic Review, 82 (1) : 363–367. ACS, Z. (Ed.) (2000), Regional Innovation, Knowledge and Global Change. London, Pinter. ADAMS, J. D., GRILICHES, Z. (2000), Research productivity in a system of universities, In: D. ENCAOUA et al. (Eds), The Economics and Econometrics of Innovation, Kluwer, Dordrecht, pp. 105–140. ALMEIDA, P., KOGUT, B. (1999), Localization of knowledge and the mobility of engineers in regional networks, Management Science, 45 (7) : 905–917. AUDRETSCH, D. B., FELDMAN, M. P. (1996), R&D spillovers and the geography of innovation and production, American Economic Review, 86 (3) : 630–640. AUTANT-BERNARD, C. (2001), Science and knowledge flows; Evidence from the French case, Research Policy, 30 : 1069–1078. BONACCORSI, A., DARAIO, C. (2003a), A robust nonparametric approach to the analysis of scientific productivity, Research Evaluation, 12 (1) : 47–69. BONACCORSI, A., DARAIO, C. (2003b), Age effects in scientific productivity. The case of the Italian National Research Council (CNR), Scientometrics, 58 : 47–88. BONACCORSI, A., DARAIO, C. (2004), Econometric approaches to the analysis of productivity of R&D systems. Production functions and production frontiers, In: H. F. MOED, W. GLÄNZEL, U. SCHMOCH (Eds), Handbook of Quantitative Science and Technology Research, Kluwer, Dordrecht, pp. 51–74. BOTTAZZI, L., PERI, G. (2003), Innovation and spillovers in regions: Evidence from European data, European Economic Review, 47 (4) : 687–710. BRESCHI, S., LISSONI, F. (2004), Knowledge networks from patent data, In: H. F. MOED, W. GLÄNZEL, U. SCHMOCH (Eds), Handbook of Quantitative Science and Technology Research, Kluwer, Dordrecht, pp. 613–644. BRINKMAN, P. T. (1981), Factors affecting instructional costs at major research universities, Journal of Higher Education, 52 : 265–279. BRINKMAN, P. T., LESLIE, L. L. 
(1986), Economies of scale in higher education: Sixty years of research, The Review of Higher Education, 10 (1) : 1–28. CASTELLS, M., HALL, P. (1994), Technopoles of the World. The Making of the 21st Century Industrial Complexes. London, Routledge. CLEVELAND, W. S. (1993), Visualizing Data, Hobart Press, New Jersey. CLEVELAND, W. S. (1994), The Elements of Graphing Data, Hobart Press, New Jersey. COHEN, J. E. (1991), Size, age and productivity of scientific and technical research groups, Scientometrics, 20 : 395–416. COHN, E., RHINE, S. L. W., SANTOS, M. C. (1989), Institutions of higher education as multi-product forms: Economies of scale and scope, Review of Economics and Statistics, 71 (May) : 284–290. COLE, S., COLE, J., SIMON, G. (1981), Change and consensus in peer review, Science, 214 : 881–886. COOKE, P., MORGAN, K. (1998), The Associational Economy. Firms, Regions and Innovation. Oxford, Oxford University Press. COWAN, R., DAVID, P. A., FORAY, D. (2000), The explicit economics of knowledge codification and tacitness, Industrial and Corporate Change, 9 : 211–254. DE GROOT, H., MCMAHON, W. W., VOLKWEIN, J. F. (1991), The cost structure of American research universities. Review of Economics and Statistics, 424–451. GETZ, M., SIEGFRIED, J. J., ZHANG, H. (1991), Estimating economies of scale in higher education, Economics Letters, 37 : 203–208. HALSEY, A. H. (1980), Higher Education in Britain – A Study of University and Polytecnhnic Teachers, Final report on SSRC Grant. HURLEY, J. (1997), Organisation and Scientific Discovery, Wiley, Chichester NY.

Scientometrics 63 (2005) 111 A. BONACCORSI, C. DARAIO: Size and agglomeration effects

HUSSLER, C., RONDE, P. (2004), When cognitive communities check the diffusion of academic knowledge: Evidence from the networks of inventors of a French university, paper presented at the Workshop. The Empirical Economic Analysis of the Academic Sphere, March 17, 2004, BETA, Univ. Louis Pasteur, Strasbourg (France). JAFFE, A. B. (1989), Real effects of academic research, American Economic Review, 79 (5) : 957–970. JAFFE, A. B., TRAJTENBERG, M., HENDERSON, R. (1993), Geographic localization of knowledge spillovers as evidenced by patent citations, Quarterly Journal of Economics, 108 : 577–598. JOHNSTON, R. (1994), Effects of resource concentration on research performance, Higher Education, 28 (1) : 25–37. KRETSCHMER, H. (1985), Cooperation structure, group size and productivity in research groups, Scientometrics, 7 (1–2) : 39–53. KRUGMAN, P. (1991), Increasing returns and economic geography, Journal of Political Economy, 99 (3) : 483–499. KYVIK, S. (1995), Are big universities departments better than small ones? Higher Education, 30 (3) : 295–304. LAREDO, P., MUSTAR, P. (Eds) (2001), Research and Innovation Policies in the New Global Economy. An International Comparative Analysis, Edward Elgar. LATOUR, B., WOOLGAR, S. (1979), Laboratory Life, Sage, London. LINK, A. N. (1996), Economic performance measures for evaluating government sponsored research, Scientometrics, 36 : 325–342. LLOYD, P., MORGAN, M., WILLIAMS, R. (1993), Amalgamations of universities: Are there economies of size and scope? Applied Economics, 25 : 1081–1092. MARSHALL, M. (1920), Principles of Economics, London, MacMillan. MARTIN, S. (2002), Advanced Industrial Economics, Blackwell Publishers, Malden. MILGROM, P., ROBERTS, J. (1992), Economics, Organization and Management, Prentice Hall, Englewood Cliffs. NARIN, F., HAMILTON, K. S. (1996), Bibliometric performance measures, Scientometrics, 36 : 293–310. NELSON, R., HEVERT, K. T. 
(1992), Effect of class size on economies of scale and marginal costs in higher education, Applied Economics, 24 : 473–482. PRATTEN, C. F. (1971), Economies of Scale in Manufacturing Industry, Cambridge University Press, Cambridge. PYKE, F., BECATTINI, G., SENGENBERGER, W. (1986), Industrial Districts and Inter-Firm Co-operation in Italy, International Labour Office, Geneve. QURASHI, M. M. (1991), Publication-rate and size of two prolific research groups in departments of inorganic-chemistry at Dacca University (1944–1965) and zoology at Karachi University (1966–84), Scientometrics, 20 (1) : 79–92. QURASHI, M. M. (1993), Dependence of publication-rate on size of some university groups and departments in UK and Greece in comparison with NCI, USA, Scientometrics, 27 : 19–38. RAMSDEN, P. (1994), Describing and explaining research productivity, Higher Education, 28 : 207–226. RESKIN, B. F. (1977), Scientific productivity and the reward structure of science, American Sociological Review, 42 : 491–504. ROSENTHAL, S. R., STRANGE, W. C. (2004), Evidence on the nature and sources of agglomeration economies, In: J. V. HENDERSON, J. F. THISSE (Eds), Handbook of Urban and Regional Economics, Volume 4, New York, North Holland. SAXENIAN, A. (1996), Regional Advantage. Culture and Competition in Silicon Valley and Route 128. Boston, Harvard University Press. SCHERER, F. M. (1980), Industrial Market Structure and Economic Performance, Houghton Mifflin, Boston. SCOTT, A. J. (Ed.) (2001), Global City-Regions. Oxford, Oxford University Press. SEGLEN, P. O., AKSNES, D. W. (2000), Scientific productivity and group size: a bibliometric analysis of Norwegian microbiological research, Scientometrics, 49 : 125–143.

112 Scientometrics 63 (2005) A. BONACCORSI, C. DARAIO: Size and agglomeration effects

SHINN, T. (1979), The French science faculty system, 1808–1914, Historical Studies in the Physical Sciences, 10. SHINN, T. (1982), Scientific disciplines and organizational specificity, In: N. ELIAS et al. (Eds), Scientific Establishment and Hierarchies, Sociology of Sciences Yearbook 6, Reidel, Dordrecht. VON TUNZELMANN, N., RANGA, M., MARTIN, B., GEUNA, A. (2003), The Effects of Size on Research Performance: A SPRU Review, Report prepared for the Office of Science and Technology, Department of Trade and Industry. WHITLEY, R. (1984), The Intellectual and Social Organization of the Sciences, Oxford University Press, Oxford, second edition, 2000. ZUCKER, L., DARBY, M., ARMSTRONG, J. (1998), Intellectual capital and the firm: The technology of geographically localized knowledge spillovers, Economic Inquiry, 36 : 65–86.


Appendix A

Loess plots of size vs. productivity indicators by research area at CNR
Disaggregates for Figure 1

[Figure panels, by research area: a) Size (T_RES) vs. productivity indicator (IPURES); b) Size (T_PERS) vs. productivity indicator (IPUPERS)]
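The panels in this appendix are loess smooths of a productivity indicator against institute size. As a rough illustration of how such a curve can be obtained, the following is a minimal sketch of locally weighted linear regression with tricube weights. It assumes local linear fits and no robustness iterations; the function names, the `span` value, and the sample data are hypothetical and are not taken from the paper, whose exact loess settings are not reported here.

```python
import math


def tricube(u):
    """Tricube kernel: (1 - |u|^3)^3 inside the window, 0 outside."""
    u = abs(u)
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0


def loess_fit(x, y, span=0.75):
    """Fit a local weighted line around each x value and return the fitted y.

    span is the fraction of points used in each local neighbourhood
    (a hypothetical default, not a setting reported in the paper).
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two equal-length samples with at least 3 points")
    k = min(n, max(3, math.ceil(span * n)))  # neighbourhood size
    fitted = []
    for x0 in x:
        # Bandwidth: distance from x0 to its k-th nearest point.
        h = sorted(abs(xi - x0) for xi in x)[k - 1]
        if h == 0.0:
            # Degenerate case: k points coincide with x0, so average them.
            w = [1.0 if xi == x0 else 0.0 for xi in x]
        else:
            w = [tricube((xi - x0) / h) for xi in x]
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx if sxx > 0.0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted


# Hypothetical data: institute size (researchers) vs. publications per researcher.
size = [5, 8, 12, 15, 20, 26, 33, 41, 50, 62, 75, 90]
productivity = [3.1, 2.7, 2.9, 2.4, 2.5, 2.0, 2.1, 1.8, 1.9, 1.5, 1.6, 1.3]
for s, f in zip(size, loess_fit(size, productivity)):
    print(f"size={s:3d}  smoothed productivity={f:.2f}")
```

A smaller `span` follows the data more closely, while a larger one gives a smoother curve; a production analysis would more likely rely on an established statistical package than on this sketch.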


Appendix B

Correlations by research area at CNR
Disaggregates for Tables 5 and 6

Correlation between size of institutes and indicators of scientific output and productivity

a) MA1 Agriculture

Variable   T_PUB      P_INTPUB   INTPUB     IPURES     IPUPERS    PUBRES
T_RES      0.704**    0.228      0.646**   –0.252     –0.253     –0.395
T_PERS     0.649**    0.237      0.596**   –0.094     –0.419*    –0.224
ADM        0.395     –0.092      0.230     –0.187     –0.398     –0.137
TECH       0.524**    0.260      0.510*     0.043     –0.429*    –0.089
ORD_RES    0.489*     0.081      0.403     –0.275     –0.084     –0.349
SEN_RES    0.444*     0.114      0.392     –0.155     –0.339     –0.226
DIR_RES    0.508*     0.406*     0.605**    0.069     –0.147     –0.130
LABCOS     0.617**    0.257      0.586**   –0.124     –0.424*    –0.274
T_COS      0.663**    0.271      0.630**   –0.084     –0.388     –0.241
RESFUN     0.676**    0.259      0.646**    0.066     –0.185     –0.081

b) MA2 Environment and habitat, Geology and Mining

Variable   T_PUB      P_INTPUB   INTPUB     IPURES     IPUPERS    PUBRES
T_RES      0.849**   –0.243      0.696**   –0.423*    –0.442*    –0.485*
T_PERS     0.796**   –0.301      0.617**   –0.489*    –0.514**   –0.509
ADM        0.684**   –0.389*     0.449*    –0.525**   –0.523**   –0.464*
TECH       0.738**   –0.314      0.560**   –0.513**   –0.549**   –0.516**
ORD_RES    0.751**   –0.229      0.590**   –0.362     –0.390*    –0.436*
SEN_RES    0.735**   –0.176      0.650**   –0.376     –0.391*    –0.417*
DIR_RES    0.635**   –0.232      0.469*    –0.330     –0.312     –0.345
LABCOS     0.810**   –0.291      0.634**   –0.475*    –0.498**   –0.492*
T_COS      0.832**   –0.267      0.682**   –0.443*    –0.485*    –0.462*
RESFUN     0.800**   –0.195      0.712**   –0.340     –0.415*    –0.364

c) MA3 Biotechnologies and molecular biology, Medicine and Biology

Variable   T_PUB      P_INTPUB   INTPUB     IPURES     IPUPERS    PUBRES
T_RES      0.757**   –0.252      0.843**   –0.296     –0.349     –0.188
T_PERS     0.735**   –0.185      0.841**   –0.233     –0.341     –0.148
ADM        0.818**   –0.286      0.720**   –0.124     –0.189      0.055
TECH       0.624**   –0.099      0.771**   –0.182     –0.327     –0.141
ORD_RES    0.652**   –0.245      0.745**   –0.339     –0.375     –0.242
SEN_RES    0.741**   –0.268      0.767**   –0.153     –0.215     –0.033
DIR_RES    0.509**   –0.055      0.606**   –0.186     –0.231     –0.148
LABCOS     0.734**   –0.170      0.847**   –0.220     –0.318     –0.140
T_COS      0.672**   –0.086      0.831**   –0.107     –0.196     –0.068
RESFUN     0.564**   –0.030      0.732**   –0.034     –0.107     –0.022

* Pearson Correlation is significant at the 0.05 level (2-tailed).
** Pearson Correlation is significant at the 0.01 level (2-tailed).


d) MA4 Chemistry

Variable   T_PUB      P_INTPUB   INTPUB     IPURES     IPUPERS    PUBRES
T_RES      0.551**    0.017      0.580**   –0.634**   –0.560**   –0.625**
T_PERS     0.544**   –0.060      0.547**   –0.644**   –0.656**   –0.619**
ADM        0.298     –0.018      0.311     –0.374     –0.497**   –0.353
TECH       0.436*    –0.145      0.406*    –0.535**   –0.616**   –0.495*
ORD_RES    0.232     –0.237      0.155     –0.562**   –0.466*    –0.514**
SEN_RES    0.541**    0.223      0.647**   –0.391*    –0.359     –0.415*
DIR_RES    0.520**    0.103      0.574**   –0.454*    –0.439*    –0.471*
LABCOS     0.580**    0.000      0.608**   –0.628**   –0.632**   –0.615**
T_COS      0.548**   –0.005      0.571**   –0.605**   –0.601**   –0.595**
RESFUN     0.410*    –0.016      0.418*    –0.482*    –0.458*    –0.481*

e) MA5 Physics

Variable   T_PUB      P_INTPUB   INTPUB     IPURES     IPUPERS    PUBRES
T_RES      0.801*    –0.465*     0.772**   –0.402*    –0.457*    –0.229
T_PERS     0.772**   –0.454*     0.743**   –0.374*    –0.523**   –0.206
ADM        0.767**   –0.445*     0.752**   –0.280     –0.435*    –0.091
TECH       0.669**   –0.401*     0.642**   –0.326     –0.546**   –0.181
ORD_RES    0.598**   –0.300      0.596**   –0.386*    –0.428*    –0.266
SEN_RES    0.725**   –0.459*     0.685**   –0.322     –0.342     –0.158
DIR_RES    0.736**   –0.438*     0.698**   –0.270     –0.410*    –0.107
LABCOS     0.799**   –0.474*     0.765**   –0.358     –0.494**   –0.180
T_COS      0.803**   –0.411*     0.788**   –0.311     –0.461*    –0.151
RESFUN     0.632**   –0.159      0.668**   –0.122     –0.275     –0.045

f) MA6 Engineering and architecture, Innovation and Technology

Variable   T_PUB      P_INTPUB   INTPUB     IPURES     IPUPERS    PUBRES
T_RES      0.774**    0.043      0.705**   –0.210     –0.230     –0.272
T_PERS     0.704**   –0.082      0.590**   –0.229     –0.260     –0.246
ADM        0.380*    –0.184      0.249     –0.198     –0.239     –0.156
TECH       0.637**   –0.132      0.512**   –0.221     –0.253     –0.218
ORD_RES    0.635**    0.128      0.586**   –0.255     –0.236     –0.343
SEN_RES    0.779**   –0.100      0.681**   –0.135     –0.185     –0.135
DIR_RES    0.562**    0.046      0.556**   –0.026     –0.103     –0.095
LABCOS     0.730**   –0.081      0.616**   –0.215     –0.245     –0.233
T_COS      0.729**   –0.060      0.618**   –0.168     –0.207     –0.193
RESFUN     0.665**   –0.022      0.568**   –0.075     –0.128     –0.112

* Pearson Correlation is significant at the 0.05 level (2-tailed).
** Pearson Correlation is significant at the 0.01 level (2-tailed).


Correlation between size of institutes and indicators of cost, impact factor, and market funds

a) MA1 Agriculture

Variable   COPUB      COPUBINT   AVIM       P_MARFUN
T_RES      0.249      0.113      0.091     –0.305
T_PERS     0.391      0.231     –0.017     –0.500*
ADM        0.350      0.314     –0.026     –0.454*
TECH       0.397      0.231     –0.077     –0.515*
ORD_RES   –0.011     –0.014      0.148      0.050
SEN_RES    0.386      0.254     –0.034     –0.499*
DIR_RES    0.303      0.040     –0.010     –0.428*
LABCOS     0.467*     0.286     –0.067     –0.561**
T_COS      0.456*     0.279     –0.059     –0.472*
RESFUN     0.322      0.197     –0.020     –0.074

b) MA2 Environment and habitat, Geology and Mining

Variable   COPUB      COPUBINT   AVIM       P_MARFUN
T_RES      0.481*     0.369      0.063     –0.089
T_PERS     0.554**    0.449*    –0.022     –0.101
ADM        0.500**    0.488*    –0.086     –0.111
TECH       0.604**    0.487*    –0.078     –0.104
ORD_RES    0.399*     0.338      0.087      0.086
SEN_RES    0.444*     0.269      0.053     –0.237
DIR_RES    0.369      0.380     –0.051     –0.163
LABCOS     0.562**    0.443*    –0.004     –0.125
T_COS      0.575**    0.415*     0.014     –0.003
RESFUN     0.549**    0.324      0.046      0.233

c) MA3 Biotechnologies and molecular biology, Medicine and Biology

Variable   COPUB      COPUBINT   AVIM       P_MARFUN
T_RES      0.497**    0.380     –0.007      0.405*
T_PERS     0.529**    0.382*    –0.015      0.423*
ADM        0.154      0.104     –0.079      0.110
TECH       0.562**    0.394*    –0.007      0.447*
ORD_RES    0.522**    0.428*    –0.070      0.469*
SEN_RES    0.269      0.200     –0.039      0.181
DIR_RES    0.429*     0.253      0.237      0.285
LABCOS     0.526**    0.370      0.006      0.439*
T_COS      0.559**    0.389*    –0.004      0.725**
RESFUN     0.516**    0.357     –0.009      0.803**

* Pearson Correlation is significant at the 0.05 level (2-tailed).
** Pearson Correlation is significant at the 0.01 level (2-tailed).


d) MA4 Chemistry

Variable   COPUB      COPUBINT   AVIM       P_MARFUN
T_RES      0.463*     0.430*     0.035     –0.276
T_PERS     0.532**    0.530**    0.009     –0.246
ADM        0.267      0.248      0.094     –0.238
TECH       0.522**    0.563**   –0.046     –0.140
ORD_RES    0.435*     0.500**   –0.374      0.046
SEN_RES    0.239      0.133      0.366     –0.420*
DIR_RES    0.382      0.340      0.163     –0.289
LABCOS     0.510**    0.489*     0.095     –0.307
T_COS      0.544**    0.535**   –0.021     –0.183
RESFUN     0.565**    0.589**   –0.299      0.138

e) MA5 Physics

Variable   COPUB      COPUBINT   AVIM       P_MARFUN
T_RES      0.314      0.505**    0.029     –0.260
T_PERS     0.389*     0.575**    0.018     –0.244
ADM        0.205      0.360      0.104     –0.398*
TECH       0.447*     0.614**   –0.007     –0.182
ORD_RES    0.372      0.484**   –0.040     –0.148
SEN_RES    0.171      0.368      0.077     –0.281
DIR_RES    0.275      0.457*     0.033     –0.217
LABCOS     0.352      0.551**    0.058     –0.260
T_COS      0.387*     0.547**   –0.018     –0.103
RESFUN     0.387*     0.412*    –0.205      0.316

f) MA6 Engineering and architecture, Innovation and Technology

Variable   COPUB      COPUBINT   AVIM       P_MARFUN
T_RES      0.005     –0.097     –0.074      0.028
T_PERS     0.053      0.003     –0.223      0.060
ADM       –0.008     –0.002     –0.250     –0.230
TECH       0.084      0.063     –0.278      0.116
ORD_RES    0.120     –0.040     –0.108      0.104
SEN_RES   –0.130     –0.126      0.038     –0.017
DIR_RES   –0.094     –0.154     –0.171     –0.184
LABCOS     0.015     –0.019     –0.219      0.051
T_COS      0.019     –0.012     –0.253      0.184
RESFUN     0.025      0.002     –0.284      0.385*

* Pearson Correlation is significant at the 0.05 level (2-tailed).
** Pearson Correlation is significant at the 0.01 level (2-tailed).


Correlation between GAI and scientific productivity indicators (IPURES, IPUPERS, PUB_PERS, PUB_RES, INTPUB)

Area (GAI)  IPURES     IPUPERS    PUB_PERS   PUB_RES    INTPUB
MA1         0.161      0.080      0.101      0.189     –0.048
MA2        –0.188     –0.160     –0.066      0.019     –0.177
MA3         0.354      0.342      0.424*     0.460*     0.256
MA4        –0.270     –0.283     –0.285     –0.273      0.174
MA5         0.104      0.185      0.263      0.172     –0.037
MA6        –0.100     –0.142     –0.177     –0.184      0.424*

* Pearson Correlation is significant at the 0.05 level (2-tailed).
** Pearson Correlation is significant at the 0.01 level (2-tailed).
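
The entries in the tables of this appendix are Pearson correlation coefficients flagged at the 0.05 and 0.01 levels (2-tailed). As an illustration of how one such entry and its flag can be computed, here is a minimal sketch using the standard t test for a zero correlation with n - 2 degrees of freedom. The function names and the sample data are hypothetical; the number of institutes per research area is not reported in this appendix, so the sample size below is an assumption, and the authors' statistical software may differ in detail.

```python
import math


def pearson_r(x, y):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two equal-length samples with at least 3 points")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def two_tailed_p(r, n, steps=2000):
    """Two-tailed p-value for H0: rho = 0, from the t distribution with n - 2 df.

    With t = r * sqrt(df / (1 - r^2)) and the substitution t = sqrt(df) * tan(theta),
    P(|T| < t) = 2K * integral from 0 to asin(|r|) of cos(theta)^(df - 1),
    where K = Gamma((df + 1) / 2) / (sqrt(pi) * Gamma(df / 2)).
    The integral is evaluated with Simpson's rule (steps must be even).
    """
    df = n - 2
    upper = math.asin(min(1.0, abs(r)))
    h = upper / steps

    def integrand(theta):
        return math.cos(theta) ** (df - 1)

    total = integrand(0.0) + integrand(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    integral = total * h / 3.0
    log_k = math.lgamma((df + 1) / 2) - math.lgamma(df / 2) - 0.5 * math.log(math.pi)
    central = 2.0 * math.exp(log_k) * integral  # P(|T| < t)
    return min(1.0, max(0.0, 1.0 - central))


def stars(p):
    """Significance flag in the convention of the table notes."""
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return ""


# Hypothetical data for 12 institutes: size (researchers) vs. publications per researcher.
size = [12, 25, 31, 40, 48, 55, 63, 70, 82, 95, 110, 130]
pubs_per_researcher = [2.9, 2.4, 2.6, 2.1, 2.2, 1.8, 1.9, 1.5, 1.6, 1.2, 1.3, 0.9]
r = pearson_r(size, pubs_per_researcher)
p = two_tailed_p(r, len(size))
print(f"r = {r:.3f}{stars(p)} (p = {p:.4f}, n = {len(size)})")
```

Because the threshold for significance depends on the sample size, the same coefficient can be flagged in one research area and not in another when the areas contain different numbers of institutes.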
