<<

Delft University of Technology

The Cognitive Infrastructures of Markets Empirical Studies on the Role of Categories in Valuation and Competition, and a Formal Theory of Classification Systems Based on Lattices and Order Piazzai, Michele DOI 10.4233/uuid:25e95813-c9e9-4eae-a1e7-f24920fbe592 Publication date 2018 Document Version Final published version Citation (APA) Piazzai, M. (2018). The Cognitive Infrastructures of Markets: Empirical Studies on the Role of Categories in Valuation and Competition, and a Formal Theory of Classification Systems Based on Lattices and Order. https://doi.org/10.4233/uuid:25e95813-c9e9-4eae-a1e7-f24920fbe592 Important note To cite this publication, please use the final published version (if applicable). Please check the document version above.

Copyright Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons. Takedown policy Please contact us and provide details if you believe this document breaches copyrights. We will remove access to the work immediately and investigate your claim.

This work is downloaded from Delft University of Technology. For technical reasons the number of authors shown on this cover page is limited to a maximum of 10. The Cognitive Infrastructures of Markets Empirical Studies on the Role of Categories in Valuation and Competition, and a Formal Theory of Classification Systems Based on Lattices and Order

The Cognitive Infrastructures of Markets Empirical Studies on the Role of Categories in Valuation and Competition, and a Formal Theory of Classification Systems Based on Lattices and Order

Proefschrift

ter verkrijging van de graad van doctor aan de Technische Universiteit Delft, op gezag van de Rector Magnicus prof.dr.ir. T.H.J.J. van der Hagen, voorzitter van het College voor Promoties, in het openbaar te verdedigen op donderdag 22 februari 2018 om 10:00 uur

door Michele Piazzai

Master of Arts in Arts & Culture, Erasmus Universiteit Rotterdam, Nederland, geboren te Avezzano, Italië. This dissertation has been approved by the promotors: Dr. A. Palmigiano, Prof. I. R. van de Poel, Prof. N. M. Wijnberg

Composition of the doctoral committee: Rector Magnicus, Chairman Prof. N. M. Wijnberg, University of Amsterdam Prof. I. R. van de Poel, Delft University of Technology Dr. A. Palmigiano, Delft University of Technology

Independent members: Prof. M. T. Hannan, Stanford University Prof. M. Ma, Sun Yat-sen University Prof. N. Doorn, Delft University of Technology Dr. W. Conradie, University of Johannesburg

Reserve member: Prof. M. J. van den Hoven, Delft University of Technology

Keywords: Categorization, organization theory, applied logic Design: Michele Piazzai, Delft University of Technology Cover: Standard Industrial Classication (background), concept lattice (back) Printed by: ProefschriftMaken || www.proefschriftmaken.nl

Funded by the Netherlands Organisation for Scientic Research, Vidi 016-138-314. Copyright © 2018 by Michele Piazzai. All rights reserved. No part of this book may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying, recording, or by any information storage and retrieval system, without permission in writing from the author. ISBN 978-90-90306-12-4

An electronic version of this dissertation is available at https://repository.tudelft.nl/. What one wishes to gain from one’s categories is a great deal of information about the environment while conserving nite resources as much as possible. Eleanor Rosch on the principle of cognitive economy

Contents

Summary xi Samenvatting xv Preface xix 1 Introduction1 1.1 Categories: Why Bother?...... 2 1.2 Main Themes...... 7 1.2.1 Cognitive Economy...... 7 1.2.2 Structure of Classication Systems...... 10 1.2.3 Category Dynamics...... 12 1.2.4 Logics for Categorization...... 14 1.3 References...... 16

I Empirical Studies 27 2 Categorization and Strategic Deterrence 29 2.1 Motivation...... 30 2.2 Theory and Hypotheses...... 33 2.2.1 The Geometric Structure of Markets...... 33 2.2.2 Complexity as a Category Property...... 37 2.3 Methodology...... 42 2.3.1 Empirical Setting...... 42 2.3.2 Sample and Variables...... 45 2.3.3 Estimation Procedure...... 52 2.4 Results...... 54 2.5 Discussion...... 60 2.6 References...... 63 3 Prototypes, Goals, and Cross-Classication 73 3.1 Motivation...... 74 3.2 Theory and Hypotheses...... 76 3.2.1 The Feature Space...... 76 3.2.2 Atypicality and its Consequences...... 80

vii viii Contents

3.2.3 Suitability for Multiple Goals...... 84 3.2.4 The Eect of Spanning in Dierent Systems...... 87 3.3 Methodology...... 89 3.3.1 Empirical Setting...... 89 3.3.2 Sample and Variables...... 91 3.3.3 Estimation Procedure...... 98 3.4 Results...... 99 3.5 Discussion...... 106 3.6 References...... 110

II Logical Formalizations 119 4 Classication Systems as Concept Lattices 121 4.1 Motivation...... 122 4.2 Preliminaries...... 126 4.2.1 Perfect Lattices and Birkho’s Theorem...... 126 4.2.2 Duality with RS-Polarities...... 127 4.2.3 RS-Frames and Models...... 131 4.2.4 Standard Translation...... 133 4.3 Application to Organization Theory...... 136 4.3.1 Categorization via RS-Semantics...... 136 4.3.2 Category Emergence...... 143 4.4 Discussion...... 145 4.5 References...... 147 5 Toward an Epistemic Logic of Categories 153 5.1 Motivation...... 154 5.2 Theoretical Foundations...... 155 5.2.1 Cognitive Perspectives on Categorization...... 155 5.2.2 Extant Formal Approaches...... 157 5.3 Building an Epistemic-Logical Language...... 159 5.3.1 Basic Logic and Intended Meaning...... 159 5.3.2 Interpretation in Enriched Formal Contexts...... 160 5.3.3 Introducing Common Knowledge...... 162 5.3.4 Hybrid Expansions of the Basic Language...... 163 5.4 Soundness and Completeness...... 163 5.4.1 Denition of I -Compatible Relations...... 163 5.4.2 Interpretation of C ...... 165 5.4.3 Soundness...... 168 5.4.4 Completeness...... 170 Contents ix

5.5 Proposed Formalizations...... 176 5.6 Discussion...... 180 5.7 References...... 182 6 Conclusions 187 6.1 Summary of Findings...... 188 6.2 Implications for the Themes...... 191 6.3 Limitations and Further Research...... 194 6.4 References...... 196 List of Figures 203 List of Tables 205 Index of Names 207 Acknowledgements 211 Curriculum Vitæ 215 List of Publications 217

Summary

his dissertation addresses the question of how the information en- coded by category labels is interpreted by agents in a market for T the purpose of decision-making. To this end, we rst examine the inuence of categorization on economic and strategic outcomes with two empirical studies, and then use the insights provided by these studies to develop a formal theory of classication systems. Consistently with Formal Concept Analysis (FCA), this theory builds on the fundamental mathemati- cal notions of lattices and order, and it is thus uniquely suited to yield an ontological perspective on category representations. As a result, we are much better equipped to understand how categories serve as the “cognitive infrastructures” of markets and aect economic activity. Chapter1 oers a concise overview of the extant research on categorization in cognitive psy- chology, economic sociology, and organization theory. We build extensively on this diverse literature during the course of our exposition. The rst part of this thesis includes our empirical studies. In Chapter2, we synthesize insights from industrial economics, strategic management, and organizational ecology to examine the eects of product proliferation strategies. Conceptualizing the market as a multidimensional (Lancastrian) space of product features, we argue that product categories guide rms’ strategic decisions by partitioning the space into subsets or regions. Prod- uct proliferation occurs when a rm bids to occupy a product category at the expense of competitors by saturating the corresponding region of space. Consistently with game-theoretic models of product competition in dierentiated markets, we predict proliferation to have a negative eect on the likelihood of rival product introductions in the targeted category; however, we also predict that this eect is weaker if the region of space to which the category maps is more complex (i.e., heterogeneous in terms of product features). Our analysis of rms’ patterns of new product introduc- tions in the US recording industry supports these hypotheses; in addition, it suggests that product proliferation eectively deters competitors who can alter their positioning in feature space, but those who are constrained to particular positions remain virtually unaected. In Chapter3, we turn to consumers’ perspective and examine how the cat- egorization of products according to dierent classication systems aects

xi xii Summary the attribution of value. Focusing on the distinction between categories based on prototypes and categories based on goals, we argue that these category labels of these two kinds map to structurally dierent regions of the feature space. Valuation requires consumers to infer the location of products from their labels, but because type- and goal-based categories have dierent internal structures, they enable dierent sorts of inferences. Building on this argument, we theorize that under particular conditions spanning type-based categories has a U-shaped eect on consumers’ eval- uations, whereas spanning goal-based categories has a negative eect. At the same time, we predict that spanning goal-based categories can mod- erate the U-shaped eect of spanning type-based categories by enabling consumers to make more precise inferences from fewer type-based labels. Our analysis of product ratings on a popular music website oers empirical support for these hypotheses. In the second part of this thesis, we develop a formal theory of cate- gorization that accounts for the key aspects highlighted by our empirical studies. In Chapter4, we introduce an order-theoretic account of classi- cation systems as RS-frames. These are algebraic structures based on RS-polarities, which we enrich with additional relations to interpret modal- ities. Consistently with FCA, we propose to interpret an RS-polarity as a database consisting of a set of objects (such as products or organizations in a market), a set of features, and an incidence relation linking objects with their features. All the possible categories whereby the objects and the features may be grouped arise as the Galois-stable sets of this polarity, just like formal concepts in FCA. An agent’s perception of the objects and their features, which can be unique, incomplete, or even mistaken, is modeled by a relation giving rise to a normal modal operator that expresses an agent’s beliefs about a category’s intensional and extensional meaning. The xed points of the iterations of belief modalities are used to model categories whose meaning is shared as they arise from social interaction. In Chapter5, we clarify how the order-theoretic perspective on concepts enabled by FCA complements the geometric perspective allowed by the theory of conceptual spaces. In addition to introducing a sound and com- plete epistemic-logical language, we rene the framework presented in the previous chapter both technically and conceptually: Technically, because we free its semantics from the restrictions imposed by the RS-conditions and generalize to more natural Kripke-style frames. This makes our for- malism better suited to represent formal contexts (i.e., databases) as they occur in real-world domains. Conceptually, because we enhance our theory of classication systems as concept lattices and propose formalizations for Summary xiii some of the most important theoretical constructs in the categorization lit- erature, including typicality, similarity, contrast, and leniency. In particular, we elaborate our interpretation of the xed-point construction introduced before by tying it directly to the notion of typicality. Possible extensions are discussed, especially with regard to dynamic updates. Chapter6 summarizes the main ndings of this dissertation, elucidates their implications for organizational research, identies key areas for im- provement, and presents promising directions for future study. Special consideration is given to the possibility of unifying FCA and conceptual spaces using the framework of correspondence theory. We conclude with a general reection on the role of logic in the social sciences.

Samenvatting

it proefschrift gaat in op de vraag hoe de informatie van catego- rielabels wordt geïnterpreteerd door marktactoren ten behoeve van D hun besluitvorming. Hiertoe onderzoeken we eerst de invloed van categorisatie op economische en strategische uitkomsten met twee empi- rische studies en gebruiken vervolgens de inzichten van deze studies om een formele theorie van classicatiesystemen te ontwikkelen. Overeen- komstig de aanpak van Formeleconceptanalyse (FCA) bouwt deze theorie voort op de fundamentele wiskundige concepten van tralies en orde en is daarom uiterst geschikt voor een ontologisch perspectief op categorie- representaties. Als gevolg hiervan zijn we veel beter in staat om te begrijpen hoe categorieën dienen als de “cognitieve infrastructuren” van markten en de economische activiteit beïnvloeden. Hoofdstuk1 biedt een beknopt overzicht van het bestaande onderzoek naar categorisatie in cognitieve psychologie, economische sociologie, en organisatietheorie. We bouwen uitgebreid voort op deze diverse literatuur in deze studie. Het eerste deel van dit proefschrift bevat onze empirische studies. In Hoofdstuk2 synthetiseren we inzichten uit de industriële economie, strategisch management, en de organisatie-ecologie om de eecten van productproliferatie-strategieën te onderzoeken. Als we ons de markt kun- nen voorstellen als een multidimensionale (Lancastriaanse) ruimte van producteigenschappen, stellen we dat productcategorieën de strategische beslissingen van bedrijven sturen door de ruimte te verdelen in deelver- zamelingen of regio’s. Productproliferatie treedt op wanneer een bedrijf poogt een productcategorie te bezetten ten koste van concurrenten door het bijbehorende deel van de ruimte te verzadigen. Consistent met spel- theoretische modellen van productconcurrentie op gedierentieerde mark- ten voorspellen we dat proliferatie een negatief eect zal hebben op de waarschijnlijkheid van concurrerende productintroducties in de betroen categorie; we voorspellen echter ook dat dit eect zwakker is in de de- len van de ruimte waar de categorizering meer complex is (d.w.z., meer heterogeen in termen van productkenmerken). Onze analyse van de pa- tronen van nieuwe productintroducties van bedrijven in de Amerikaanse muziekindustrie ondersteunt deze hypothesen; bovendien suggereert het dat productproliferatie concurrenten, die hun positionering in kenmerk-

xv xvi Samenvatting ruimte kunnen wijzigen, eectief afschrikt, maar weinig eect heeft op degenen die hun posities niet of moeilijk kunnen veranderen. In Hoofdstuk3 komt het consumentenperspectief centraal te staan en onderzoeken we hoe de indeling van producten volgens verschillende classicatiesystemen van invloed is op de bepaling van waarde. Door te focussen op het onderscheid tussen categorieën op basis van prototypen en categorieën op basis van doelen, stellen we dat deze categorielabels van deze twee soorten verwijzen naar structureel verschillende regio’s van de kenmerk-ruimte. Om waarde te bepalen moet de consument de locatie van producten van hun labels aeiden, maar omdat categorieën op basis van typen en doelgroepen verschillende interne structuren hebben, maken ze verschillende soorten gevolgtrekkingen mogelijk. Voortbouwend op dit argument theoretiseren we dat onder bepaalde omstandigheden type gebaseerde categorieën het hebben van meerdere categorielabels tegelijk een U-vormig eect op consumentenevaluaties heeft, terwijl het hebben van meerdere op doel gebaseerde categorieën een negatief eect heeft. Tegelijkertijd voorspellen we dat het hebben van meerdere labels van doel gebaseerde categorieën het U-vormige eect van het hebben van meerdere labels van type-categorieën kan modereren door consumenten in staat te stellen meer precieze conclusies te trekken op basis van de gecombineerde type gebaseerde labels. Onze analyse van productbeoordelingen op een populaire muziekwebsite ondersteunt deze hypothesen. In het tweede deel van dit proefschrift ontwikkelen we een formele theorie van categorisering die rekening houdt met de belangrijkste aspec- ten die worden benadrukt door onze empirische studies. In Hoofdstuk4 introduceren we een op ordetheorie gebaseerde beschrijving van classi- catiesystemen als RS-frames. Dit zijn algebraïsche structuren op basis van RS-polariteiten, die we verrijken met extra relaties om modaliteiten te inter- preteren. In overeenstemming met FCA stellen we voor om een RS-polariteit te interpreteren als een database die bestaat uit een reeks objecten (zoals producten of organisaties op een markt), een reeks kenmerken en een incidentie-relatie die objecten koppelt aan hun kenmerken. Alle mogelijke categorieën waarin de objecten en de kenmerken gegroepeerd kunnen worden, ontstaan als de Galois-stabiele verzamelingen van deze polariteit, net zoals formele concepten in FCA. De perceptie van een agent van de objecten en hun kenmerken, die uniek, onvolledig of zelfs fout kan zijn, wordt gemodelleerd door een relatie die aanleiding geeft tot een normale modale operator die de opvattingen van een agent over de intensionale en extensionale betekenis van een categorie uitdrukt. De dekpunten van de iteraties van de modaliteiten van geloof worden gebruikt om categorieën Samenvatting xvii te modelleren waarvan de betekenis wordt gedeeld als ze voortkomen uit sociale interactie. In Hoofdstuk5 verduidelijken we hoe het ordetheoretische perspec- tief op begrippen die door FCA mogelijk worden gemaakt, een aanvulling vormt op het geometrische perspectief dat door de theorie van conceptuele ruimten is toegestaan. Naast het introduceren van een correct en volledig epistemisch-logische taal, verjnen we het raamwerk dat in het vorige hoofdstuk werd gepresenteerd, zowel technisch als conceptueel: technisch gezien, omdat we de semantiek bevrijden van de beperkingen opgelegd door de RS-condities en generaliseren naar meer natuurlijke Kripke-achtige frames. Dit maakt ons formalisme beter geschikt om formele contexten (d.w.z., databases) weer te geven zoals deze in de werkelijkheid voorko- men. Conceptueel, omdat we onze theorie van classicatiesystemen als tralies van concepten verbeteren en formalisaties voorstellen voor enkele van de belangrijkste theoretische constructies in de categoriseringslite- ratuur, waaronder typicaliteit, gelijkenis, contrast, en toegevendheid. In het bijzonder werken we onze interpretatie van de eerder geïntroduceerde dekpunt-constructie uit door deze direct aan de notie van typicaliteit te kop- pelen. Mogelijke uitbreidingen worden besproken, vooral met betrekking tot dynamische updates. Hoofdstuk6 vat de belangrijkste bevindingen van dit proefschrift sa- men, licht hun implicaties voor organisatorisch onderzoek toe, identiceert belangrijke verbeterpunten, en suggereert veelbelovende richtingen voor toekomstig onderzoek. Speciale aandacht wordt geschonken aan de mo- gelijkheid om FCA en conceptuele ruimtes te verenigen met behulp van het algemene kader van correspondentietheorie. We sluiten af met een algemene reectie op de rol van logica in de sociale wetenschappen.

Preface

ummarizing my research into a short pitch to be deployed in elevator conversations that never seem to happen is not a task that puts me S at ease. I believe many academics dread this exercise, but it feels all the more frustrating in my case because this dissertation is atypical by any standard. I nd myself at a loss whenever I am requested to assign it to a particular category or domain of scholarship. The irony is not lost on me that its topic is precisely categorization, and I do not know what the reader will make of my abilities as a researcher upon learning that I am unable to classify my own doctoral thesis, but such is the case. To give a sense of why this task is dicult for me, I would like to borrow an insightful metaphor Peter Gärdenfors used in his preface to Conceptual Spaces: The Geometry of Thought (MIT Press, Cambridge, MA, 2000, p. ix):

While writing the text [of my book], I felt like a centaur, standing on four legs and waving two hands. The four legs are supported by four disciplines: philosophy, computer science, psychology, and linguistics (and there is a tail of neuroscience). Since these disciplines pull in dierent directions—in particular when it comes to methodological questions—there is a considerable risk that my centaur has ended up in a four-legged split. A consequence of this split is that I will satisfy no one. Philoso- phers will complain that my arguments are weak; psychologists will point to a wealth of evidence about concept formation that I have not accounted for; linguistics will indict me for glossing over the intricacies of language in my analysis of semantics; and computer scientists will ridicule me for not developing algorithms of the various processes that I describe. I plead guilty to all four charges. My aim is to unify ideas from dierent disciplines into a general theory of representation. This is a work within cognitive science and not one in philosophy, psychology, linguistics, or computer science. My ambition here is to present a coherent research program that others will nd attractive and use as a basis for more detailed investigations.

xix xx Preface

I hope to be forgiven for my hubris in borrowing this incipit, but it is indeed useful to think of this thesis as a creature that stands on four legs and uses two hands to engage with the object of its inquiry. Unlike Gärdenfors’ centaur, however, this creature can be pictured as having not one but two heads. I nd this picture more faithful and agreeable than any category label, if a little grotesque. In my case, the four legs represent the elds of sociology, economics, psychology, and mathematics. The combination of these research domains is far from haphazard: some of them co-occur quite regularly in the study of social phenomena, as in the case of economic sociology, social psychology, behavioral economics, and game theory. The four of them together, however, make a relatively uncommon sight. I hope to present a good case that their match is worthwhile. The two hands represent two scientic methodologies commonly used in the disciplines above to answer their respective research questions, i.e., statistical modeling and logical formalization. Because of the relative independence with which these are deployed toward a common objective, this dissertation consists of two parts. Finally, the two heads represent two separate but (I argue) mutually compatible ways of reasoning about categories. One of them views categories geometrically as the regions of a conceptual space; the other views them order-theoretically as the elements of a concept lattice. Each perspective is powerful enough on its own to yield an insightful account of categorization, but together they oer an unparalled glimpse into its mathematical nature. In light of all the elements incorporated in this thesis, my reluctance to apply a category label is perhaps slightly more understandable. Above all, I believe it would be restrictive to consider its scope limited to the eld of business: markets are but one of many settings where categories exert their inuence, and many studies cited in the following chapters concern people’s behavior in dierent contexts. The aim of the Applied Logic Group at the Delft University of Technology, in whose womb my dissertation took shape, is to weave together insights from the social and the exact sciences in order to shed light on the epistemic foundations of social behavior. My thesis is a step toward this general purpose: although it addresses questions related to economic decision-making, its implications are much broader in principle. My aim is not to “explain” categorization in markets in a way that appears satisfactory to sociologists, economists, psychologists, and mathematicians, but to show a novel, integrative, and rigorous way to study categorization and other social phenomena. My main ambition is to introduce scholars in sociology and manage- ment to a formalism developed by Bernhard Ganter and Rudolf Wille during Preface xxi the Nineties and widely known as Formal Concept Analysis (FCA). This is an algebraic method for representing categories or concepts that proved suitable for a wide range of interdisciplinary applications, including eco- nomics (Formal Concept Analysis: Foundations and Applications, Springer, Berlin/Heidelberg, Germany, 2005). Given the current enthusiasm among social scientists for machine-learning techniques, it is surprising that FCA remains obscure. I believe this is partly because social scientists overwhelm- ingly favor verbal theorizing: in fact, it has been pointed out repeatedly to me that publishing on journals in sociology and management requires the scientist to be a little bit of a novelist. Storytelling is crucial to making one’s ndings interesting to a journal’s readership. I think this embellishment would dignify scientic research were not natural language so unbearably ambiguous. Given the current state of our knowledge about social behavior, it would be preferable to set aside the “storylines” to better spend one’s cognitive resources on the formalization of causal relationships. There is undoubtedly a learning curve to using formal language, but the potential benets for social science are immense. Much like a line of code allows a machine to perform in seconds operations that would take months to accomplish manually, logical formulas can express relationships, conditions, and deductive steps that would be cumbersome if not impossi- ble to render verbally with the same level of accuracy. Moreover, like good code, formal language has the advantage of being succinct. By using it in theory construction we stand to gain not only expressive power but also mathematical beauty. There are also clear benets in terms of generality: as a prime example, FCA is agnostic to the nature of the objects to be sorted into categories. This dissertation usually assumes they are products, but the same framework can be used to reason about categories of services, patents, organizations, or even rm strategies, architectures, and routines. Given the amount of (informal) theories put forward by sociologists and management scientists on a yearly basis—a large number of which turn out to be too vague, context-dependent, inconsistent, circular, redundant, or not theories at all—the adoption of symbolic language and formal rules of inference appears long overdue. Fortunately, I am neither the sole nor by any means the most qualied researcher to draw attention to this problem. The idea that social science is at present too ambiguous for its own good was the premise of a grant awarded in 2013 by the Netherlands Organization for Scientic Research to Alessandra Palmigiano (a logician) and Nachoem Wijnberg (a management scientist), under whose joint supervision this dissertation was produced. The same concern had been expressed before by three eminent sociologists, xxii Preface

Michael Hannan, László Pólos, and Glenn Carroll, in Logics of Organiza- tion Theory: Audiences, Codes, and Ecologies (Princeton University Press, Princeton, NJ, 2007). Even earlier, bridging logic and the social sciences was among the objectives of the Center for Computer Science in Organization and Management, an interdisciplinary research venture started at the Uni- versity of Amsterdam in 1990. This dissertation owes much to these early attempts to introduce logic to an audience overly accustomed to verbal theory and does not necessarily realize its limitations. As if to close a circle, toward the end of my Ph.D. I had the privilege of reading the draft of a new book by Hannan, Pólos, and their colleagues, provisionally titled Concepts and Categories: Foundations for Sociological Analysis. Like this thesis builds on FCA, Hannan et al. build on Gärdenfors’ spatial approach to study concept formation and inference, oering a math- ematical framework to scholars interested in social categories. I nd most of their arguments to be compatible with those presented here, especially with regard to the lattice-based interpretation of classication systems. Hannan et al. model these systems as (upper) semilattices whereas we opt for complete lattices, which essentially amounts to adding a lower bound. They dene a category’s prototype as the region of conceptual space where the likelihood that agents recognize an object as a member of the category (or an instance of the concept) is maximal; following the same intuition, we dene it as the set of features that every agent attributes to the concept and knows to be attributed to the concept by every other agent. Both frameworks allow us to connect the cognitive-psychological notion of typicality to the sociological construct of taken-for-grantedness. Hannan et al. dene this as the extent to which the agents assume to share the same meaning for a given concept and thus need not observe each other’s categorization decisions to know if they agree with them. Though we do not formalize taken-for-grantedness here, the epistemic language we build on top of FCA is capable of describing it in similar terms. In addition, there are important complementarities between the two approaches. Hannan et al. focus on the cognitive mechanisms whereby an agent comes to believe that an object has particular features and hence constitutes an instance of a certain concept. To make this problem tractable, they restrict the agent’s consideration to a given “root” concept, which sets the boundaries of the cognitive domain (e.g., rock music), and to a cohort of concepts that are immediately subordinate to the root and represent the possible alternatives (hard rock, post-rock, etc.). In this thesis, we do not ask why agents believe some objects to be instances of particular concepts, but we examine the consequences of these beliefs for their entire Preface xxiii classication system, including the concepts subordinate to the focal cohort and those superordinate to the root. Therefore, while Hannan et al. examine concept formation and inference, we focus on knowledge organization. I can think of no better way to understand the formal nature of categorization than to try and unify these two perspectives. At present, the connection between conceptual spaces and FCA is purely intuitive. Its formalization is not easy to achieve because the two frame- works draw on separate branches of mathematics—geometry and algebra, respectively—and thus dier in some fundamental respects. One of them is that, at least in Ganter and Wille’s formulation, FCA does not allow us to encode the gradedness of category memberships. In this view, an object is considered either within or outside the category and partiality is impossible. By contrast, conceptual spaces are well-suited to model partial member- ships by exploiting the metric nature of space and the real-valued distance of the category members from a category’s prototype. Being tools of dis- crete mathematics, lattice-based methods such as FCA are less naturally equipped to account for this information. Nonetheless, I expect this divide will be bridged in time as there are ongoing eorts to merge FCA and metric spaces into a generalized theory of representation (e.g., Dusko Pavlovic, Quantitative Concept Analysis). This goes to show not only that research on categories and concepts pushes the limits of scientic knowledge, but also that major breakthroughs in this direction require a concerted eort by scholars engaged on dierent fronts. In light of these diculties, this dissertation may only be considered a building block. The objective of developing a formally and empirically faith- ful account of categorization in social domains is far from being attained. But from our vantage point, the horizon looks promising.

Michele Piazzai Delft, December 2017

1 Introduction

1 2 Introduction

1 1.1. Categories: Why Bother? Describing categories as cognitive devices that simplify decision-making is fundamentally correct, but it is a massive understatement. Categorization is the basic epistemic recourse that allows human beings to entertain whatever hope of survival, both individually and as a species. It is the reason why someone who experiences pain as a result of mishandling a knife can expect to suer the same pain from mishandling any other knife, or indeed, any other sharp edge. It is also the reason why one is usually able to open a door without necessarily having seen this particular door before, or to drive a new car after having learned how to drive another. While the capacity to update one’s behavior on the basis of past experience is what makes an individual capable of driving the same car, opening the same door, or handling the same knife dierently in the future, it is the nontrivial ability to recognize some relevant patterns in the perceived structure of the world [1] that enables one to redeploy the accumulated knowledge to novel situations and tasks. The human brain is hardwired to detect correlations in the characteris- tics of the objects about which it is called to make decisions. It is readily apparent to us that the objects’ attributes do not coincide with equal probabilities: “some pairs, triples, etc., are quite probable, appearing in combination sometimes with one, sometimes another attribute; others are rare; others logically cannot or empirically do not occur” [2, p. 253]. Categorization is the process whereby these regularities are acknowledged and the objects that display them sorted into sets. The consequence of this process is that some of the perceivable distinctions between objects, which are deemed irrelevant to the decision at hand, are eectively suppressed. One advantage of imposing such lters on the information coming in from the senses and to the brain is that sets of objects can be assigned a de- fault behavioral response, such as preference or aversion, which saves the agent cognitive eort in future decisions involving similar objects. Another advantage is that semantic identiers, or category labels, can be attached to the sets so as to allow multiple agents to communicate eciently. It is thanks to category labels, for instance, that airplane passengers can easily determine whether a meal is consistent with their dietary restrictions upon reading that it is lactose-free, vegetarian, or halal, and that they generally know what do at border control after reading that one line is reserved to local nationals and one is for other passport holders. Given that the need for coordination underlies most human activities, it is hardly surprising that categories dominate many aspects of social life [3]. We congregate in public places, seek full-time or part-time employment, Categories: Why Bother? 3 visit art museums, and in some cases, read peer-reviewed research and 1 perform replicable studies to earn doctoral degrees from technical uni- versities. The inuence of categories is inescapable: anyone who happens to forget what constitutes formal attire and tries to defend a doctoral thesis wearing a pirate costume [4] will be promptly reminded of this fact. Categorization is, by extension, at the heart of the institutions that form the very fabric of society. Deciding what makes a government, a marriage, a sustainable technology, or—moving to the realm of economics—a market as opposed to an organization [5], is essentially a classication problem. The very execution of economic transactions rests upon the expectations induced by category labels, not only because these allow buyers and sell- ers to carry out a codied sequence of actions that results in a legitimate transfer of property, but also because words like banking, freelancing, and retail underpin the ow of capital [6] and labor [7,8]. Yet categories regulate economic activity well beyond the eecting of transactions. It is because of category-induced expectations, for instance, that customers do not consider a Chinese restaurant inauthentic because its sta refuses to perform acupuncture, but they very well might if the restaurant serves enchiladas [cf. 9]. Likewise, it is because of consumer segmentation (a synonym of categorization) that companies like Maserati, Bugatti, and Lamborghini do not advertise their products to teenagers. Such default expectations are not necessarily correct: some teenagers can aord to buy luxury cars, but the generalization is nonetheless useful to optimize a rm’s use of resources. The existence of strategic groups, i.e., categories of rms that maintain a similar competitive positioning [10], is the reason why Maserati executives are likely to be well aware of which cars are sold by Lamborghini and Bugatti, where, when, and at what price, but may be relatively indierent to what companies like Dacia and Suzuki are up to [11, 12]. On the demand side, categories are the reason why fans of heavy metal may assume that other fans of heavy metal are worthy of social interaction whereas fans of techno are not [cf. 13]. It is in the attempt to exploit categorization processes that, as Bourdieu [14] reminds us, some people buy opera tickets even though they have no taste for the subject, but simply because they wish to be seen sitting beside people who do. Explaining how categories aect the functioning of markets is currently among the foremost objectives for students of organization theory [15]. Two streams of literature stand out for their signicant contributions to this research agenda. One originates from social psychology [16], and primarily concerns itself with the mechanisms whereby rms and their constituents use category labels to dene their own identity [17, 18]. The other is rooted 4 Introduction

1 in economic sociology and revolves around the notion of “categorical imper- ative.” In Zuckerman’s formulation [19], this refers to the pressure exerted on rms by their external stakeholders, like analysts, critics, investors, and customers, who expect conformity to their category denitions. This latter perspective is widely represented in the work of organizational ecologists [20], who seek to explain rms’ success in the market through the opposing forces of competition, which promotes dierentiation, and legitimation, which rewards consistency with established schemata [21]. As ecologists tend to place the locus of identity outside the organization itself [22], they prioritize the viewpoint of external audiences who control the material and symbolic resources rms must acquire in order to survive [e.g., 23]. The two perspectives can be regarded as complementary: indeed, many spectacular failures of products and organizations can be explained by the mismatch between what people on the inside believe they are doing and what those on the outside perceive [24]. Given specic research questions and empirical contexts, however, it can be reasonable to privilege one perspective over the other. For example, in the markets for creative goods it is often the case that producers’ claims to particular category memberships are irrelevant to consumers’ decisions, while the judgment of critics [25] and other consumers [13] is supremely important. To illustrate this point, most of us have favorite musicians, painters, or lm directors, and we would be able to allocate them to particular genres if requested to do so, but we are oblivious to how most of these artists would dene their own work. This is not necessarily the case in other contexts: for example, in high-technology industries, the information producers convey about themselves and their oerings is extremely relevant to investors [26]. Even in these settings the expectations of external audiences tend to be important—e.g., because investors are likely to consider what kind of patents the rms apply for [27]—but organizations are granted more leeway [28]. The potential to explain how products are aected by consumers’ de- fault expectations [29], how these expectations can be exploited to rms’ benet [30], and how rm managers can sway them strategically [31], is the reason why the study of categories has practical implications for business. Understanding the consequences of categorization helps researchers ad- dress the question of why some innovations succeed [32, 33] whereas others are misunderstood [34, 35]. At a more theoretical level, studying categories and their eects can shed fundamental insights on the mechanisms that drive the evolution of industries [6], such as audience members’ interaction [36, 37]. Further, examining competitive processes through the lens of cate- gorization can illuminate why new markets and submarkets emerge [38], Categories: Why Bother? 5 and why existing ones split, merge, dissolve, or become obsolete [39]. It can 1 also be useful to clarify why some clusters of products or services [e.g. 40] fail to meet the requirements that mark their transition into full-edged competitive arenas: in this sense, research on categories can help scholars explain the puzzle of “markets that weren’t” [41]. Such practical and theoretical considerations motivate the research presented in this thesis. We aim to attain a more complete understanding of the role played by categories in the ordering of markets by clarifying how the information encoded by category labels is decoded by agents for the purpose of decision-making. To this end, we proer two empirical studies and a formal theory. More specically, in Chapter2 we empirically analyze how the properties of categories aect the outcomes of rms’ competitive strategies. In Chapter3, we consider how consumers derive information from category labels and how this information is combined in the case of multiple category memberships. Switching to formal methods, in Chapter4 we argue that classication systems can be mathematically represented as RS-frames, and that agents’ consensus about the meaning of category labels is simultaneously determined by factual information, subjective perception, and social interaction. In Chapter5, we rene this order-theoretic representation by generalizing to Kripke-style frames, and present a sound and complete epistemic-logical language that can be used to accurately describe category labels and their meaning. Finally, in Chapter 6, we summarize our ndings, reect on their implications for organizational research, and suggest directions for further study. Attesting to the breadth of extant literature on categorization, our argu- ments build on previous research in a variety of domains. Figure 1.1 oers an indication of the interdisciplinary scope of this thesis: to produce this chart, the references of each chapter were assigned to particular subjects based on Clarivate’s 2016 Journal Citation Reports (JCR) classication.1 As the gure shows, there is always a signicant proportion of references from the core disciplines of sociology, management, and business, but Chapters 2 and3 draw more extensively from economic theory, Chapters3, and5 build more on cognitive psychology, and Chapters4 and5 are more strongly oriented toward mathematics and logic. Many additional disciplines are lumped into the other category, including statistics, linguistics, biology, philosophy, musicology, acoustics, engineering, computer science, and

1References belonging to multiple JCR subjects are listed once per subject. References for which no JCR subjects were available, such as unclassied journals, books, book chapters, conference proceedings, online sources, and unpublished manuscripts, were manually assigned to particular JCR subjects based on their keywords. 6 Introduction

1

Figure 1.1: Distribution of references by subject category

information science, among others. Notwithstanding this bibliographic variation, four overarching themes can be taken to represent the common threads that bind this dissertation together. These broadly relate to:

1. the power of categories to encode the features of objects in a partic- ular domain, which enables cognitively limited agents to overcome information asymmetries and make rational decisions [42, 43];

2. the arrangement of categories into partially ordered structures, or classication systems, which hinge on specic rules for categorization [44] and normally consist of multiple levels of abstraction [45];

3. the inherently dynamic nature of categories [46], classication sys- tems [6, 47], and category properties [48]; and

4. the amenability of categories to various kinds of formal representa- tion, e.g., logical [49, 50], geometric [51], and set-theoretic [52]. Main Themes 7

None of these themes applies exclusively to categorization in markets. In 1 fact, the very same notions are relevant to categorization in other domains, such as natural language semantics [53–55]. Precisely because of their generality, they are useful to highlight that categorization is not a context- specic phenomenon to be explained by certain theories in psychology, others in linguistics, and others yet in sociology or management, but a deep-seated cognitive mechanism that permeates every aspect of social behavior, including economic processes, and is governed by general rules whose mark is discernible regardless of context [2]. Before providing an overview of the four themes and outlining their relations to the individual chapters, it is worth clarifying my personal con- tribution to the research presented in Chapters2–5, which is co-authored with other members or collaborators of the Applied Logic Group at the Delft University of Technology. Chapters2 and3 are the extended versions of research papers where I am formally the lead author. Chapters4 and5 are the outcome of joint work by a group of co-authors in which there is no formal lead. My role in this team has been to coordinate the inputs from mathematics and social science, bridging the two worlds and articulating their connections. In particular, I directed the logicians’ formalizations so as to keep our theory of classication systems true to extant literature on categorization in sociology, management, and cognitive psychology.

1.2. Main Themes 1.2.1. Cognitive Economy In describing the contents of a ctitious Chinese encyclopedia fantastically titled Celestial Emporium of Benevolent Knowledge, Jorge Luis Borges [56, p. 103] reports a very peculiar and, by now, very well-known taxonomy:

Animals are divided into (a) those that belong to the Emperor, (b) embalmed ones, (c) those that are trained, (d) suckling pigs, (e) mermaids, (f) fabulous ones, (g) stray dogs, (h) those that are included in this classication, (i) those that tremble as if they were mad, (j) innumerable ones, (k) those drawn with a very ne camel’s hair brush, (j) others, (m) those that have just broken a ower vase, (n) those that resemble ies from a distance.

In the author’s intention, the list serves to show that any attempt at cate- gorizing the objects that populate the world is bound to be arbitrary and conjectural, because the world, for all its complexity and the cognitive limitations to which humans are subject, is unfathomable to mankind. The 8 Introduction

1 passage was later popularized by Foucault [57], who pinpointed exactly why it sounds so outlandish: not because of the categories themselves, which are clearly understandable even though some are imaginary, but because of the accompanying alphabetical sequence, which gives the illusion of coherence. The French author observed that the wondrous quality of the taxonomy resides chiey in the narrowness of the interstitial space that its inventor must have perceived between stray dogs and fabulous animals, suckling pigs and mermaids, to think that this list could help someone make sense of the animal kingdom. The classication system presented in the Celestial Emporium has been widely discussed by sociologists and literary critics, and almost as if it were real, it has been invoked with some regularity by the critics of structuralism to support their claim that Western and non-Western cultures tend to organize the world using radically dierent principles [58]. This proposition was found questionable by cognitive psychologists, particularly Eleanor Rosch and her colleagues [45, 59–61], who devoted substantial eort to uncovering scientic evidence of the universal, cross-cultural rules that guide the formation of categories. In her seminal writings, Rosch thusly commented the Celestial Emporium: “Conceptually, the most interesting aspect of this classication system is that it does not exist. Certain types of categorizations may appear in the imagination of poets, but they are never found in the practical or linguistic classes of organisms or of man-made objects used by any of the cultures of the world” [2, p. 27]. In Rosch’s view, two principles lie at the core of any categorization. The rst principle is that, as mentioned before, the objects that consti- tute a cognitive domain display a perceivable correlational structure. For example, given three attributes of products such as sweetness, sourness, and wireless connectivity, it is empirically the case that the former two at- tributes co-occur with each other more often than either of them co-occurs with the third. This implies that the categories used by dierent agents to sort products and organizations are not accidental: they are (idiosyn- cratic) constructions based on objective patterns. The second principle, termed cognitive economy, holds that the purpose of any categorization is to yield an accurate but parsimonious representation of the cognitive do- main. Therefore, the categories considered meaningful by decision-makers are merely a subset of all the possible categories that could be used to sort objects: namely, they are the ones that most closely capture the objects’ perceived correlational structure [52]. Clearly, the animal categories of the Celestial Emporium do not reect this optimization. They strike us as absurd precisely because they are not economical. Main Themes 9

In the past few years, researchers in strategy and organization theory 1 increasingly emphasized the subjectiveness of market categories [12, 62, 63]. There is merit to these observations, as the process of tting complex en- tities into neat cognitive schemata can give rise to diculties that must be arbitrarily resolved [64]. Organizations are certainly complex enough to make these inconsistencies conspicuous in some cases [e.g., 19, Figure 4]. However, the subjective component of categorization is sometimes exag- gerated to the point that one is left to wonder how can markets function at all: for example, strategy scholars recently proposed the notion of “innite dimensionality” [12, p. 66], arguing that, because innite distinctions can exist between objects [65], there is an innite number of ways in which the objects can be sorted by dierent agents. Leaving aside the question of whether the perceivable distinctions between objects are ever truly innite,2 it is reasonable to presume that agents’ need for coordination would encourage them to seek common ground through social interaction [13, 37]. In the absence of some minimal level of consensus about cate- gory denitions, it is hard to envision how category labels can be fruitfully used in communication. Like the Celestial Emporium taxonomy, the notion that innite dimensions can be used to distinguish two objects sounds improbable because it is uneconomical.

Relation to the chapters. Cognitive economy appears in Chapter2 as we discuss how rms view the market as a partitioned landscape in lieu of a continuous surface. The regions that comprise this landscape can be considered subsets of a feature space, inasmuch as they map to particular product characteristics [67], as well as subsets of a resource space, as they map to particular consumer preferences [68, 69]. In both cases, their bound- aries denote variations relevant to rms’ strategic decisions. In Chapter 3, cognitive economy is addressed with regard to consumers’ evaluations: here, we study how category labels underpin the attribution of worth and examine how audience members economically combine information from multiple categories [70–73]. In Chapter4, the theme rises to prominence as we consider how meaningful categories stand out from the much broader set of possible categories, depending on the agents’ perceptions of objects and features as well as social interaction. In Chapter5, the theme is further exemplied by our eorts to generalize our formal theory by relaxing some of its technical restrictions and to build an epistemic-logical language that accommodates dierent perspectives on categorization.

2Psychological research suggests they are not [66]. In fact, they tend to be quite limited because humans have nite cognitive resources at their disposal. 10 Introduction

1 1.2.2. Structure of Classification Systems Beyond the reason pointed out by Foucault [57], there are two factors that further contribute to the awkwardness of the Celestial Emporium. One of these is that the categories it includes are likely to overlap though the classication system does not seem to be hierarchically ordered [74]. In other words, it appears devoid of vertical structure. On the contrary, the alphabetical sequence reinforces the impression that relatively generic and relatively specic categories belong to exactly the same level of abstraction. For example, animals that belong to the Emperor is quite specic, but animals that are included in this classification is exceedingly broad, espe- cially because the category others extends the system so as to encompass all animals. Real taxonomies [cf. 45] are not normally structured in this confusing fashion: instead, they consist of multiple levels of abstraction and hence distinguish subordinate categories from superordinate ones [75]. The relations between categories located at dierent levels of the hierarchy are governed by set-theoretic inclusion [52, 76]: because coherently apply- ing this rule to organize knowledge is one of the signs of mature reasoning [77], listing subordinate and superordinate categories in a way that ignores their obvious vertical relations is bizarre. The hierarchical arrangement of classication systems is extremely relevant to the study of categories in organizational contexts. Many of the systems commonly examined by organization theorists, such as industry codes [19], art genres [78], and patent classes [35], have an explicit vertical ordering. Much like cognitive psychologists, organization scholars face the question of how to select the most important level of abstraction for the decisions that agents (managers, analysts, investors, consumers) are required to make in a given situation. In most cases, the categories that are subordinate and superordinate to whatever level of abstraction is chosen by researchers are empirically disregarded. For example, many studies of category spanning in the creative industries use genres as the level of analysis [21, 23, 79, 80] but do not take into account that these categories can be genealogically linked [81].3 As a result, researchers tend to overlook the fact that agents in a market can derive a great deal of information from vertical structure [27]. Arguably, strategy scholars have been much more

3In recent years, it has become standard practice in empirical studies to account for the similarity of spanned categories by way of co-occurrences [e.g., 82]. This partly addresses the problem because hierarchically linked categories tend to co-occur more often; however, some categories rarely co-occur even though they are clearly related, as in the case of West Coast rap and East Coast rap music. The inclusion of two or more categories’ into the same superordinate category is thus imperfectly captured by similarity. Main Themes 11 mindful of the vertical dimension of classication systems: it has been 1 acknowledged since the work of Richard Rumelt [83, 84], for instance, that rms diversifying across related vs. unrelated market segments eectively engage in two separate strategies. The nal reason why the Celestial Emporium “taxonomy” appears inco- herent is that its categories have radically dierent internal (as opposed to external or vertical) structures. Some of them, like suckling pigs and mermaids, hinge on the same sorting rules that drive the distinction of real biological taxa, that is, family resemblance [61]; others, like embalmed animals, emphasize features that have little to do with biology but are consistent with a causal model of categorization [85, 86]; others yet, like an- imals that have just broken a flower vase, seem to be constructed entirely ad hoc [87]. While it is relatively common for people to use categories with dierent internal structures to organize cognitive domains [88], these cate- gories are rarely perceived to be part of a single classication system, and indeed, the act of mentally switching from one system to another tends to be evident in subjects engaged in experimental assignments [89]. Objects that belong to multiple categories within the same system can be confusing to the audience because multiple labels give rise to uncertainty when they encode conicting information [43, 90], but category labels that belong to dierent systems do not necessarily engender confusion [72, 88, 91]. For ex- ample, it makes perfect sense for a car to be considered a sedan according to a system based on family resemblance, a car with a stick shift according to a causal model, and a car reliable for long-distance travel according to an ad hoc or goal-based perspective [cf. 92]. Agents in a market can take advantage of dierent categorizing rules to more fully encode the perceived correlational structure of products and rms [62]. In some cases, the objects’ category memberships in one classication system allow agents to disambiguate the inconsistencies arising from multiple memberships in another [63]. Although cognitive psychologists have long acknowledged that considering family resemblance to be the sole relevant criterion for categorization is grossly inadequate [93], organization scholars have largely privileged this perspective over the past two decades [15]. This is partly because classication systems that are highly institutionalized and thus “sociologically real” [22, p. 478] tend to be grounded on this rationale. Empirical research has only recently begun to address the limitations of this perspective [63, 94, 95]. This dissertation aims to contribute to such an “ontological turn” [96] by investigating how agents in a market leverage the dierent external or internal structures of categories in order to make better decisions. 12 Introduction

1 Relation to the chapters. In Chapter2, we focus on a classication system based on family resemblance and show that its verticality has direct implica- tions for the eects of rms’ competitive strategies. In particular, we use the hierarchical relations between superordinate and subordinate categories as an indicator of the complexity [cf. 97] of the regions of feature space to which the superordinate categories correspond. In Chapter3, we address the consequences of cross-classication [98] by examining the eects of category spanning when the categories have similar or dierent internal structures. Focusing on the distinction between categories based on family resemblance [61] and categories based on consumers’ goals [91, 92], we argue that the two classication systems radically dier in terms of the information they allow evaluators to derive [44]. In Chapter4, we present the order-theoretic foundations of a formal theory of classication systems that is capable of accommodating both kinds of internal structure, as well as categories generated by a causal model [85, 86]. In Chapter5, we rene this mathematical framework and demonstrate its versatility by formalizing the psychological notion of typicality.

1.2.3. Category Dynamics Another recent trend in organizational research concerns the study of how classication systems mutate over time [15]. While marketing scholars have long acknowledged that sociocognitive dynamics of categorization fuel the evolution of industries [36], organization theorists only lately begun to ask programmatic questions about how new market categories emerge [38], acquire legitimacy [99–101], become associated with particular evaluation criteria [102, 103], and consequently endure as valuable tools for sensemak- ing [46]. Of even more recent origin is the systematic study of processes whereby existing categories lose their explanatory power [104]—and some- times their legitimacy [e.g., 105]—which can forewarn of their dissolution [39, 96]. These eorts have been instrumental in pushing organizational research beyond the naive conception of categories as static schemata, which is especially simplistic in the case of innovation-driven industries. As a result of this shift, organization theorists begun to examine market categories as having a lifecycle of their own [106]. Adopting categories as the unit of analysis, as opposed to the objects they encompass, is arguably necessary to explain how the information encoded by category labels is interpreted by agents for the purpose of decision-making [107]. It can also be useful to identify which changes in a particular environment can foretell changes in the informational content of the labels [108]. Organizational ecologists have been especially prolic in Main Themes 13 their pursuit of this research agenda, as many of their studies were geared 1 toward the reduction of categories to fundamental properties that correlate with meanings but can be measured empirically and generalized across contexts [109]. Some of these properties have become well-established theoretical constructs in the organizational literature [15], as in the case of category leniency [26] and contrast [110, 111], which relate to the exibility of category boundaries, or category similarity [7], which rather refers to the distance between category labels in a metric space where the axes or dimensions represent encoded information [82]. If changes in the meaning of categories are precursory to changes in classication systems, and thus deserving of researchers’ attention, longi- tudinal variation in the properties of categories is all the more important to consider because it can anticipate change in their meanings. Empiri- cal studies that point to the relevance of such cascading eects already exists in the organizational literature [48, 111–113], but cognate disciplines appear to have lagged behind. In strategy, for instance, empirical studies concerned with category dynamics [e.g., 114] remain few and far between, and they are scarcer yet in innovation management, even though the so- ciocognitive aspects of categorization are known to be crucial determinants of products’ convergence around a dominant design [115, 116]. Even more surprisingly, category dynamics are conspicuously absent in industrial eco- nomics, although previous research in this eld resorted to categories to better analyze competitive interactions [117]. This dissertation advocates for more thorough consideration of category dynamics in the answering of questions outside the traditional scope of organization theory. In our studies, we devote special attention to their time-variant properties and to the sociocognitive mechanisms that trigger their emergence.

Relation to the chapters. Category dynamics gure prominently in Chap- ter2 as we analyze how a distinctly changeable aspect of product categories, namely their complexity [cf. 97], aects the consequences of product pro- liferation. Building on extant research in strategy, industrial economics, and organization theory, we relate the complexity of categories to the com- plexity of the underlying region of the feature space and suggest possible connections with the study of complexity in organizational evolution [118]. The theme moves to the background in Chapter3, where category properties like similarity and contrast appear as controls in our analysis of category spanning; however, it returns to the forefront in Chapter4 as we develop a formal account of category emergence through social interaction. We argue that categories arise en masse from a set of objects with certain 14 Introduction

1 features and a set of agents who interact with each other but may have an idiosyncratic perception of objects and features. The theme remains dominant in Chapter5, where we use our mathematical machinery to for- malize well-known category properties, including contrast, leniency, and similarity. The time-variant nature of these properties is accommodated by tying their denitions to objects and features as well as the agents’ subjective perceptions thereof, all of which are subject to change.

1.2.4. Logics for Categorization The idea that is perhaps most vehemently emphasized throughout this dissertation is that category representations are formally tractable. As a matter of fact, half of the research presented in the following chapters is aimed at developing a suitable symbolic language. Although the use of formal theory is not new to sociological research [119], nor to the study of categories in particular [20], there is much to be gained from a more compre- hensive deployment of formal methods as the eld continues to grow and the scope of existing theories expands. Among the methods that have been successfully deployed in organization theory [120], formal logic occupies a prominent spot [20, 121–127]. One of the reasons for this primacy is that logical formalization allows organization scholars to determine whether views that are commonly understood as competing can actually be recon- ciled [128]. In addition to the obvious potential for theory advancement, this has the desirable eect of shattering the barriers between dierent “camps” and preventing the eld’s compartmentalization. Another reason is that formal logic strips organizational theories of the ambiguity inherent to natural language and makes them accessible to scrutiny, evaluation, and repair [129, 130]. The quality and explanatory thrust of the theories can be much improved by this exercise. Scholars who promote the use of logical methods in organizational research tend to emphasize the advantages they oer in terms of repli- cability, comparability, and generalizability of insights. Indeed, as noted by Hannan [131, p. 147] it can be “a humbling experience” to realize how much of one’s arguments depend on tacit assumptions once they are even partly formalized. Organizational research can immensely benet from the meticulousness imposed by logic, as researchers’ quest for precision in theory construction is fundamentally hampered by the equivocacies of informal language. Embracing methods of inquiry that forcefully rid causal statements of their vagueness can help propelling scholarship forward. “It can be hard to abide by whatever these formal, logical, or methodological standards demand. Yet in practice, they are what keep the theory under Main Themes 15 control. Perhaps counterintuitively, by establishing limits they are also 1 what allow for the creative development of new ideas” [132, p. 119]. It is important to remark, however, that the perks of cross-fertilization between logic and organization theory are not unidirectional. Logicians can also benet from using the messy world of organizations as a testing ground for their formalisms: in fact, the peculiar challenges posed by the inexactness of social behavior [126, 133, 134] can lead to ndings of mathematical import and pave the way to new, unconventional applications. Two formal approaches are considered especially germane to the study of categories and are hence discussed in this thesis. The rst one builds on Gärdenfors’ notion of conceptual spaces [51, 55], a method for representing cognitive domains based on geometric distance. In recent years, sociol- ogists have begun to explore the potential of conceptual spaces to yield an intuitive depiction of markets [67, 82, 108, 135–137], and there is little doubt that this line of research will come to full bloom in the future. The second framework builds on Birkho’s representation theorem [138] and is widely known in mathematics [e.g., 74] as Formal Concept Analysis. First developed by Ganter and Wille [139, 140], this method draws from order theory and characterizes categories as nodes of a concept lattice. Although this approach is uniquely suited to describe the ontological nature of clas- sication systems [96], it is still relatively foreign to sociologists. There is ground to believe that this application is fruitful, however, because such an order-theoretic interpretation of categories can explain aspects of agents’ reasoning that are less naturally captured by geometric arguments. One of the objectives of this dissertation is to stimulate appreciation for this valuable method in organizational research, not as an alternative but rather a complement to conceptual spaces.

Relation to the chapters. In the empirical part of this thesis, the theme of formal representation is expressed by numerous (explicit and implicit) references to conceptual spaces. In particular, Chapter2 explores the link between this framework and other geometric models familiar to sociol- ogists [68] and industrial economists [141]. Chapter3 relies on the same geometric framework to clarify how agents derive and combine information from categories that belong to dierent classication systems. Though the arguments presented in these two chapters are not (yet) formalized, they lend themselves well to logical reconstruction. The non-empirical part of this thesis is devoted to the application of Formal Concept Analysis to the research on categories in the social sciences. More specically, in Chapter 4 we dene the lattice-based structures underpinning our formalism and 16 Introduction

1 introduce their associated RS-semantics [142], In Chapter5, we generalize to more natural Kripke-style semantics and presents a sound and complete epistemic-logical language to be used in formalizations.

1.3. References [1] W. R. Garner, The Processing of Information and Structure (Lawrence Erlbaum Associates, Hillsdale, NJ, 1974). [2] E. H. Rosch, Principles of categorization, in Cognition and Catego- rization, edited by E. H. Rosch and B. B. Lloyd (Lawrence Erlbaum Associates, Hillsdale, NJ, 1978) pp. 27–48. [3] G. C. Bowker and S. L. Star, Sorting Things Out: Classication and its Consequences (MIT Press, Cambridge, MA, 1999). [4] NL Times, TU Delft bans Pastafarian student from attending doctorate ceremony in pirate suit, https://nltimes.nl/2017/11/17/tu-delft-bans- pastafarian-student-attending-doctorate-ceremony-pirate-suit (2017), accessed 20 November 2017. [5] H. A. Simon, Organizations and markets, Journal of Economic Perspec- tives 5, 25 (1991). [6] M. Ruef and K. Patterson, Credit and classication: The impact of industry boundaries in nineteenth-century America, Administrative Science Quarterly 54, 486 (2009). [7] M. D. Leung, Dilettante or Renaissance person? How the order of job experiences aects hiring in an external labor market, American Sociological Review 79, 136 (2014). [8] J. Merluzzi and D. J. Phillips, The specialist discount: Negative returns for MBAs with focused proles in investment banking, Administrative Science Quarterly 61, 87 (2016). [9] B. Kovács, G. R. Carroll, and D. W. Lehman, Authenticity and consumer value ratings: Empirical tests from the restaurant domain, Organization Science 25, 458 (2013). [10] M. Peteraf and M. Shanley, Getting to know you: A theory of strategic group identity, Strategic Management Journal 18, 165 (1997). [11] J. F. Porac and H. Thomas, Taxonomic mental models in competitor denition, Academy of Management Review 15, 224 (1990). [12] G. Cattani, J. F. Porac, and H. Thomas, Categories and competition, Strategic Management Journal 38, 64 (2017). [13] N. P. Mark, Birds of a feather sing together, Social Forces 77, 453 (1998). [14] P. Bourdieu, Distinction: A Social Critique of the Judgment of Taste (Harvard University Press, Cambridge, MA, 1984). References 17

[15] J.-P. Vergne and T. Wry, Categorizing categorization research: Review, 1 integration, and future directions, Journal of Management Studies 51, 56 (2014). [16] M. A. Hogg and D. Abrams, Social Identications: A Social Psychology of Intergroup Relations and Group Processes (Routledge, New York, NY, 1988). [17] M. A. Glynn, When cymbals become symbols: Conict over organiza- tional identity within a symphony orchestra, Organization Science 11, 285 (2000). [18] M. A. Hogg and D. I. Terry, Social identity and self-categorization pro- cesses in organizational contexts, Academy of Management Review 25, 121 (2000). [19] E. W. Zuckerman, The categorical imperative: Securities analysts and the illegitimacy discount, American Journal of Sociology 104, 1398 (1999). [20] M. T. Hannan, L. Pólos, and G. R. Carroll, Logics of Organization Theory: Audiences, Codes, and Ecologies (Princeton University Press, Princeton, NJ, 2007). [21] G. Hsu, G. Negro, and F. Perretti, Hybrids in Hollywood: A study of the production and performance of genre spanning lms, Industrial and Corporate Change 21, 1427 (2012). [22] G. Hsu and M. T. Hannan, Identities, genres, and organizational forms, Organization Science 16, 474 (2005). [23] G. Hsu, Jacks of all trades and masters of none: Audiences’ reactions to spanning genres in feature lm production, Administrative Science Quarterly 51, 420 (2006). [24] B. Kuijken, M. A. A. M. Leenders, N. M. Wijnberg, and G. Gemser, The producer-consumer classication gap and its eects on music festival success, European Journal of Marketing 50, 1726 (2016). [25] G. Hsu, P. W. Roberts, and A. Swaminathan, Evaluative schemas and the mediating role of critics, Organization Science 23, 83 (2012). [26] E. G. Pontikes, Two sides of the same coin: How ambiguous classica- tion aects multiple audiences’ evaluations, Administrative Science Quarterly 57, 81 (2012). [27] T. Wry and M. Lounsbury, Contextualizing the categorical imperative: Category linkages, technology focus, and resource acquisition in nan- otechnology entrepreneurship, Journal of Business Venturing 28, 117 (2013). [28] N. Granqvist, S. Grodal, and J. L. Woolley, Hedging your bets: Ex- plaining executives’ market labeling strategies in nanotechnology, 18 Introduction

1 Organization Science 24, 395 (2013). [29] G. Hsu, M. T. Hannan, and Ö. Koçak, Multiple category memberships in markets: An integrative theory and two empirical tests, American Sociological Review 74, 150 (2009). [30] G. Hsu and S. Grodal, Category taken-for-grantedness as a strate- gic opportunity: The case of light cigarettes, 1964 to 1993, American Sociological Review 80, 28 (2015). [31] E. G. Pontikes and R. Kim, Strategic categorization, in From Categories to Categorization: Studies in Sociology, Organizations, and Strategy at the Crossroads, Research in the Sociology of Organizations, Vol. 51, edited by R. Durand, N. Granqvist, and A. Tyllström (Emerald Group, Bingley, United Kingdom, 2017) pp. 71–111. [32] S. V. Sgourev and N. Althuizen, “Notable” or “not able”: When are acts of inconsistency rewarded? American Sociological Review 79, 282 (2014). [33] B. Kuijken, G. Gemser, and N. M. Wijnberg, Categorization and will- ingness to pay for new products: The role of category cues as value anchors, Journal of Product Innovation Management, 34, 757 (2017). [34] N. M. Wijnberg, Classication systems and selection systems: The risks of radical innovation and category spanning, Scandinavian Journal of Management 27, 297 (2011). [35] J.-P. Ferguson and G. Carnabuci, Risky recombinations: Institutional gatekeeping in the innovation process, Organization Science 28, 133 (2017). [36] J. A. Rosa, J. F. Porac, J. Runser-Spanjol, and M. S. Saxon, Sociocognitive dynamics in a product market, Journal of Marketing 63, 64 (1999). [37] Ö. Koçak, M. T. Hannan, and G. Hsu, Emergence of market orders: Audience interaction and vanguard inuence, Organization Studies 35, 765 (2014). [38] R. Durand and M. Khaire, Where do market categories come from and how? Distinguishing category creation from category emergence, Journal of Management 43, 87 (2017). [39] M. T. Kennedy, Y.-C. Lo, and M. Lounsbury, Category currency: The changing value of conformity as a function of ongoing meaning con- struction, in Categories in Markets: Origins and Evolution, Research in the Sociology of Organizations, Vol. 31, edited by G. Hsu, G. Negro, and Ö. Koçak (Emerald Group, Bingley, United Kingdom, 2010) pp. 369–397. [40] D. G. McKendrick and G. R. Carroll, On the genesis of organizational forms: Evidence from the market for disk arrays, Organization Science 12, 661 (2001). References 19

[41] C. Navis, G. Fisher, R. Raaelli, M. A. Glynn, and L. Watkiss, The market 1 that wasn’t: The non-emergence of the online grocery category, in Proceedings of the New Frontiers in Management and Organizational Cognition Conference, https://eprints.maynoothuniversity.ie/4055/ (2012), accessed April 6, 2017. [42] J. R. Anderson, The adaptive nature of human categorization, Psycho- logical Review 98, 409 (1991). [43] E. Konovalova and G. Le Mens, Feature inference with uncertain cat- egorization: Re-assessing Anderson’s rational model, Psychonomic Bulletin & Review, in press (2017). [44] L. W. Barsalou, Ideals, central tendency, and frequency of instanti- ation as determinants of graded structure in categories, Journal of Experimental Psychology: Learning, Memory, and Cognition 11, 629 (1985). [45] E. H. Rosch, C. B. Mervis, W. D. Gray, D. M. Johnson, and P. Boyes-Braem, Basic objects in natural categories, Cognitive Psychology 8, 382 (1976). [46] E. Y. Rhee, J. Y. Lo, M. T. Kennedy, and P. C. Fiss, Things that last? Category creation, imprinting, and durability, in From Categories to Categorization: Studies in Sociology, Organizations, and Strategy at the Crossroads, Research in the Sociology of Organizations, Vol. 51, edited by R. Durand, N. Granqvist, and A. Tyllström (Emerald Group, Bingley, United Kingdom, 2017) pp. 295–325. [47] M. Lounsbury and H. Rao, Sources of durability and change in market classications: A study of the reconstitution of product categories in the American mutual fund industry, 1944–1985, Social Forces 82, 969 (2004). [48] G. Negro, M. T. Hannan, and H. Rao, Category reinterpretation and de- fection: Modernism and tradition in Italian winemaking, Organization Science 22, 1449 (2011). [49] L. E. Bourne, Typicality eects in logically dened categories, Memory & Cognition 10, 3 (1982). [50] J. Feldman, Minimization of Boolean complexity in human concept learning, Nature 407, 630 (2000). [51] P. Gärdenfors, Conceptual Spaces: The Geometry of Thought (MIT Press, Cambridge, MA, 2000). [52] A. Tversky, Features of similarity, Psychological Review 84, 327 (1977). [53] W. Croft, Syntactic Categories and Grammatical Relations: The Cogni- tive Organization of Information (University of Chicago Press, Chicago, IL, 1991). [54] D. Widdows, Geometry and Meaning (CSLI Publications, Stanford, CA, 20 Introduction

1 2004). [55] P. Gärdenfors, The Geometry of Meaning: Semantics Based on Concep- tual Spaces (MIT Press, Cambridge, MA, 2014). [56] J. L. Borges, Other Inquisitions, 1937–1952 (University of Texas Press, Austin, TX, 1964). [57] M. Foucault, The Order of Things: An Archaeology of the Human Sci- ences (Pantheon Books, New York, NY, 1970). [58] K. Windschuttle, The Killing of History: How a Discipline is Being Mur- dered by Literary Critics and Social Theorists (Free Press, New York, NY, 1997). [59] E. H. Rosch, Natural categories, Cognitive Psychology 4, 328 (1973). [60] E. H. Rosch, Cognitive reference points, Cognitive Psychology 7, 532 (1975). [61] E. H. Rosch and C. B. Mervis, Family resemblances: Studies in the internal structure of categories, Cognitive Psychology 7, 573 (1975). [62] R. Durand and L. Paolella, Category stretching: Reorienting research on categories in strategy, entrepreneurship, and organization theory, Journal of Management Studies 50, 1100 (2013). [63] L. Paolella and R. Durand, Category spanning, evaluation, and perfor- mance: Revised theory and test on the corporate law market, Academy of Management Journal 59, 330 (2016). [64] G. L. Murphy and D. L. Medin, The role of theories in conceptual coher- ence, Psychological Review 92, 289 (1985). [65] E. Durkheim, Rules of Sociological Method (Macmillan, New York, NY, 1982 [1895]). [66] S. Verheyen, E. Ameel, and G. Storms, Determining the dimensionality in spatial representations of semantic concepts, Behavior Research Methods 39, 427 (2007). [67] A. Goldberg, M. T. Hannan, and B. Kovács, What does it mean to span cultural boundaries? Variety and atypicality in cultural consumption, American Sociological Review 81, 215 (2016). [68] G. L. Péli and B. Nooteboom, Market partitioning and the geometry of the resource space, American Journal of Sociology 104, 1132 (1999). [69] A. van Witteloostuijn and C. Boone, A resource-based theory of market structure and organizational form, Academy of Management Review 31, 409 (2006). [70] J. A. Hampton, Inheritance of attributes in natural concept conjunctions, Memory & Cognition 15, 55 (1987). [71] J. A. Hampton, Conceptual combination, in Knowledge, Concepts, and Categories, edited by K. Lambert and D. Shanks (MIT Press, Cambridge, References 21

MA, 1997) pp. 133–159. 1 [72] G. L. Murphy and B. H. Ross, Use of single or multiple categories in category-based induction, in Inductive Reasoning: Experimental, De- velopmental, and Computational Approaches, edited by A. Feeney and E. Heit (Cambridge University Press, New York, NY, 2007) pp. 205–225. [73] G. L. Murphy and B. H. Ross, Uncertainty in category-based induction: When do people integrate across categories? Journal of Experimental Psychology: Learning, Memory, and Cognition 36, 263 (2010). [74] B. A. Davey and H. A. Priestley, Introduction to Lattices and Order (Cambridge University Press, New York, NY, 2002). [75] B. Tversky and K. Hemenway, Objects, parts, and categories, Journal of Experimental Psychology: General 113, 169 (1984). [76] J. A. Hampton, Overextension of conjunctive concepts: Evidence for a unitary model of concept typicality and class inclusion, Journal of Experimental Psychology: Learning, Memory, and Cognition 14, 12 (1988). [77] V. F. Reyna, Class inclusion, the conjunction fallacy, and other cognitive illusions, Developmental Review 11, 317 (1991). [78] M. Montauti and F. C. Wezel, Charting the territory: Recombination as a source of uncertainty for potential entrants, Organization Science 27, 954 (2016). [79] Y. Tang and F. C. Wezel, Up to standard? Market positioning and perfor- mance of Hong Kong lms, 1975–1997, Journal of Business Venturing 30, 452 (2015). [80] M. Keuschnigg and T. Wimmer, Is category spanning truly disadvan- tageous? New evidence from primary and secondary movie markets, Social Forces 96, 449 (2017). [81] J. C. Lena and R. A. Peterson, Classication as culture: Types and trajectories of music genres, American Sociological Review 73, 697 (2008). [82] B. Kovács and M. T. Hannan, Conceptual spaces and the consequences of category spanning, Sociological Science 2, 252 (2015). [83] R. P. Rumelt, Strategy, Structure, and Economic Performance (Harvard University Press, Cambridge, MA, 1974). [84] R. P. Rumelt, Diversication strategy and protability, Strategic Man- agement Journal 4, 359 (1982). [85] B. Rehder, A causal-model theory of conceptual representation and categorization, Journal of Experimental Psychology: Learning, Memory, and Cognition 29, 1141 (2003). [86] B. Rehder, Categorization as causal reasoning, Cognitive Science 27, 22 Introduction

1 709 (2003). [87] L. W. Barsalou, Ad hoc categories, Memory & Cognition 11, 211 (1983). [88] B. H. Ross and G. L. Murphy, Food for thought: Cross-classication and category organization in a complex real-world domain, Cognitive Psychology 38, 495 (1999). [89] D. L. Medin, E. B. Lynch, J. D. Coley, and S. Atran, Categorization and reasoning among tree experts: Do all roads lead to Rome? Cognitive Psychology 32, 49 (1997). [90] B. H. Ross and G. L. Murphy, Category-based predictions: Inuence of uncertainty and feature associations, Journal of Experimental Psychol- ogy: Learning, Memory, and Cognition 22, 736 (1996). [91] S. Ratneshwar, L. W. Barsalou, C. Pechmann, and M. Moore, Goal- derived categories: The role of personal and situational goals in cate- gory representations, Journal of Consumer Psychology 10, 147 (2001). [92] S. Ratneshwar, C. Pechmann, and A. D. Shocker, Goal-derived cate- gories and the antecedents of across-category consideration, Journal of Consumer Research 23, 240 (1996). [93] D. N. Osherson and E. E. Smith, On the adequacy of prototype theory as a theory of concepts, Cognition 9, 35 (1981). [94] T. Wry, M. Lounsbury, and P. D. Jennings, Hybrid vigor: Securing ven- ture capital by spanning categories in nanotechnology, Academy of Management Journal 57, 1309 (2014). [95] N. Granqvist and T. Ritvala, Beyond prototypes: Drivers of market categorization in functional foods and nanotechnology, Journal of Management Studies 53, 210 (2016). [96] M. T. Kennedy and P. C. Fiss, An ontological turn in categories research: From standards of legitimacy to evidence of actuality, Journal of Man- agement Studies 50, 1138 (2013). [97] A. Barroso and M. S. Giarratana, Product proliferation strategies and rm performance: The moderating role of product space complexity, Strategic Management Journal 34, 1435 (2013). [98] G. L. Murphy and B. H. Ross, Induction with cross-classied categories, Memory & Cognition 27, 1024 (1999). [99] C. Navis and M. A. Glynn, How new market categories emerge: Temporal dynamics of legitimacy, identity, and entrepreneurship in satellite radio, 1990–2005, Administrative Science Quarterly 55, 439 (2010). [100] C. Jones, M. Maoret, F. G. Massa, and S. Svejenova, Rebels with a cause: Formation, contestation, and expansion of the de novo category “modern architecture,” 1870–1975, Organization Science 23, 1523 (2012). [101] O. Alexy and G. George, Category divergence, straddling, and currency: References 23

Open innovation and the legitimation of illegitimate categories, Jour- 1 nal of Management Studies 50, 173 (2013). [102] M. Khaire and R. D. Wadhwani, Changing landscapes: The construction of meaning and value in a new market category—Modern Indian art, Academy of Management Journal 53, 1281 (2010). [103] M. Khaire, The importance of being independent: The role of inter- mediaries in creating market categories, in From Categories to Cate- gorization: Studies in Sociology, Organizations, and Strategy at the Crossroads, Research in the Sociology of Organizations, Vol. 51, edited by R. Durand, N. Granqvist, and A. Tyllström (Emerald Group, Bingley, United Kingdom, 2017) pp. 259–293. [104] J.-P. Vergne and G. Swain, Categorical anarchy in the UK? The British media’s classication of Bitcoin and the limits of categorization, in From Categories to Categorization: Studies in Sociology, Organizations, and Strategy at the Crossroads, Research in the Sociology of Organi- zations, Vol. 51, edited by R. Durand, N. Granqvist, and A. Tyllström (Emerald Group, Bingley, United Kingdom, 2017) pp. 185–222. [105] A. Piazza and F. Perretti, Categorical stigma and rm disengagement: Nuclear power generation in the United States, 1970–2000, Organiza- tion Science 26, 724 (2015). [106] E. G. Pontikes and W. P. Barnett, The persistence of lenient market categories, Organization Science 26, 1415 (2015). [107] G. Negro, M. T. Hannan, and M. Fassiotto, Category signaling and reputation, Organization Science 26, 584 (2015). [108] E. G. Pontikes and M. T. Hannan, An ecology of social categories, Sociological Science 1, 311 (2014). [109] M. T. Hannan, Partiality of memberships in categories and audiences, Annual Review of Sociology 36, 159 (2010). [110] B. Kovács and M. T. Hannan, The consequences of category spanning depend on contrast, in Categories in Markets: Origins and Evolution, Research in the Sociology of Organizations, Vol. 31, edited by G. Hsu, G. Negro, and Ö. Koçak (Emerald Group, Bingley, United Kingdom, 2010) pp. 175–201. [111] G. Negro, M. T. Hannan, and H. Rao, Categorical contrast and audience appeal: Niche width and critical success in winemaking, Industrial and Corporate Change 19, 1397 (2010). [112] H. Rao, P. Monin, and R. Durand, Institutional change in Toque Ville: Nouvelle cuisine as an identity movement in French gastronomy, Amer- ican Journal of Sociology 108, 795 (2003). [113] H. Rao, P. Monin, and R. Durand, Border crossing: Bricolage and the 24 Introduction

1 erosion of categorical boundaries in French gastronomy, American Sociological Review 70, 968 (2005). [114] A. Barroso, M. S. Giarratana, S. Reis, and O. J. Sorenson, Crowding, satiation, and saturation: The days of television series’ lives, Strategic Management Journal 37, 565 (2016). [115] S. Grodal, A. Gotsopoulos, and F. F. Suarez, The co-evolution of tech- nologies and categories during industry emergence, Academy of Man- agement Review 40, 423 (2014). [116] F. F. Suarez, S. Grodal, and A. Gotsopoulos, ? Dominant category, dominant design, and the window of opportunity for rm entry, Strategic Management Journal 36, 437 (2015). [117] R. E. Kennedy, Strategy fads and competitive convergence: An empirical test for herd behavior in prime-time television programming, Journal of Industrial Economics 50, 57 (2002). [118] D. A. Levinthal, Adaptation on rugged landscapes, Management Sci- ence 43, 934 (1997). [119] P. M. Blau, A formal theory of dierentiation in organizations, American Sociological Review 35, 201 (1970). [120] R. Adner, L. Pólos, M. Ryall, and O. J. Sorenson, The case for formal theory, Academy of Management Review 34, 201 (2009). [121] G. L. Péli, J. Bruggeman, M. Masuch, and B. Ó Nualláin, A logical ap- proach to formalizing organizational ecology, American Sociological Review 59, 571 (1994). [122] G. L. Péli and M. Masuch, The logic of propagation strategies: Ax- iomatizing a fragment of organizational ecology in rst-order logic, Organization Science 8, 310 (1997). [123] M. T. Hannan, Rethinking age dependence in organizational mortality: Logical formalizations, American Journal of Sociology 104, 126 (1998). [124] J. Kamps and L. Pólos, Reducing uncertainty: A formal theory of orga- nizations in action, American Journal of Sociology 104, 1776 (1999). [125] L. Pólos, M. T. Hannan, and G. R. Carroll, Foundations of a theory of social forms, Industrial and Corporate Change 11, 85 (2002). [126] L. Pólos, M. T. Hannan, and G. Hsu, Modalities in sociological argu- ments, Journal of Mathematical Sociology 34, 201 (2010). [127] G. Hsu, M. T. Hannan, and L. Pólos, Typecasting, legitimation, and form emergence: A formal theory, Sociological Theory 29, 97 (2011). [128] G. L. Péli, Fit by founding, t by adaptation: Reconciling conicting organization theories with logical formalization, Academy of Manage- ment Review 34, 343 (2009). [129] J. Bruggeman, Niche width theory reappraised, Journal of Mathematical References 25

Sociology 22, 201 (1997). 1 [130] G. L. Péli, The niche hiker’s guide to population ecology: A logical reconstruction of organization ecology’s niche theory, Sociological Methodology 27, 1 (1997). [131] M. T. Hannan, On logical formalization of theories from organizational ecology, Sociological Methodology 27, 145 (1997). [132] K. Healy, Fuck nuance, Sociological Theory 35, 118 (2017). [133] L. Pólos and M. T. Hannan, Reasoning with partial knowledge, Socio- logical Methodology 32, 133 (2002). [134] L. Pólos and M. T. Hannan, A logic for theories in ux: A model-theoretic approach, Logique et Analyse 47, 85 (2004). [135] G. Le Mens, M. T. Hannan, and L. Pólos, Organizational obsolescence, drifting tastes, and age dependence in organizational life chances, Organization Science 26, 550 (2015). [136] G. Le Mens, M. T. Hannan, and L. Pólos, Age-related structural inertia: A distance-based approach, Organization Science 26, 756 (2015). [137] M. T. Hannan, G. Le Mens, G. Hsu, B. Kovács, G. Negro, L. Pólos, E. G. Pontikes, and A. J. Sharkey, Concepts and Categories: Foundations for Sociological Analysis, unpublished manuscript (2017). [138] G. Birkho, Lattice Theory (American Mathematical Society, New York, NY, 1940). [139] B. Ganter and R. Wille, Applied lattice theory: Formal Concept Analysis, in General Lattice Theory, edited by G. Grätzer (Birkhäuser, Basel, Switzerland, 1997) pp. 591–606. [140] B. Ganter and R. Wille, Formal Concept Analysis: Mathematical Foun- dations (Springer, Berlin/Heidelberg, Germany, 1999). [141] K. J. Lancaster, The economics of product variety: A survey, Marketing Science 9, 189 (1990). [142] M. Gehrke, Generalized Kripke frames, Studia Logica 84, 241 (2006).

I Empirical Studies

27

2 Categorization and Strategic Deterrence

This chapter is based on a research paper written in collaboration with Nachoem M. Wijnberg. Earlier versions of this paper were presented at the 2017 Annual Meeting of the Academy of Management (Atlanta, GA) and the 2017 Annual Conference of the Strategic Management Society (Houston, TX). An abridged version was invited for publication in the Best Papers section of the Academy of Management Proceedings. We are very grateful to Olav J. Sorenson and J. Cameron Verhaal for comments on initial ideas and thought-provoking discussions.

29 30 Categorization and Strategic Deterrence

2.1. Motivation By product proliferation, industrial economists refer to the competitive strategy whereby a rm extends its product oer in a specic market (or submarket) so as to ll the product space and minimize unmet demand 2 [1]. This is an especially common strategy in industries characterized by non-price competition, such as food [2, 3], chemical [4], or cigarette manu- facturing [5], where it is often used by the dominant incumbents to achieve and perpetuate a dierentiated oligopoly [6–8]. Schmalensee [9] oered an example from the US market for ready-to-eat cereals, where four con- glomerates, namely Kellogg, General Foods, General Mills, and Quaker Oats, introduced such a large number of marginally dierent products during the years 1950–1972 that they were brought to trial by the Federal Trade Com- mission.1 Shaw [4] reported a similar pattern in the UK chemical industry, where three large incumbents—Imperial Chemical, Fisons, and Shell—came to dominate the fertilizer market by proliferating extensively in 1958–1978. A few years later, Brander and Eaton [10] presented a compelling game- theoretic model whereby sequential strategic decisions with regard to new product introductions naturally lead to equilibria where a single rm mo- nopolizes close substitutes. The rationale is that, if the rm will not oer certain products, then its competitors will [cf. 11]. Proliferation serves to signal one’s commitment to a particular submarket or product category. To the extent that rms can be assumed to behave rationally, this signal represents a credible threat in the eyes of rivals [12]. These studies were the origin of an enormous stream of literature on product proliferation, which now ranges from game theory [13] to oper- ations research [14], marketing [15, 16], strategic management [17], and organizational ecology [18]. The empirical evidence amassed over nearly four decades suggests that proliferation strategies can have positive eects on a variety of rm outcomes, such as protability [19], market share [20], market power [3], and survival in the industry [21], provided that rms man- age to contain the relative increase in coordination costs [18, 22] and do not irritate consumers by burdening them with overchoice [2]. The mechanisms through which proliferation can beget competitive advantage include the exploitation of scope economies [8] and learning eects [23], the capacity to cater to more customized requests [21], and the “contrived deterrence” [6] of competition, which aects de novo and de alio entrants as well as

1Interestingly, the author found no evidence of collusion as the rms did not cooperate toward the purpose of becoming oligopolists. The anticompetitive eect that they enjoyed was “an unforseen, but presumably not unwelcome, consequence of a mode of behavior that arose more or less naturally from the industry’s structure” [9, p. 316]. Motivation 31 more established rivals. This latter eect, which arguably has the most direct and macroscopic implications for the structure of an industry [7], represents the focus of our arguments in this chapter. We believe this eect deserves additional consideration because, while game theorists agree that product proliferation makes a credible deterrent 2 strategy [13, 24–26], empirical research found little or no evidence of its power to discourage rival product introductions [15]. It is worthwhile to revisit this relationship because the determinants of new product entry have become increasingly important for organization theory as scholars started to address questions related to product demography [e.g., 27]. One of the most compelling insights oered by this line of research is that the survival rate of products in the market depends on distinctly ecological factors, such as the density of rms’ oerings in the various categories or niches whereby the market is partitioned [28]. It thus seems reasonable to conjecture that, by causing the saturation of particular product categories, proliferation strategies negatively aect the likelihood of rival product introductions. In light of this, the lack of evidence in favor of the “deterrence hypothesis” [15, p. 149] seems all the more surprising. We suspect that the absence of empirical support for this hypothesized causal link is due to the moderating inuence of category-level properties: these have not been accounted for by previous research on proliferation; however, as shown by many studies in organizational ecology [29–32], they can substantively alter the outcome of rms’ product strategies. Our objective is to theorize and test this moderating eect with regard to product proliferation. More specically, we analyze how the deterrent power of proliferation strategies varies with the level of complexity of the region of the product space where the strategy is enacted. In examining the link between product proliferation and performance, previous research has suggested that product space complexity, i.e., the degree of product heterogeneity within the targeted (sub)market, tends to multiply the strat- egy’s eects [17]. On the one hand, this is because greater returns accrue to organizational learning if the product spaces is complex, as specialized knowledge is more dicult to acquire [33] and consumers are more willing to pay for quality [16]. On the other hand, complex product spaces encour- age consumers to adopt simpler routines in their purchasing decisions, such as submarket loyalism [34], which enables rms with focused product lines to exploit psychological associations between their brand and the focal (sub)market [35]. Although this research has illuminated the moderat- ing role of product space complexity, it has considered only some of the benecial mechanisms underlying product proliferation, namely learning 32 Categorization and Strategic Deterrence

eects and demand synergies, and did not analyze the possible eects on the strategy’s capacity to deter competition. This gap is especially glaring if we consider that preventing rival product introductions is usually regarded as the main purpose of product proliferation [1]. 2 Beyond lling this gap, we contribute to the broader research on prolifer- ation in two important respects: First, we analyze the strategy’s eect on the number of future product introductions by new entrants as well as estab- lished rivals. This is relevant because, even decades after Schmalensee’s [9] foundational work, studies on proliferation largely concerned them- selves with entry deterrence and did not consider the barriers to rms’ movement across submarkets. In doing so, they missed what Richard Caves and Michael Porter [6, p. 249] termed “a great opportunity for generality.” Second, we examine product space complexity not as a characteristic of markets or industries [cf. 17], but rather as a property of individual product categories. To this purpose, we bridge the industrial-economic literature on market structure and the burgeoning research on categories in organization theory [36]. We also clarify how complexity diers from other properties of categories previously examined by organization scholars. Consistently with economic research [2–5, 9], we focus on product prolif- eration strategies enacted by the dominant incumbents in a dierentiated oligopoly. This allows for a more accurate analysis of strategic deterrence because these rms are normally active in multiple submarkets [37]: hence, their choice to saturate a particular submarket or category with their prod- ucts can be interpreted as a deliberate attempt to “seize” a region of the market and keep out rival companies.2 Moreover, it is usually the dominant incumbents within an industry who have the means to retaliate against competitors who encroach upon their territory: therefore, it is especially in the hands of these rms that product proliferation serves as a credible threat [8, 24, 26]. Our empirical setting is the US recording industry, 2004– 2014. This is an appropriate context for a study on product proliferation because the market for recorded music is notoriously oligopolistic [39, 40] and the various categories where record companies compete can vary in their degree of internal heterogeneity [41]. This setting is also convenient because the dominant incumbents (termed majors) form a recognizable strategic group and tend to monitor each other’s product line decisions [42, 43]. The companies outside this group (termed indies) form a motley collection that is generally observed only in the aggregate [cf.44].

2A similar interpretation may be unwarranted in the case of small or focused organizations, who may choose to extend their product oer in certain categories because this is the only option available to try and survive [38]. Theory and Hypotheses 33

This chapter is structed as follows: In Section 2.2, we discuss the extant literature on product proliferation and market structure in economics and organizational ecology. We establish a theoretical connection between the concepts of niche, category, and submarket, and build upon this link to develop our hypotheses about proliferation strategies, category complexity, 2 and strategic deterrence. In Section 2.3, we describe our empirical setting, data, sample, and statistical methods. In Section 2.4, we report the results of our analysis, and in Section 2.5 we discuss their implications for strategic management and organization theory.

2.2. Theory and Hypotheses 2.2.1. The Geometric Structure of Markets As shown by previous studies in industrial economics [4, 9, 13, 24], prod- uct proliferation can be appropriately conceptualized from a spatial or locational-analog perspective. This involves picturing the market as a mul- tidimensional metric space where each axis or dimension represents a relevant product characteristic [8], or feature [45], along which the objects can perceivably dier. In this abstract variant of Hotelling’s model [46], known in the economic literature as a locational-analog or Lancastrian3 model, consumers are dened as singletons, or points in space, which intu- itively correspond to the consumers’ ideal product specications. Firms, instead, are dened as sets of points, which correspond to the products they currently oer on the market. The location of products is usually considered xed because signicant costs tend to be associated with repositioning: compatibly with this, rms normally prefer to abandon unprotable prod- ucts rather than trying to move them [9]. However, both consumers and rms can change their positions over time: in the one case, this is because consumers’ preferences are inherently dynamic [28, 50]; in the other, it is because rms can introduce new products in order to keep up with shifts in demand [38] and react to other rms’ behavior [51]. Just like Hotelling’s example of a “market on Main Street” [46, p. 45], the proportion of demand captured by a rm at any given time is a function of the distance between the location of its products and the distribution of consumers. Because consumers are more likely to select products closer to their preferences, every product is associated with a catchment area that extends around its position in the feature space4 and ends as soon as

3After Kelvin Lancaster, who critically contributed to its development [8, 47–49]. 4Throughout the course of this chapter, the terms “product space” and “feature space” are used interchangeably. This equivalence is consistent with extant literature [17, 28]. 34 Categorization and Strategic Deterrence

2 y Feature

Feature x

Figure 2.1: A two-dimensional product space populated by two rms

any other product becomes closer. For the sake of illustration, consider a simplied market where the products can dier along two dimensions x and y . As displayed in Figure 2.1, the entire space can be represented as a Euclidean plane and the catchment area of each product is modeled as a Voronoi cell [52]. Competition ensues between the rms that populate this space (represented by dierent colors in the gure above) because demand is nite and a product’s likelihood to catch it depends not only on its distance from consumers’ preferences but also on the proximity of competing products [53]. If demand were uniformly distributed across the space, the protability of each product would be directly proportional to the size of its catchment area and thus the rm controlling more territory would outcompete the other. This assumption is clearly unrealistic, however: in most markets, demand is unevenly and polymodally distributed [54], and its concentration in particular areas of space can lead to the formation of a market center [55], that is, a highly competitive location in the space where large incumbents tend to ock because no other region can provide enough resources to guarantee their sustenance [56–58]. It is generally advantageous for rms to introduce new products at particular coordinates if, by doing so, they intercept demand that would Theory and Hypotheses 35 otherwise be caught by competitors’ products [24]. Of course, the potential benets of this course of action must always be weighted against the costs that developing, marketing, and distributing a greater variety of products entails. The costs of coordination [22] can be especially high because rm managers may have to divide their attention across a greater number of 2 production units. In extreme cases, these costs can grow so large as to cripple the rm’s operation and increase its rate of failure in the short term [18]. For this reason, rms cannot compete with one another by oering an ever-increasing number of products, thereby partitioning the space into ever-smaller catchment areas. An equilibrium is reached when the division of space is such that further product introductions are unprotable. Strategic decision-making thus involves searching for an optimal distri- bution of products across the feature space [25, 38]. This tends to be a very uncertain endeavor because rm managers do not know ex ante how de- mand is distributed, nor the timing of its shifts [59]. From this perspective, product competition is akin to a repeated game where rms periodically place their bids by positioning their oerings at chosen locations. At the end of each period, rm performance is non-negative if the total demand met by the rm’s products is sucient to recoup the total costs, including production, distribution, marketing, and coordination. Anything in excess of this threshold is prot and can be either reinvested in the next round or stored as a buer, but if performance is negative and the buer is depleted, the rm incurs failure [60]. Organizational learning occurs because the rms that survive consider past outcomes in future decisions [33]. Nevertheless, learning may only lead to a eeting advantage that does not guarantee survival in the face of rising competition [61]: in order to sustain their edge, rms must not only scout for protable positions in the feature space but also properly defend them against rivals’ intrusion. As argued both by the strategic literature on reference groups [62, 63] and by the ecological literature on resource partitioning [55, 58, 64], it is overly simplistic to presume that each player in a market competes in equal measure against every other player in the same market. Indeed, rms that occupy distant positions in the feature space do not target the same demand and may not even perceive one another as competitors [65]. What other rms are doing in distant regions is thus relatively inconsequential compared to what one’s neighbors are up to [66]. Similar considerations ap- ply to the demand side: consumers do not necessarily perceive their peers as worthy of social interaction if they have dierent preferences, and rather seek to establish ties with peers located in their proximity [67, 68]. Such mechanisms engender the compartmentalization of the feature space into 36 Categorization and Strategic Deterrence

bounded regions, which are relatively homogeneous in terms of both prod- uct features and consumer tastes. Industrial economists commonly refer to these regions as submarkets [69, 70], whereas organizational ecologists term them niches [71–73] or (product) categories [45]. 2 Inasmuch as consumers represent a resource that rms ought to acquire in order to sustain their operation, the Lancastrian model described above bears a direct correspondence to the ecological model that conceptualizes the market as a geometric resource space [57]. From an economic perspec- tive, each category can be viewed as a separate competitive arena within the broader landscape of an industry, much as if it were a regional market in a true locational (i.e., geographical) model [69]. Just like multinational corporations have a presence in multiple regional markets, rms within a given industry can maintain a foothold in multiple categories. If they encounter other rms in the same category, then they compete for the same share of resources [37]. This connection between economic and so- ciological views on market structure is useful to clarify the link between the concepts of niche, category, and submarket: these have been used somewhat interchangeably in previous research; in fact, each of them refers to a bounded region of the feature/resource space, and they are inter- changeable insofar as consumers in that region form a niche, products in that region form a category, and rms with products in that region vie to meet the same demand. This conceptual integration allows one to interpret the properties of categories studied by organization theorists, like leniency [74] and contrast [75], as attributes pertaining to submarkets within an industry, and conversely, to redene the structural attributes of markets, like product space complexity [17], at the level of submarkets so that they can be reinterpreted as category properties. Further, this theoretical connection allows one to distinguish product strategies by the changes they exert on a rm’s distribution of products across the various categories that tessellate the feature space. For example, diversication occurs either when the rm laterally expands by launching products in new categories [76], or when it pursues a more even distribution of products across the various categories where it currently competes [77]. Product proliferation, instead, occurs when the rm increases the number of products oered in a single category, so as to reinforce its presence in the corresponding region of the feature space [4,9, 13]. A proliferation strategy can be advantageous for many reasons: First, launching similar products in a relatively short time frame provides the rm with ample opportunities to sharpen its design skills and streamline its routines, which allows it to improve the quality of its products [23] and Theory and Hypotheses 37 thereby outcompete its rivals in the focal niche [71]. Second, oering more variations of a certain kind of product allows rms to meet consumers’ more unusual or sophisticated preferences, stimulating their loyalty [17, 21]. Third, and most important to the purpose of this chapter, lling a particular region of the feature space with one’s own products helps the rm increase 2 its market share by tightening the gaps in its current product oer. Smaller gaps result in smaller catchment areas, which means that smaller pockets of demand are available for rival products: therefore, competitors are less likely to improve their performance by positioning themselves in the rm’s neighborhood and are eventually pushed out of the category (Figure 2.2). This process, distinctly ecological in nature as it pivots on the density of rms’ product oer [28], is the reason why game-theoretic models nd product proliferation to be a viable deterrent strategy [e.g., 13]. Although it can erodes the sales of individual products [53], it can keep competitors away from the rm’s business: in this sense, rms can deploy this strategy to trade current prots for future submarket leadership [10, 11]. Quite surprisingly, this game-theoretic prediction is not corroborated by empirical ndings. For example, Bayus and Putsis [15] reported no evidence of deterrence in their analysis of the personal computer industry, 1981–1992: the authors justied their null result by arguing that deterrence may only emerge empirically as an artefact of model misspecication. In this chapter, we propose an alternative explanation grounded in ecological theory, namely that the level of complexity of the feature space undermines proliferation strategies’ capacity to deter rival product introductions. In the next section, we elaborate on this argument and theorize a moderating eect on the relationship between product proliferation and new product entry. For brevity, we sometimes use the word “complexity” in reference to the complexity of the feature space. It should be remembered that this complexity, which relates to the degree of heterogeneity in product attributes [17] as opposed to the sophistication of rms’ strategies [78] or of their organizational architectures [79], is the only kind we consider.

2.2.2. Complexity as a Category Property It is a widely accepted notion in economics and decision theory that ratio- nal agents in a market tend to structure their decision-making process so as to minimize their cognitive eort [80, 81]. Partitioning the competitive landscape into categories is crucial to this purpose because it enables cog- nitively limited agents to reduce the virtually innite distinctions between products on oer [cf. 82]. Owing to their inuence on such fundamen- tal cognitive processes, categories play a major role in the evolution and 38 Categorization and Strategic Deterrence

t

2 Category boundaries

t + 1 Category boundaries

t + 2 Category boundaries

Figure 2.2: Spatial consequences of product proliferation Theory and Hypotheses 39 consolidation of industries. This is testied by the importance placed on classication systems by analysts [83], critics [84, 85], and trade associa- tions [86]: maintaining clear distinctions is so vital to these gatekeepers that they actively discourage challenges to their schemata by penalizing rms that straddle category boundaries. For example, winemakers that mix 2 dierent styles of vineyard management tend to incur lower evaluations by critics [87] and patent applications that span technological domains are more readily denied by examiners [88]. Though rms sometimes begrudgingly comply with the prescriptions of these “boundary patrollers” [e.g., 89], they also nd categorization necessary to their eciency. By viewing the competitive landscape as a collection of relatively homogeneous regions instead of a continuous surface, rm managers can more easily select appropriate strategies [90], identify their rivals [65], and interpret other rms’ behavior [91]. Indeed, categories aect rm managers’ very perception of what constitutes an industry [92], as well as the various niches or submarkets it comprises [93]. Compatibly with this, category boundaries act as guidelines in their strategic choices; for example, by demarcating portions of the feature space where diversication or proliferation strategies may be coherently pursued [17]. Having limited cognitive resources at their disposal, consumers also nd themselves in need of categorization to achieve cognitive economy and make boundedly rational decisions. In point of fact, the way they search, compare, and evaluate products in a market tends to be governed by simple heuristics [94]. In particular, they heavily rely on categories to determine products’ likelihood to t their preferences [45]: by sifting, encoding, and conveying relevant product characteristics, category labels allow consumers to curtail the dimensionality of the feature space and discard whatever distinction between alternatives is irrelevant to their utility function [95]. Thanks to this lter, categories decrease the information load imposed by purchasing decisions to behaviorally and cognitively manageable propor- tions [cf. 82]. Given the centrality of this role, it is hardly surprising that categories come to determine the way consumers aggregate [67, 68] and communicate about products [96], steer the drift of their tastes [cf. 50], and guide the evolution of niches [28]. Marketing research suggests that consumers’ purchasing decisions tend to be more onerous in complex environments because these defy simpli- cation [97–99]. This resonates with cognitive-psychological literature: if the objects populating a domain have more heterogeneous features, fewer dimensions of the feature space can be ignored or compressed for the sake of cognitive economy. If a (sub)market is more complex, consumers 40 Categorization and Strategic Deterrence

must cope with a greater information load and they are likely to adopt even simpler heuristics, like restricting their consideration to products that possess certain attributes [34]. As noted by previous research [17], this tendency can ultimately benet proliferating rms because it allows them 2 to exploit stronger psychological associations between their brand and the focal category [35]. As a result, rms that proliferate in spaces of greater complexity are more likely to be selected by demand. In addition, these rms experience greater payos from organizational learning because spe- cialized knowledge tends to be more valuable in complex environments [33] and consumers who care about subtler distinctions are more willing to pay for products that meet their sophisticated expectations [16]. In previous studies, complexity was assumed to be a property of markets or industries [17]. A market is more complex than another (or than itself at a previous time period) if there is greater variance in the attributes of products: in our spatial model, this occurs either if the range of possible values for existing features becomes wider, so that the products can dier more substantially along the space’s current dimensions, or if entirely new features become relevant, so that the total number of dimensions increases. Very similar arguments underpin the evolutionary literature on tness land- scapes [61]. The main dierence between our model and the NK -models commonly adopted in this literature is that, in NK -models, rms are charac- terized as singletons in a space of organizational features. The complexity of a tness landscape is considered a function of the number N and interre- latedness K of the rms’ attributes, and it is often studied in relation to the rms’ ability to move around in search for a better architecture [79]. There are interesting connections between these two models, as rms’ position in a tness landscape partly depends on the coordinates of their products in a Lancastrian space. Moreover, the complexity of a product space can be closely related to the sophistication of the organizational architecture a rm requires to compete successfully. Complexity is thus a variable related to the structure of the feature space, and it is a variable that can change over time. Its value can in- crease as a result of shifts in laws and regulations [17], as well as radical innovation and technological discontinuities, which can lead to greater product heterogeneity. For example, the feature space of personal com- puters grew considerably more complex during the years 1981–1992, as the number of machines with distinct technical specications increased from approximately two hundreds to two thousands [15]. It is also possible for complexity to decrease over time, especially as a result of imitation or competitive convergence [51]. For example, the market for popular music Theory and Hypotheses 41 became gradually simpler in 1971–1988, as the artists’ output became more and more similar in terms of aesthetic features [100]. One of the key argu- ment we make about complexity, therefore, is that it should be considered a dynamic or time-variant property. It is possible to measure product space complexity at the level of an 2 industry, and it is arguably correct to do so when studying competition among rms that engage in unrelated diversication [90]. In examining rms’ strategies within a single industry, however, it is crucial to note that the tessellation the feature space can lead to variance in complexity even within the boundaries of a particular industry. This can ensue because the products included in each category can dier more or less substantially along particular dimensions. For example, the disk storage capacity of laptops varies in a relatively small interval compared to desktop computers as the number of hard drives a laptop can accommodate is constrained by the size of its chassis. It can also ensue because the dimensions relevant to product comparisons dier across categories; for example, weight may not be a relevant feature to consider in the case of desktop computers but it is decidedly important for laptops. As a result of this variation, the strategic outcomes experienced by a rm within a certain category can dier from those experienced in another category within the same industry—or even those experienced in the same category at a previous time period. Given these considerations, we propound that complexity aects the degree to which product proliferation strategies are conducive to deterrence. This argument is justied by the spatial model above: more complex regions of the product space are naturally harder to occupy because consumers tend to care about ner-grained distinctions and competitors are more likely to nd gaps to exploit. As mentioned above, greater complexity arises either if a category encompasses a broader portion of the feature space or if more features are relevant to membership in the category.5 Either condition makes the space included in the category more dicult to hold against competitors. An analogy with military strategy may be particularly useful to illustrate this point: given a xed amount of outposts (products) to be positioned across the territory, it is harder for an army (rm) to control a state (category) like Kansas than one like North Dakota; although both are relatively at, Kansas includes nearly 20-percent more land. It may be harder yet for the army to control a state like Utah, which is similar to Kansas in land size but encompasses a much more rugged terrain. Securing a vast and mountainous territory like Alaska, instead, may prove well-nigh

5This is equivalent to saying that the category has greater dimensionality [101]. 42 Categorization and Strategic Deterrence

impossible, because the topology of the landscape limits the contribution of each outpost to the army’s capacity to keep watch.6 Consistently with this example, we expect rms that engage in product proliferation within more complex regions of the feature space to experience 2 a decrease in their capacity to keep competitors at bay. In other words, the deterrent power of product proliferation should be reduced by the complexity of the targeted category. Consistently with the game-theoretic literature [13], we anticipate a baseline negative relationship between the rm’s number of new product introductions in a given category and the likelihood of new product introductions in the same category by the rm’s rivals: however, we expect this relationship to be positively moderated by the level of complexity of the focal category. In other words, we predict that regions of the feature space that are more heterogeneous in terms of product features witness a weaker deterrent eect as a result of product proliferation strategies. If the category’s complexity is suciently high, it may even be possible for this eect to disappear entirely, leading to an apparently null relationship [cf. 15]. Hence our two hypotheses:

Hypothesis 2.1. Engaging in product proliferation within a certain category decreases the likelihood of rival product introductions in that category.

Hypothesis 2.2. The more a category is complex, the less product prolifera- tion decreases the likelihood of rival product introductions in that category.

2.3. Methodology 2.3.1. Empirical Setting We test our predictions by analyzing rms’ patterns of new product introduc- tions in the US recording industry, 2004–2014. This is an appropriate setting for our study because the industry is dominated by non-price competition. Indeed, copyright protection ensures that record companies cannot sell exactly the same product as their rivals. Just like food [2, 9] or chemical manufacturing [4], the companies primarily compete through dierentia- tion, and while their pricing conduct is approximately cooperative—because products that adhere to the same technological format tend to be equally priced regardless of their features—their choices with respect to adver- tising [102] and positioning [43] are most denitely not. Another reason for choosing the recording industry is that rms can surmise consumers’

6For simplicity, our example assumes that rivals can enter the territory at any location and not just at its boundaries, much like rms can introduce new products at any coordinates within a certain region of the feature space [13]. Methodology 43 preferences from the information made available by Billboard in the form of ranking charts [103]. These charts are regularly used to motivate decision with regard to the allocation of organizational resources: to quote Roger Karshner [104, p. 115], former vice-president of , “Everybody in the record business is constantly lipping chart potentials, trade picks, 2 and chart life. In fact, the entire industry rises and falls upon the waves of this silly number game.” This means that record companies need not rely on monitoring [38] to keep track of consumers’ shifting tastes, and those occupying the more protable positions in the feature space, such as the market center, must face the threat of rival product introductions. The US market for recorded music is also suitable for our analysis because it is decidedly oligopolistic. The business has been historically concentrated in the hands of a few major record companies [39, 40]: large, diversied, and vertically integrated organizations [105], which outclass the independent companies in terms of production, marketing, and dis- tribution capacity [43, 106]. Each major directly or indirectly controls a host of subsidiaries and imprints, and in the way they allocate resources across these dierent production units they are fully comparable to multi- divisional corporations [39]. At the outset of our study period, the group of majors included Sony BMG Music Entertainment (renamed Sony Music Entertainment in 2008), Warner Music Group, Universal Music Group, and EMI Group. At the end of 2011, EMI disbanded and its assets were acquired by Sony and Universal. Currently, the three surviving majors collectively control between 64 and 86-percent of the market,7 account for two-thirds of all the sales, and tally 3.2 billion USD in yearly revenue [107]. Because the tastes of consumers in this market are notoriously mercurial [108], music products tend to have a short lifecycle. While some continue to sell for decades after their original release, the vast majority remains unnoticed and the few that reach some degree of popularity usually exhaust it within the course of a year [106, 109]. Owing to this rapid turnover, the yearly production of new records represents “the bedrock assumption on which the entire commercial music industry is constructed” [103, p. 280]. To be ecient in this respect, record companies maintain Artists & Repertoire (A&R) departments, the industry’s equivalent of R&D, whose mission is to scout for talent, oer contracts to artists, and work as the companies’ liaison during the recording process. The marketing and distribution of the

7They directly own about 64-percent of all the active record companies; however, they control 86-percent via ownership of the rms’ distribution channels [44]. More than 50- percent of the indies rely on majors to distribute their records because they lack the infrastructure to be entirely independent. 44 Categorization and Strategic Deterrence

artists’ output is usually the responsibility of record companies: despite the opening of digital channels, self-released or self-distributed products are rare and they are only tolerated by the record companies because they serve to stimulate retail purchases [110]. 2 Music products are commonly classied according to a well-established genre system [111]. Consumers care about products’ memberships in genres because these categories are key to their social identity [67, 68]; in addition, genres are of utmost relevance to gatekeepers such as music critics, who tend to maintain their institutionalized boundaries [84]. As in other creative industries [51], the major companies are active in all genres that have sucient appeal for mainstream consumers, and the close distance they maintain between each other in terms of market positioning [42] suggests a strong proclivity to imitation. Although intellectual property law prevents them from oering exactly the same products as their rivals, nothing stops them from producing relatively similar records. Deterrence is thus extremely important: the majors’ A&R departments have been known to “hoard” artists procient in a certain genre purely for the purpose of preventing other rms from releasing similar music. As reported by Casey Rae, former deputy director of the Future of Music Coalition [112]:

Maybe it makes sense to sign you, get you under contract, and keep you o the streets, so nobody else has you. But they do not actually care if they do anything with you. If a garage sound was popular like The White Stripes and now The Black Keys, then maybe they just sign up all the White Stripes- and Black Keys- sounding bands. Lord knows that happened during the era, where A&R guys were literally jumping out of airplanes with briefcases over the city of Seattle. “Sign anything with a goatee!”

In summary, four characteristics make the recording industry compara- ble to other contexts where proliferation strategies were previously exam- ined by researchers, namely product dierentiation, oligopolistic structure, multiple point competition, and an emphasis on strategic deterrence. In our study, we analyze the extent to which product proliferation by the major record companies turns out to be eective at preventing rival product intro- ductions, including both those by other majors and those by independents. Our choice to focus on the majors is partly dictated by the specics of our empirical context: because proliferation requires rms to sign a greater number of artists, only companies with a sucient amount of resources can aord to pursue this strategy. The indies tend to be excluded from this group because they are usually smaller, more specialized, and they Methodology 45 produce a relatively small number of records per year [43]. Though this limits the generalizability of our ndings to proliferation strategies enacted by large and diversied organizations, it makes our analysis consistent with previous research in industrial economics, which similarly focused on the dominant incumbents [e.g.,3–5,9]. 2

2.3.2. Sample and Variables Our data is drawn from two online databases. The rst is Billboard.com: as mentioned above, this is an authoritative source of information on the commercial success of music products. Owing to their central role in the music business, Billboard charts are frequently used in empirical research to compute proxies for product sales [106, 113] or popularity [114]. In our analysis, we use data from the Billboard 200, one of the website’s signature charts, which lists the top 200 every week by number of units sold in the US. This includes both physical and digital sales, as measured by Nielsen SoundScan through a sample of physical retailers and online music stores. Our study period begins in 2004, when Sony and Bertelsmann launched their joint venture and became a unied force in the US market, and terminates in 2014, when the Billboard 200 underwent important changes in the way Nielsen data is aggregated. This amounts to 574 weeks of chart information. Our second source of data is Discogs.com, an extensive, user-contributed archive that provides release dates, genre memberships, and links to record companies for more than 1.2 million original products. This data has been used in previous research to analyze rms’ product strategies in the music industry because it adequately reects the boundaries of genres according to both consumers and record companies, and it is reasonably free from retrospective bias [115, p. 959]. To ensure the accuracy of rm-product relationships and release dates, we cross-referenced our data with another online archive, MusicBrainz.com. Our nal dataset includes 73,722 original products released in the US during our study period. No more than 4,330 of these ever appeared on the Billboard 200. The products are distributed across 14 categories: blues; brass and military; children’s; classical; elec- tronic; folk, world, & country; funk/soul; hip hop; jazz; Latin; pop; rock; reggae; stage and screen.8 Figure 2.3 reports the number (both logged and untransformed) of products in our dataset assigned to each category. It is possible for a product to be assigned to multiple genres: in this case, it is counted once per category. Figure 2.4 shows the genres’ positions relative to

8Discogs also includes a fteenth category, non-music, which is used for commentaries and interviews. As this does not represent a , it is excluded from our analysis. 46 Categorization and Strategic Deterrence

2

Figure 2.3: New product introductions in each genre, 2004–2014

one another, computed via Kruskal’s non-metric multi-dimensional scaling (NMDS) algorithm [116]. The distance between two categories on this map decreases with the number of products assigned to both. Approximately 11.2-percent of records in our dataset were produced by one of the four major record companies, or any of their respective sub- sidiaries and imprints. All the others were produced by 15,491 independents. Consistently with the industry’s reputation as a highly concentrated busi- ness, most of the independents’ products turn out to be commercial ops: out of all those that appeared on the Billboard 200, 45.3-percent were pro- duced by one of the four majors or any of their subsidiaries, and the rest were produced by 840 indies. Figure 2.5 presents the yearly number of new products introduced by each major across all genres during our study pe- riod. As the red line shows, only a small number of products were released in the US by EMI (which was UK-based) in the years prior to its break-up. In our analysis, we retain the company within the group of observed majors, but its exclusion does not aect our empirical results. Methodology 47

2

Figure 2.4: Map of Discogs genres computed via Kruskal’s NMDS

Dependent variable. Each observation in our analysis is a major-genre- year triple. As three of the four majors are followed for 11 years, one is followed for eight, and the data covers 14 genres in total, our sample includes 574 observations. The dependent variable corresponds to the total number of products released by rival record companies within the focal genre during the given year. This includes both products released by other majors and those released by independents. Formally, for each major m1 ... m4, each category c1 ... c14, and each year t1 ... t11 (or t1 ... t8, in the case of EMI), we compute the following sum:

J D Õ Õ RivalProductsmct = pi ct + pi ct , (2.1) i =1 i =1

where J is the set of majors other than m, D is the set of indies, and pi ct is the number of products released by rm i in category c during year t . 48 Categorization and Strategic Deterrence

2

Figure 2.5: Yearly number of releases by the major record companies

Independent variables. To test our hypotheses we compute two predic- tors of theoretical interest, namely the number of records released by the focal major in the focal genre during the year of observation, and the complexity of the focal genre during the same year. The rst variable (OwnProducts) is a simple count of the products introduced by major m in category c during year t . The second variable (Complexity) is more so- phisticated and requires detailed explanation. Ideally, a category-based measure of feature space complexity should increase with the degree of heterogeneity in the attributes of category members [cf. 17], and it should be allowed to vary from year to year depending on the characteristics of new products released within the category. Although we have no access to detailed information about product attributes, the hierarchical nature of the genre system allows us to obtain a proxy for the categories’ internal diversity. In fact, each genre on Discogs is associated with a number of subcategories, or styles, whose boundaries correlate with variations in the musical attributes of products in the genre, such as lyrical themes, instrumentation, symbolic elements, and harmonic, Methodology 49 melodic, or rhythmic conventions [111]. The degree of heterogeneity within a genre is thus inversely related to the concentration of category members across the styles subordinate to the genre. More specically, the diversity of product attributes is minimal if all the products belong to the same style, and maximal if they are uniformly distributed across every possible 2 style. Consistently with this reasoning, we measure complexity by way of a diversity index. To this purpose we use a measure known in ecology and information theory as Shannon’s H [117]: this index, which allows the measurement of entropy in data, was used in previous research to quantify precisely the diversity of recorded music [e.g., 100]. Formally, for each year t and category c, the variable is computed as follows:

S   Õ pi t pi t Complexity = − ln , (2.2) ct p p i =1 ct ct where S is the set of styles s1 ... sn subordinate to genre c. Because the argument of the sum is a non-positive number, the resulting variable is always non-negative. Figure 2.6 presents the yearly level of complexity for the ve genres for which the mean level of the variable is highest.

Control variables. In addition to complexity, genre categories can vary in a number of other properties. Especially relevant to our arguments are contrast [30, 32] and leniency [74], which concern the fuzziness of category boundaries. In particular, contrast measures the extent to which products’ membership in a category tends to be exclusive, in the sense that category members do not simultaneously belong to other categories in the same domain. Leniency, instead, measures the number of other categories with which the focal category overlaps (if any). This is dierent from contrast because while a high-contrast category tends to overlap with a small num- ber of others, a low-contrast category can overlap with few or many. Both properties are important to control for because rms can be more inclined to introduce new products in high-contrast or lenient categories because of the advantages they oer [32, 118]. Consistently with extant research [75], we calculate Contrast for each category c during year t as the mean grade of membership of products released in c at t , where a product’s grade of membership is the reciprocal of the number of genres to which it belongs. The resulting variable ranges between zero and one, with a higher value representing a category with sharper boundaries. Leniency, instead, is computed as the natural logarithm of the total number of genres to which 50 Categorization and Strategic Deterrence

2

Figure 2.6: Yearly level of complexity for the ve most complex genres

the products released in c at t belong.9 The measure is non-negative, with a higher value indicating greater leniency. Another determinant of new product introductions that varies at the genre level is the concentration of demand. This is important to control for because categories for which there is greater demand can support a greater number of products before they are saturated [28]. For each category c and year t , we measure Demand by the proportion of products released in c during t that enter the Billboard 200, relative to the total number of products released during t that enter the Billboard 200. This closely mirrors the measure used by Kennedy [51] in his analysis of the broadcasting industry. Figure 2.7 displays the yearly level of demand for the top ve categories in our sample. As the plot lines show, rock was by far the most popular genre in 2004–2014, accounting for over 50-percent of products on the Billboard 200 during any given year. The category pop also

9In previous research [74], leniency was computed by multiplying this logarithm with the reverse of contrast. We avoid this approach because contrast is a separate variable in our analysis and the multiplication would induce collinearity. Methodology 51

2

Figure 2.7: Yearly level of demand for the ve most popular genres gained some traction over the course of our study period, whereas demand for hip hop consistently decreased. All the other genres are clustered at values between zero and 20-percent. At the rm level, we compute three control variables that capture yearly variation in the rms’ positioning and competencies. The rst such variable (Incumbent) measures a rm’s presence in the focal genre: for each major m, category c, and year t , this equals the number of products released by m in c during t over the total number of products released by m during t [cf. 51]. The second variable (Strength) measures the rm’s prociency in the focal genre, and for each m, c, t it equals the number of products released by m in c during t that enter the Billboard 200, over the total number of products released by m in c during t [cf. 51]. The third and nal variable (Span) measures the extent to which the focal major straddles the boundaries of a genre. This is relevant to control for because products in multiple genres are located in-between dierent regions of the feature space [119] and thus they tend to be constrained in their feature space coordinates. A rm may not be able to adequately ll the region of space 52 Categorization and Strategic Deterrence

that corresponds to a genre if it only releases products with multiple genre memberships. For each m, c, t , this variable equals the mean number of genres of products released by m in c during t .

2 2.3.3. Estimation Procedure Because our dependent variable (Equation 2.1) is a count, an appropriate statistical model is necessary to accommodate its bounded and discrete nature. A common approach is to apply a logarithmic or square-root trans- formation and use the OLS estimator, but this is often suboptimal because the transformed variable remains non-negative and this can violate the assumption of normality in the distribution of OLS residuals [120]. An alter- native solution is to use a Poisson model [121], which is estimated through the log-likelihood parameterization of the Poisson probability distribution. A distinguishing trait of this model is that the variance of the distribution is assumed to be equal to the mean: though this assumption is reasonable in some cases, it can lead to biased estimates when the true distribution whence the dependent variable is drawn has a variance that exceeds the mean. In these circumstances, referred to as overdispersion, other models should be used that allow the variance and the mean to dier [122]. Two models commonly used to account for overdispersion are the quasi- Poisson and the negative binomial. Both belong to the family of generalized linear models, and they express the expected value E (Y ) = µ of the de- pendent variable as a function of the linear combination of regressors, namely µ = g −1 (βX), where g −1 is the exponential function, β is a vec- tor of parameters, and X is the model matrix. Alternatively, g (µ) = βX, where g is the logarithmic function. Quasi-Poisson regression assumes the observed value of the dependent Y ∼ Poi (µ, θ), and var (Y ) = θµ; negative binomial regression, instead, assumes Y ∼ NB (µ, κ), and var (Y ) = µ + κµ2. The dierence, therefore, is that the quasi-Poisson model assumes the variance to be a linear function of the mean, whereas the negative binomial model assumes it to be a quadratic function. As a result, the negative binomial gives observations with a smaller value of the dependent greater weight in the regression relative to the quasi-Poisson, and conversely, the quasi-Poisson attributes greater weight to the observations with higher values. The choice between these two models thus depends on which of the two weighings is preferable for the research question at hand [123]. As shown in Figure 2.8, our dependent variable (RivalProducts) is overdis- persed, which makes the standard Poisson model inappropriate. Among the two alternatives discussed above, the quasi-Poisson appears preferable because it gives smaller weights to peripherical genres (cf. Figure 2.3), where Methodology 53

2

Figure 2.8: Probability function of the dependent variable the record companies’ activity is low. We use a xed-eects specication to account for systematic dierences across majors, genres, and years. All the regressors are lagged by one year, which causes the loss of 56 observations relative to 2004. The full model takes the form:

ln RivalProductsmc(t +1) = α + β1Major2 ··· β3Major4 + β4Genre2 ··· β16Genre15

+ β17Year2 ··· β25Year10 + β26Contrastct + β27Leniencyct

+ β28Demandct + β29Incumbentmct + β30Strengthmct

+ β31Spanmct + β32OwnProductsmct + β33Complexityct

+ β34OwnProductsmct × Complexityct + ε. (2.3)

where ε is drawn from a distribution N (0, σε). Our results suggest that this model is appropriate: the dispersion parameter θˆ is much larger than one. 54 Categorization and Strategic Deterrence

Table 2.1: Descriptive statistics

Mean Std. dev. Min. Max. 2 RivalProductsmc(t +1) 581.41 992.02 0 4570 Major 2.38 1.08 1 4 Genre 7.50 4.04 1 14 Year 2009.22 2.79 2005 2014 Contrastct 0.62 0.16 0.26 1.00 Leniencyct 2.44 0.46 0 2.77 Demandct 0.10 0.15 0 0.64 Incumbentmct 0.07 0.09 0 0.50 Strengthmct 0.15 0.17 0 0.67 Spanmct 1.66 1.25 0 12.00 Complexityct 1.85 0.94 0 3.39 OwnProductsmct 21.70 36.37 0 258

2.4. Results Table 2.1 reports the descriptive statistics of the variables involved in our analysis. The pairwise correlations between these variables are presented in Table 2.2. We nd highly signicant correlations between the outcome variable and most of the predictors; in addition, some of the predictors are strongly correlated with one another, especially in the case of Incumbent and Demand (r = 0.764, p = 0.000), and Incumbent and OwnProducts (r = 0.775, p = 0.000). To assess the risk of multicollinearity, we perform conditioning diagnostics on the model matrix. The condition number is 43.27, which is above the threshold of 30 recommended by Belsley, Kuh, and Welsh [124]. Standardizing the matrix helps returning this number to a more acceptable 5.34. In the tables below, we strictly report estimates from standardized variables. For robustness, we also replicate our analysis after excluding the problematic variable Incumbent: in this case, the condition number is further reduced to 4.89, but the results are nearly identical. Our main analysis involves the estimation of six nested quasi-Poisson models. The results from our control-only specications are reported in Table 2.3. We begin by estimating a baseline model that includes only control variables and xed eects for observation years (Model 1). The model’s residual deviance of 54,043 suggests a relatively poor t for the data, but this value decreases substantially with the addition of the other xed eects: in Model 2, where the genre dummies are included, it is equal Results 55

Table 2.2: Pairwise correlations matrix

1 2 3 4 1 RivalProductsmc(t +1) 2 2 Major −0.01 3 Genre 0.35••• 0.00 4 Year 0.06 −0.15••• 0.00 ••• ••• 5 Contrastct 0.60 0.01 0.23 −0.07 ••• ••• 6 Leniencyct 0.24 0.00 0.40 0.02 ••• ••• 7 Demandct 0.84 0.00 0.50 0.03 ••• ••• 8 Incumbentmct 0.65 −0.06 0.41 −0.01 ••• ••• ••• ••• 9 Strengthmct 0.24 −0.27 0.20 0.16 ••• •• 10 Spanmct 0.00 −0.35 0.14 0.06 ••• ••• 11 Complexityct 0.67 0.00 0.42 −0.01 ••• • ••• • 12 OwnProductsmct 0.52 −0.10 0.33 −0.11 5 6 7 8 •• 6 Leniencyct −0.13 ••• ••• 7 Demandct 0.49 0.27 ••• ••• ••• 8 Incumbentmct 0.46 0.26 0.76 •• ••• ••• ••• 9 Strengthmct 0.12 0.30 0.37 0.35 ••• • 10 Spanmct −0.07 0.32 0.00 0.10 ••• ••• ••• ••• 11 Complexityct 0.55 0.57 0.52 0.48 ••• ••• ••• ••• 12 OwnProductsmct 0.43 0.22 0.65 0.77 9 10 11 12 ••• 10 Spanmct 0.37 ••• ••• 11 Complexityct 0.32 0.22 ••• • ••• 12 OwnProductsmct 0.30 0.09 0.41

Note: • p < 0.05; •• p < 0.01; ••• p < 0.001

to 3,572.3, and in Model 3, where the major dummies are also included, it is further reduced by 58. The estimates from this latter model suggest that Contrast has a positive eect on RivalProducts (βˆ = 0.462, p = 0.000): a one- std. dev. increase in the value of Contrast multiplies the expected number of rival product introductions in the following year by e0.462 = 1.587, which means that the dependent variable increases by 58.7-percent. We also nd a positive eect for Leniency (βˆ = 0.238, p = 0.000), where a one-std. dev. increase leads to a 26.9-percent increase in the dependent, and Demand 56 Categorization and Strategic Deterrence

Table 2.3: Quasi-Poisson model results: Controls

Dependent variable: 2 RivalProductsmc(t +1) Model 1 Model 2 Model 3 Intercept 4.66 ••• (0.10) 4.99 ••• (0.05) 4.99 ••• (0.05) ••• ••• ••• Contrastct 1.13 (0.04) 0.45 (0.05) 0.46 (0.05) ••• ••• ••• Leniencyct 2.40 (0.17) 0.24 (0.05) 0.24 (0.05) ••• ••• ••• Demandct 0.14 (0.03) 0.43 (0.03) 0.43 (0.03) Incumbentmct −0.01 (0.02) −0.00 (0.00) −0.00 (0.01) •• • Strengthmct −0.03 (0.03) −0.02 (0.01) −0.02 (0.01) Spanmct 0.08 (0.04) 0.00 (0.01) 0.02 (0.01) Dispersion θ 158.91 7.40 7.33 Major dummies Not included Not included Included Genre dummies Not included Included Included Year dummies Included Included Included No. observations 518 518 518 Deviance 54043 3572.3 3514.3

Note: • p < 0.05; •• p < 0.01; ••• p < 0.001; std. errors in parentheses

(βˆ = 0.427, p = 0.000), where a one-std. dev. increase leads to a 53.3-percent increase in the dependent. Finally, we nd a negative eect for Strength (βˆ = −0.018, p = 0.036), where a one-std. dev. increase is associated with a 1.8-percent decrease in the dependent. This suggests that rival rms tend to avoid categories where the focal rm is more successful. In the following models, we add our predictors of theoretical interest to the list of regressors. The results of these estimations are reported in Table 2.4. In Model 4, we add the rst predictor, Complexity, and nd that this signicantly contributes to model t as the deviance of residuals decreases by 308.61. The variable has a positive eect on RivalProducts (βˆ = 0.426, p = 0.000): namely, a one-std. dev. increase in Complexity causes the expected count of rival products released during the following year to increase by 53.1-percent. In Model 5, we include our second predictor of interest, OwnProducts, and detect a negative but non-signicant eect (βˆ = −0.001, p = 0.879). The addition of this variable to the model leads only to a marginal decrease in deviance, which seems to suggest that the number of products released by the rm in the focal category has no eect Results 57

Table 2.4: Quasi-Poisson model results: Main eects and interaction

Dependent variable: RivalProductsmc(t +1) 2 Model 4 Model 5 Model 6 Intercept 4.99 ••• (0.05) 4.99 ••• (0.05) 4.99 ••• (0.05) ••• ••• ••• Contrastct 0.43 (0.05) 0.43 (0.05) 0.44 (0.05) ••• ••• ••• Leniencyct 0.18 (0.05) 0.18 (0.05) 0.19 (0.05) ••• ••• ••• Demandct 0.36 (0.03) 0.36 (0.03) 0.38 (0.03) Incumbentmct −0.00 (0.00) 0.00 (0.01) −0.00 (0.01) • Strengthmct −0.01 (0.01) −0.01 (0.01) −0.02 (0.01) Spanmct 0.02 (0.01) 0.02 (0.01) 0.02 (0.01) ••• ••• ••• Complexityct 0.42 (0.06) 0.43 (0.06) 0.44 (0.06) • OwnProductsmct −0.00 (0.01) −0.03 (0.01) Complexity ct 0.02 •• (0.01) × OwnProductsmct Dispersion θ 6.77 6.78 6.70 Major dummies Included Included Included Genre dummies Included Included Included Year dummies Included Included Included No. observations 518 518 518 Deviance 3205.7 3205.6 3156.5

Note: • p < 0.05; •• p < 0.01; ••• p < 0.001; std. errors in parentheses on the number of products released by the rm’s rivals in the following year. This null result is consistent with the analysis of the personal computer industry performed by Bayus and Putsis [15], but the question arises of whether the lack of empirical evidence is due to the actual absence of deterrence or rather to the fact that this model ignores the interaction with Complexity, as we suggest in this chapter. We investigate this possibility by adding the interaction term to our list of regressors (Model 6). The model estimates support our proposition: the deviance of residuals decreases by a more substantial 49.1, the negative eect of OwnProducts becomes signicant (βˆ = −0.028, p = 0.041), and the variable’s interaction with Complexity has a positive and signicant eect (βˆ = 0.023, p = 0.007). These coecients imply that if the value of OwnProducts increases by one std. dev. while Complexity is held at its 58 Categorization and Strategic Deterrence

mean, the expected count of rival products in the following year decreases by 2.8-percent. If Complexity simultaneously increases by one std. dev., however, then the negative eect of OwnProducts is oset by a 2.3-percent increase in the dependent: therefore, the expected value of RivalProducts 2 only decreases by 0.5-percent. If, instead, the value of Complexity is one-std. dev. lower, then the negative eect of OwnProducts further increases by 2.3-percent, which means that RivalProducts decreases by 5.1-percent. Thus, the extent to which product proliferation reduces the likelihood of rival product introductions is greater in less complex regions of space.

Additional analysis. Though these results support our argument that the deterrent power of proliferation strategies is contingent on the level of complexity of the feature space, the parameter estimates strike us as small and it is reasonable to ask whether this is because proliferation strategies are relatively ineective or because our outcome variable does not dier- entiate between products released by other majors and those released by independents (Equation 2.1). It could be the case that the eect is primarily driven by the other majors: these rms are diversied enough to actually have a choice as to where to position their products in the feature space, whereas indies may be constrained in their capacity to reposition them- selves because they tend to be more specialized and thus constrained to particular categories. As a result, moving to a dierent region of the feature space can pose a greater threat to their survival than holding their ground and ghting an uphill battle against the proliferating major. If this were the case, the inclusion of the indie component in the dependent variable would blunt the eects of OwnProducts and its interaction with Complexity, pulling their estimated coecients downward. We formally test this possibility by splitting our dependent variable into MajorProductsmc(t +1) and IndieProductsmc(t +1). Consistently with Equation 2.1, ÍJ the rst variable is computed as i =1 pi c(t +1), where J is the set of other ÍD majors, and the second is computed as i =1 pi c(t +1), where D is the set of indies. We enter these variables as regressands in two quasi-Poisson models with nearly identical specications, which dier only in that the rst also includes the total number of products released by indies in the same category during the previous year (IndieProductsmct ), whereas the second includes the total number of products released by other majors during the previous year (MajorProductsmct ). These extra variables are necessary to take into account that, in addition to the products introduced by the focal major, future product introductions by other majors (resp. by independents) are partly driven by the number of products introduced in the same category Results 59

Table 2.5: Quasi-Poisson model results: Additional analysis

Dependent variable: MajorProductsmc(t +1) IndieProductsmc(t +1) 2 Model 6 Model 7 Intercept 3.55 ••• (0.09) 4.68 ••• (0.05) ••• IndieProductsmct −0.14 (0.04) MajorProductsmct 0.02 (0.01) ••• ••• Contrastct 0.48 (0.08) 0.45 (0.05) ••• Leniencyct 0.02 (0.09) 0.22 (0.05) ••• ••• Demandct 0.61 (0.07) 0.24 (0.03) • Incumbentmct −0.03 (0.01) 0.00 (0.01) Strengthmct −0.01 (0.02) −0.01 (0.01) Spanmct −0.03 (0.02) 0.01 (0.01) ••• •• Complexityct 0.79 (0.02) 0.22 (0.06) ••• OwnProductsmct −0.08 (0.02) 0.02 (0.01) Complexity ct 0.04 •• (0.01) −0.01 (0.01) × OwnProductsmct Dispersion θ 2.82 5.48 Major dummies Included Included Genre dummies Included Included Year dummies Included Included No. observations 518 518 Deviance 1328.1 2590.2

Note: • p < 0.05; •• p < 0.01; ••• p < 0.001; std. errors in parentheses by independents (resp. by other majors). Their inclusion does not create a collinearity problem as the condition number of the model matrix is 6.23 for the MajorProducts model and 6.93 for the IndieProducts model after standardization. We expect the eects of OwnProducts and its interaction with Complexity to be stronger in the MajorProducts model. The results of these additional regressions are presented in Table 2.5 (Models 6–7). The estimates support our conjecture that deterrence mostly aects the other majors: in point of fact, it seems to aect only these com- panies because the coecient of OwnProducts is negative and signicant in Model 6 (βˆ = −0.079, p = 0.022) but not signicantly dierent from zero in Model 7 (βˆ = 0.017, p = 0.199). The parameters imply that, all else being 60 Categorization and Strategic Deterrence

equal, a one-std. dev. increase in the number of products released by the focal major in a particular genre is associated with a 7.6-percent decrease in the number of products released by the other majors in the same genre during the following year but with a negligible change in the number of 2 products released by independents. The interaction eect is also stronger in the MajorProducts model (βˆ = 0.045, p = 0.001): if Complexity simultane- ously increases by one std. dev., the net eect of a one-std. dev. increase in OwnProducts only amounts to a 3-percent decrease in the dependent variable. If, instead, Complexity decreases by one std. dev., then a one-std. dev. increase in OwnProducts lowers the expected value of MajorProducts by 12.2-percent. These results indicate a deterrent eect that is more than twice as strong than our main analysis suggested, albeit limited to a spe- cic group of rivals. Post-estimation diagnostics (available upon request) suggest that our estimates are consistent and safely interpretable. No discernible pattern appears in the plot of residuals and no observation has an excessive leverage in the regression.

2.5. Discussion This chapter examined the capacity of product proliferation to reduce com- petitive pressure by discouraging rival product introductions in a particular submarket or category, an eect Caves and Porter [6] referred to as the “contrived deterrence” of competition. Although previous studies in game theory have suggested that occupying (regions of) the feature space with one’s products has a substantive deterrent eect [9, 13, 24–26], empirical evidence of this relationship is scarce. Consequently, scholars came to question the notion that product proliferation is actually useful to ward o competitors [15]. Bridging research in strategic management and organiza- tional ecology, we proposed an original explanation for this inconsistency between game-theoretic predictions and empirical results. We argued that a product market cannot be regarded as a continuous surface but it rather consists of multiple subsets or submarkets, i.e., regions of the feature space, which represent product categories. These regions can be variably complex; some are internally homogeneous and therefore easy to saturate, and in these categories proliferation can be helpful to secure a dominant position. Other categories, however, are internally diverse, and because consumers are able to make ner-grained distinctions between products on oer the rms nd it harder to saturate the space. In this case, product proliferation tends to be a less eective deterrent, in the sense that the viability of rival product introductions is less strongly aected. Discussion 61

Based on this argument, we hypothesized that product proliferation indeed has a deterrent eect (Hypothesis 2.1), but that this is negatively moderated by the complexity of the category where the strategy is pursued (Hypothesis 2.2). Our empirical analysis of new product introductions in the US recording industry supports these two hypotheses: we found that 2 the proliferation strategies enacted by major record companies discourage product introductions in the same category by rival companies; however, this eect tends to disappear if the category presents greater variance in product attributes. In other words, the number of new products introduced in the same category by competitors does not change if the category is complex enough. Our additional analysis showed that deterrence primarily aects the other majors: hence, proliferation may only be a viable deterrent strategy against competitors who are suciently diversied to reposition themselves in feature space. It is not necessarily eective against rivals who are constrained to their current positions by specialization: for these rms, there may not be a real choice between ght and ight. These ndings, which allow us to accept our hypotheses, point to two possible explanations as to why the deterrent power of product prolifera- tion seems to be virtually inexistent in some markets. The rst is that the market in question may be too heterogeneous in terms of product features for proliferation to serve as a credible deterrent strategy. This is especially likely in technological markets during their emergent phase, such as the market for personal computer during the time period analyzed by Bayus and Putsis [15]. Indeed, the computer industry suered a veritable “complexity catastrophe” [79] during the Eighties and Nineties as the rms became more vertically integrated and increasingly favored the use of proprietary components. As a result, the variance in technical specications among the personal computer models available to consumers sharply increased. A similar rise in complexity can occur in non-emergent markets when they undergo important changes in regulatory frameworks. For example, pro- liferation may have temporarily ceased to be a viable deterrent strategy in the Spanish automobile market in after Spain’s entry into the European Economic Community, as a large number of foreign car models with dif- ferent technologies and designs became suddenly available to Spanish buyers [17]. An increase in complexity may have also rendered proliferation less eective in our very own empirical context, music recording, during the turbulent years that followed digitization [106]. In this case, it was not a shift in regulations that triggered the increase but rather a radical technological innovation, that is, digital distribution, which suddenly made available an unprecedented variety of products. 62 Categorization and Strategic Deterrence

The second explanation as to why proliferation strategies can have little deterrent power hinges on the ecological distinction between generalists and specialists [71]. Ecological theory suggests that only diversied or generalist rms can easily change their level of engagement across dierent 2 regions or categories, though this capacity tends to decrease with rm age [50, 125]. In the case of specialists, movements across market segments can be exceedingly hazardous because these rms are ill-equipped to handle change in the competitive environment [64]. As a consequence of this dierentail tolerance, specialists may be undeterred by product proliferation, and this is not because they have the resources to eectively counterattack but rather because they do not have any other choice. In markets where specialists account for the vast majority of new products, such as the craft beer industry [58, 126], one may nd no empirical evidence of strategic deterrence because only a relatively small number of rival product introductions is prevented. Beyond explaining variation in the deterrent power of proliferation strategies, our ndings contribute to the literature on product prolifera- tion by exploring the implications of product space complexity. Previous research [17] analyzed complexity for the moderating role it exerts on the relationship between proliferation and performance: we advance this line of research by theorizing and testing the inuence of complexity on the relationship between proliferation and rival product introductions. This eect is relevant to the strategic literature because creating barriers for competitors is among the primary reasons to pursue proliferation in the rst place [1]. We also add to extant literature by arguing that complexity should not be examined at the level of markets or industries as a whole, but rather as a property of individual submarkets or categories. This perspec- tive can be useful to generate new theory about competitive interactions at a subordinate level of analysis. In this regard, it is likely that complexity also aects the outcomes of other product strategies, such as diversication. Future research is needed to explore this possibility. Dening the complexity of space as a category property contributes to the growing literature on categories in organization theory. This stream of research, which by now has reached the proportions of a self-standing research domain [36], concerns itself with the identication of category- level sources of heterogeneity that aect rm- [e.g., 127] and product-level outcomes [e.g., 128]. A number of category properties have already been dened and extensively analyzed by organization theorists [cf. 75]: we believe space complexity deserves to be added to this list because, while its variance across categories is not properly captured by the properties References 63 examined before, it can aect many outcomes of interest to organization scholars, including the longevity of demand for particular kinds of oerings [27, 28], the competitive pressure experienced by organizations [115], and even the likelihood that new product categories emerge from existing ones by means of “subdivision” or “subtraction” [129, p. 389]. Our time-variant 2 treatment of complexity speaks to the importance of considering categories and their properties from a dynamic perspective. Our arguments are generalizable to other product markets that are suciently dierentiated for the boundaries of categories to be discernible to rms and consumers. This is normally the case in mature industries [83], and it certainly is the case in markets for cultural products [45]; however, it is not necessarily the case in nascent industries where the products are too innovative for meaningful distinctions between their characteristics to be fully established. Our study is limited in that we analyze competitive dynamics in an oligopolistic market and restrict our consideration to the strategies enacted by the dominant incumbents: our conclusions are thus maximally applicable to similar rms. In our empirical setting, these rms are the only ones who can eectively pursue a proliferation strategy due to the costs associated with releasing a greater number of products in a xed time frame; in other settings, however, the costs of introducing new products can be considerably lower, so that product proliferation may be likewise available to smaller rms. The question remains open of whether the outcomes experienced by these rms are comparable to those suggested by our empirical analysis.

2.6. References [1] A. V. Mainkar, M. Lubatkin, and W. S. Schulze, Toward a product- proliferation theory of entry barriers, Academy of Management Review 31, 1062 (2006). [2] J. M. Connor, Food product proliferation: A market structure analysis, American Journal of Agricultural Economics 63, 607 (1981). [3] V. Kadiyali, N. Vilcassim, and P. Chintagunta, Product line extensions and competitive market interactions: An empirical analysis, Journal of Econometrics 89, 339 (1998). [4] R. W. Shaw, Product proliferation in characteristics space: The U.K. fertiliser industry, Journal of Industrial Economics 31, 69 (1982). [5] M. J. Roberts and L. Samuelson, An empirical analysis of dynamic, nonprice competition in an oligopolistic industry, RAND Journal of Economics 19, 200 (1988). 64 Categorization and Strategic Deterrence

[6] R. E. Caves and M. E. Porter, From entry barriers to mobility barriers: Conjectural decisions and contrived deterrence to new competition, Quarterly Journal of Economics 91, 241 (1977). [7] S. C. Salop, Strategic entry deterrence, American Economic Review 69, 2 335 (1979). [8] K. J. Lancaster, The economics of product variety: A survey, Marketing Science 9, 189 (1990). [9] R. Schmalensee, Entry deterrence in the ready-to-eat breakfast cereal industry, Bell Journal of Economics 9, 305 (1978). [10] J. A. Brander and J. Eaton, Product line rivalry, American Economic Review 74, 323 (1984). [11] B. R. Nault and M. B. Vandenbosch, Eating your own lunch: Protection through preemption, Organization Science 7, 342 (1996). [12] J. W. Friedman, Oligopoly and the Theory of Games (North-Holland, Amsterdam, Netherlands, 1977). [13] G. Bonanno, Location choice, product proliferation, and entry deter- rence, Review of Economic Studies 54, 37 (1987). [14] K. Ramdas, Managing product variety: An integrative review and re- search directions, Production and Operations Management 12, 79 (2003). [15] B. L. Bayus and W. P. Putsis, Product proliferation: An empirical analysis of product line determinants and market outcomes, Marketing Science 18, 137 (1999). [16] M. Bertini, L. Wathieu, and S. S. Iyengar, The discriminating consumer: Product proliferation and willingness to pay for quality, Journal of Marketing Research 49, 39 (2012). [17] A. Barroso and M. S. Giarratana, Product proliferation strategies and rm performance: The moderating role of product space complexity, Strategic Management Journal 34, 1435 (2013). [18] W. P. Barnett and J. Freeman, Too much of a good thing? Product proliferation and organizational failure, Organization Science 12, 539 (2001). [19] W. Amaldoss and S. Jain, Reference groups and product line decisions: An experimental investigation of limited editions and product prolif- eration, Management Science 56, 621 (2010). [20] S. Kekre and K. Srinivasan, Broader product line: A necessity to achieve success? Management Science 36, 1216 (1990). [21] M. S. Giarratana and A. Fosfuri, Product strategies and survival in Schumpeterian environments: Evidence from the security software industry, Organization Studies 28, 909 (2007). References 65

[22] Y. M. Zhou and X. Wan, Product variety, sourcing complexity, and the bottleneck of coordination, Strategic Management Journal 38, 1569 (2016). [23] J. P. Eggers, All experience is not created equal: Learning, adapting, and focusing in product portfolio management, Strategic Management 2 Journal 33, 315 (2012). [24] R. S. Raubitschek, A model of product proliferation with multiproduct rms, Journal of Industrial Economics 35, 269 (1987). [25] S. Bhatt, Strategic product choice in dierentiated markets, Journal of Industrial Economics 36, 207 (1987). [26] R. J. Gilbert and C. Matutes, Product line rivalry with brand dierentia- tion, Journal of Industrial Economics 41, 223 (1993). [27] O. M. Khessina and G. R. Carroll, Product demography of de novo and de alio rms in the optical disk drive industry, 1983–1999, Organization Science 19, 25 (2008). [28] A. Barroso, M. S. Giarratana, S. Reis, and O. J. Sorenson, Crowding, satiation, and saturation: The days of television series’ lives, Strategic Management Journal 37, 565 (2016). [29] G. Carnabuci, E. Operti, and B. Kovács, The categorical imperative and structural reproduction: Dynamics of technological entry in the semiconductor industry, Organization Science 26, 1734 (2015). [30] B. Kovács and M. T. Hannan, The consequences of category spanning depend on contrast, in Categories in Markets: Origins and Evolution, Research in the Sociology of Organizations, Vol. 31, edited by G. Hsu, G. Negro, and Ö. Koçak (Emerald Group, Bingley, United Kingdom, 2010) pp. 175–201. [31] B. Kovács and R. Johnson, Contrasting alternative explanations for the consequences of category spanning: A study of restaurant reviews and menus in San Francisco, Strategic Organization 12, 7 (2014). [32] G. Negro, M. T. Hannan, and H. Rao, Categorical contrast and audience appeal: Niche width and critical success in winemaking, Industrial and Corporate Change 19, 1397 (2010). [33] B. Levitt and J. G. March, Organizational learning, Annual Review of Sociology 14, 319 (1988). [34] P. S. McCarthy, P. K. Kannan, R. Chandrasekharan, and G. P. Wright, Estimating loyalty and switching with an application to the automobile market, Management Science 38, 1371 (1992). [35] G. Punj and J. Moon, Positioning options for achieving brand associ- ation: A psychology categorization framework, Journal of Business Research 55, 275 (2002). 66 Categorization and Strategic Deterrence

[36] J.-P. Vergne and T. Wry, Categorizing categorization research: Review, integration, and future directions, Journal of Management Studies 51, 56 (2014). [37] A. Karnani and B. Wernerfelt, Multiple point competition, Strategic 2 Management Journal 6, 87 (1985). [38] O. J. Sorenson, Letting the market work for you: An evolutionary per- spective on product strategy, Strategic Management Journal 21, 577 (2000). [39] P. D. Lopes, Innovation and diversity in the popular music industry, 1969 to 1990, American Sociological Review 57, 56 (1992). [40] R. A. Peterson and D. G. Berger, Measuring industry concentration, diversity, and innovation in popular music, American Sociological Review 61, 175 (1996). [41] G. Tzanetakis and P. R. Cook, Musical genre classication of audio signals, IEEE Transactions on Speech and Audio Processing 10, 293 (2002). [42] M. Huygens, F. A. J. van den Bosch, H. W. Volberda, and C. Baden-Fuller, Co-evolution of rm capabilities and industry competition: Investigat- ing the music industry, 1877-1997, Organization Studies 22, 971 (2001). [43] M. J. Benner and J. Waldfogel, The song remains the same? Technolog- ical change and positioning in the recorded music industry, Strategy Science 1, 129 (2016). [44] Billboard, US recording industry 2015: Streams double, Adele dom- inates, https://www.billboard.com/articles/business/6835216/us- recording-industry-2015-streams-double-adele-dominates- nielsen-music (2016), accessed April 6, 2017. [45] A. Goldberg, M. T. Hannan, and B. Kovács, What does it mean to span cultural boundaries? Variety and atypicality in cultural consumption, American Sociological Review 81, 215 (2016). [46] H. Hotelling, Stability in competition, Economic Journal 39, 41 (1929). [47] K. J. Lancaster, A new approach to consumer theory, Journal of Political Economy 74, 132 (1966). [48] K. J. Lancaster, Consumer Demand: A New Approach (Columbia Univer- sity Press, New York, NY, 1971). [49] K. J. Lancaster, Variety, Equity, and Eciency: Product Variety in an Industrial Society (Columbia University Press, New York City, NY, 1979). [50] G. Le Mens, M. T. Hannan, and L. Pólos, Organizational obsolescence, drifting tastes, and age dependence in organizational life chances, Organization Science 26, 550 (2015). [51] R. E. Kennedy, Strategy fads and competitive convergence: An empirical References 67

test for herd behavior in prime-time television programming, Journal of Industrial Economics 50, 57 (2002). [52] A. Okabe, B. Boots, K. Sugihara, and S. N. Chiu, Spatial Tessellations: Concepts and Applications of Voronoi Diagrams (John Wiley & Sons, Hoboken, NJ, 1992). 2 [53] R. F. Lanzillotti, Multiple products and oligopoly strategy: A devel- opment of Chamberlain’s theory of products, Quarterly Journal of Economics 68, 461 (1954). [54] A. van Witteloostuijn and C. Boone, A resource-based theory of market structure and organizational form, Academy of Management Review 31, 409 (2006). [55] C. Boone, A. van Witteloostuijn, and G. R. Carroll, Resource distribu- tion and market partitioning: Dutch daily newspapers, 1968 to 1994, American Sociological Review 67, 408 (2002). [56] G. R. Carroll, Concentration and specialization: Dynamics of niche width in populations of organizations, American Journal of Sociology 90, 1262 (1985). [57] G. L. Péli and B. Nooteboom, Market partitioning and the geometry of the resource space, American Journal of Sociology 104, 1132 (1999). [58] G. R. Carroll and A. Swaminathan, Why the microbrewery movement? Organizational dynamics of resource partitioning in the U.S. brewing industry, American Journal of Sociology 106, 715 (2000). [59] B. Wernerfelt and A. Karnani, Competitive strategy under uncertainty, Strategic Management Journal 8, 187 (1987). [60] G. Le Mens, M. T. Hannan, and L. Pólos, Founding conditions, learning, and organizational life chances: Age dependence revisited, Adminis- trative Science Quarterly 56, 95 (2011). [61] D. A. Levinthal, Adaptation on rugged landscapes, Management Sci- ence 43, 934 (1997). [62] A. Fiegenbaum and H. Thomas, Strategic groups as reference groups: Theory, modeling, and empirical examination of industry and compet- itive strategy, Strategic Management Journal 16, 461 (1995). [63] J. F. Porac, H. Thomas, and C. Baden-Fuller, Competitive groups as cognitive communities: The case of Scottish knitwear manufacturers revisited, Journal of Management Studies 48, 646 (2011). [64] S. D. Dobrev, T.-Y. Kim, and M. T. Hannan, Dynamics of niche width and resource partitioning, American Journal of Sociology 106, 1299 (2001). [65] J. F. Porac and H. Thomas, Taxonomic mental models in competitor denition, Academy of Management Review 15, 224 (1990). [66] A. Fiegenbaum, S. Hart, and D. Schendel, Strategic reference point 68 Categorization and Strategic Deterrence

theory, Strategic Management Journal 17, 219 (1996). [67] N. P. Mark, Birds of a feather sing together, Social Forces 77, 453 (1998). [68] N. P. Mark, Culture and competition: Homophily and distancing ex- planations for cultural niches, American Sociological Review 68, 319 2 (2003). [69] J. Sutton, Technology and Market Structure: Theory and History (MIT Press, Cambridge, MA, 1998). [70] S. Klepper and P. Thompson, Submarkets and the evolution of market structure, RAND Journal of Economics 37, 861 (2006). [71] M. T. Hannan and J. Freeman, The population ecology of organizations, American Journal of Sociology 82, 929 (1977). [72] S. D. Dobrev, T.-Y. Kim, and G. R. Carroll, The evolution of organizational niches: U.S. automobile manufacturers, 1885–1981, Administrative Sci- ence Quarterly 47, 233 (2002). [73] M. T. Hannan, G. R. Carroll, and L. Pólos, The organizational niche, Sociological Theory 21, 309 (2003). [74] E. G. Pontikes, Two sides of the same coin: How ambiguous classica- tion aects multiple audiences’ evaluations, Administrative Science Quarterly 57, 81 (2012). [75] M. T. Hannan, Partiality of memberships in categories and audiences, Annual Review of Sociology 36, 159 (2010). [76] G. R. Carroll, L. S. Bigelow, M.-D. L. Seidel, and L. B. Tsai, The fates of de novo and de alio producers in the American automobile industry, 1885–1981, Strategic Management Journal 17, 117 (1996). [77] A. P. Jacquemin and C. H. Berry, Entropy measure of diversication and corporate growth, Journal of Industrial Economics 27, 359 (1979). [78] J. W. Rivkin, Imitation of complex strategies, Management Science 46, 824 (2000). [79] O. J. Sorenson, The Complexity Catastrophe in the Computer Industry: Interdependence and Adaptability in Organizational Evolution, Ph.D. thesis, Stanford University (1997). [80] H. A. Simon, A behavioral model of rational choice, Quarterly Journal of Economics 69, 99 (1955). [81] S. M. Shugan, The cost of thinking, Journal of Consumer Research 7, 99 (1980). [82] E. H. Rosch, Principles of categorization, in Cognition and Catego- rization, edited by E. H. Rosch and B. B. Lloyd (Lawrence Erlbaum Associates, Hillsdale, NJ, 1978) pp. 27–48. [83] M. Ruef and K. Patterson, Credit and classication: The impact of industry boundaries in nineteenth-century America, Administrative References 69

Science Quarterly 54, 486 (2009). [84] M. A. Glynn and M. Lounsbury, From the critics’ corner: Logic blending, discursive change, and authenticity in a cultural production system, Journal of Management Studies 42, 1031 (2005). [85] G. Hsu, P. W. Roberts, and A. Swaminathan, Evaluative schemas and 2 the mediating role of critics, Organization Science 23, 83 (2012). [86] M. Khaire, The importance of being independent: The role of inter- mediaries in creating market categories, in From Categories to Cate- gorization: Studies in Sociology, Organizations, and Strategy at the Crossroads, Research in the Sociology of Organizations, Vol. 51, edited by R. Durand, N. Granqvist, and A. Tyllström (Emerald Group, Bingley, United Kingdom, 2017) pp. 259–293. [87] G. Negro and M. D. Leung, “Actual” and perceptual eects of category spanning, Organization Science 24, 684 (2013). [88] J.-P. Ferguson and G. Carnabuci, Risky recombinations: Institutional gatekeeping in the innovation process, Organization Science 28, 133 (2017). [89] E. W. Zuckerman, Focusing the corporate product: Securities analysts and de-diversication, Administrative Science Quarterly 45, 591 (2000). [90] R. P. Rumelt, Diversication strategy and protability, Strategic Man- agement Journal 4, 359 (1982). [91] S. D. Dobrev, Competing in the looking-glass market: Imitation, re- sources, and crowding, Strategic Management Journal 28, 1267 (2007). [92] G. Cattani, J. F. Porac, and H. Thomas, Categories and competition, Strategic Management Journal 38, 64 (2017). [93] J. A. Rosa, J. F. Porac, J. Runser-Spanjol, and M. S. Saxon, Sociocognitive dynamics in a product market, Journal of Marketing 63, 64 (1999). [94] J. R. Hauser, O. Toubia, T. Evgeniou, R. Befurt, and D. Dzyabura, Dis- junctions of conjunctions, cognitive simplicity, and consideration sets, Journal of Marketing Research 47, 485 (2010). [95] M. G. Devetag, From utilities to mental models: A critical survey on deci- sion rules and cognition in consumer choice, Industrial and Corporate Change 8, 289 (1999). [96] Ö. Koçak, M. T. Hannan, and G. Hsu, Emergence of market orders: Audience interaction and vanguard inuence, Organization Studies 35, 765 (2014). [97] J. Jacoby, D. E. Speller, and C. A. Kohn, Brand choice behavior as a function of information load, Journal of Marketing Research 11, 63 (1974). [98] J. G. Helgeson and M. L. Ursic, Information load, cost/benet assess- 70 Categorization and Strategic Deterrence

ment, and decision strategy variability, Journal of the Academy of Marketing Science 21, 13 (1993). [99] P. J. Lenk, W. S. DeSarbo, P. E. Green, and M. R. Young, Hierarchical Bayes conjoint analysis: Recovery of partworth heterogeneity from 2 reduced experimental designs, Marketing Science 15, 173 (1996). [100] P. J. Alexander, Entropy and popular culture: Product diversity in the popular music recording industry, American Sociological Review 61, 171 (1996). [101] S. Verheyen, E. Ameel, and G. Storms, Determining the dimensionality in spatial representations of semantic concepts, Behavior Research Methods 39, 427 (2007). [102] J. M. Mol and N. M. Wijnberg, Competition, selection, and rock and roll: The economics of payola and authenticity, Journal of Economic Issues 41, 701 (2007). [103] N. Anand and R. A. Peterson, When market information constitutes elds: Sensemaking of markets in the commercial music industry, Organization Science 11, 270 (2000). [104] R. Karshner, The Music Machine: What Really Goes on in the Music Industry (Nash Pub, Los Angeles, CA, 1971). [105] J. M. Mol, N. M. Wijnberg, and C. Carroll, Value chain envy: Explain- ing new entry and vertical integration in popular music, Journal of Management Studies 42, 251 (2005). [106] S. Bhattacharjee, R. D. Gopal, K. Lertwachara, J. R. Marsden, and R. Telang, The eect of digital sharing technologies on music markets: A survival analysis of albums on ranking charts, Management Science 53, 1359 (2007). [107] Digital Music News, Two-thirds of all music sold comes from just three companies, https://www.digitalmusicnews.com/2016/08/03/ two-thirds-music-sales-come-three-major-labels (2016), accessed April 6, 2017. [108] M. J. Salganik, P. S. Dodds, and D. J. Watts, Experimental study of inequality and unpredictability in an articial cultural market, Science 311, 854 (2006). [109] E. T. Bradlow and P. S. Fader, A Bayesian lifetime model for the “Hot 100” Billboard songs, Journal of the American Statistical Association 96, 368 (2001). [110] Billboard, The economy of mixtapes: How Drake, Wiz Khalifa, Big KRIT gured it out, https://www.billboard.com/biz/articles/news/ 1168371/the-economy-of-mixtapes-how-drake-wiz-khalifa-big-krit- gured-it-out (2011), accessed 6 April 2017. References 71

[111] J. C. Lena and R. A. Peterson, Classication as culture: Types and trajectories of music genres, American Sociological Review 73, 697 (2008). [112] BuzzFeed Music, What it’s like when a label won’t release your al- bum, https://www.buzzfeed.com/azafar/what-happens-when-your- 2 favorite-artist-is-legally-unable-to (2013), accessed 6 April 2017. [113] J. Lee, P. Boatwright, and W. A. Kamakura, A Bayesian model for prelaunch sales forecasting of recorded music, Management Science 49, 179 (2003). [114] N. Askin and M. Mauskapf, What makes popular music popular? Prod- uct features and optimal dierentiation in music, American Sociologi- cal Review 82, 910 (2017). [115] M. Montauti and F. C. Wezel, Charting the territory: Recombination as a source of uncertainty for potential entrants, Organization Science 27, 954 (2016). [116] J. B. Kruskal, Nonmetric multidimensional scaling: A numerical method, Psychometrika 29, 115 (1964). [117] I. F. Spellerberg and P. J. Fedor, A tribute to Claude Shannon (1916–2001) and a plea for more rigorous use of species richness, species diversity, and the “Shannon-Wiener” index, Global Ecology and Biogeography 12, 177 (2003). [118] E. G. Pontikes and W. P. Barnett, The persistence of lenient market categories, Organization Science 26, 1415 (2015). [119] B. Kovács and M. T. Hannan, Conceptual spaces and the consequences of category spanning, Sociological Science 2, 252 (2015). [120] R. B. O’Hara and D. J. Kotze, Do not log-transform count data, Methods in Ecology and Evolution 1, 118 (2010). [121] A. C. Cameron and P. K. Trivedi, Regression Analysis of Count Data (Cambridge University Press, New York, NY, 1998). [122] R. W. M. Wedderburn, Quasi-likelihood functions, generalized linear models, and the Gauss-Newton method, Biometrika 61, 439 (1974). [123] J. M. Ver Hoef and P. L. Boveng, Quasi-Poisson vs. negative binomial regression: How should we model overdispersed count data? Ecology 88, 2766 (2007). [124] D. A. Belsley, E. Kuh, and R. E. Welsch, Regression Diagnostics: Identi- fying Inuential Data and Sources of Collinearity (John Wiley & Sons, Hoboken, NJ, 1980). [125] G. Le Mens, M. T. Hannan, and L. Pólos, Age-related structural inertia: A distance-based approach, Organization Science 26, 756 (2015). [126] J. C. Verhaal, O. M. Khessina, and S. D. Dobrev, Oppositional product 72 Categorization and Strategic Deterrence

names, organizational identities, and product appeal, Organization Science 26, 1466 (2015). [127] J. G. Kuilman and F. C. Wezel, Taking o: Category contrast and or- ganizational mortality in the U.K. airline industry, 1919–64, Strategic 2 Organization 11, 56 (2013). [128] G. Negro, M. T. Hannan, and M. Fassiotto, Category signaling and reputation, Organization Science 26, 584 (2015). [129] M. T. Kennedy, Y.-C. Lo, and M. Lounsbury, Category currency: The changing value of conformity as a function of ongoing meaning con- struction, in Categories in Markets: Origins and Evolution, Research in the Sociology of Organizations, Vol. 31, edited by G. Hsu, G. Negro, and Ö. Koçak (Emerald Group, Bingley, United Kingdom, 2010) pp. 369–397. 3 Prototypes, Goals, and Cross-Classification

This chapter is based on a research paper written in collaboration with Nachoem M. Wijnberg. Earlier versions of this paper were presented at the 2016 Annual Meeting of the Academy of Management (Anaheim, CA), the 2016 Colloquium of the European Group for Organizational Studies (Federico II University of Naples), the 2017 Workshop on Logic and Applications (University of Johannesburg), and the 2017 Organizational Ecology Workshop (Carlos III University of Madrid). We are very grateful to Giacomo Negro and Michael T. Hannan for extensive comments and insightful discussions.

73 74 Prototypes, Goals, and Cross-Classification

3.1. Motivation Classication systems are ordered structures that organize cognitive do- mains by sorting objects into relatively homogeneous groups, or categories [1]. Rational agents in a market defer to these systems when making deci- sions about products and organizations: in some contexts, the categories they use are institutionalized and formally enforced [2], whereas in others they arise informally from communication and discourse [3, 4]. Either way, 3 the category labels assigned to an object aect consumers’ perception of value because they encode information that is relevant to economic decision-making and costly to obtain otherwise. This information can be especially critical in markets characterized by uncertainty, such as the cre- ative [5] or technological industries [6], where consumers may be forced to choose among products with partly obscure characteristics [cf. 7]. In these circumstances, categories ease economic decisions by inducing ex- pectations and default beliefs [8]: in doing so, they reduce information asymmetries and enable appropriate comparisons. Sociologists showed that consumers heavily rely on classication sys- tems when predicting if a product ts their taste [9], and because consumers are generally risk-averse, they tend to devalue objects with ambiguous cat- egory labels [4, 10]. This normally occurs when a product is assigned to multiple categories, as dierent labels convey partial or conicting infor- mation [11]. Precisely for this reason, products that straddle the boundaries of categories have been found to receive lower evaluations by audiences regardless of their actual quality [12, 13]. This “multiple-category discount” is consistent with cognitive-psychological insights on categorization, ac- cording to which categories coalesce around specic reference points in a cognitive space [14]. In the eyes of consumers, the objects that resemble these points are familiar and easy to process [cf. 15]; deviant products, in- stead, tend to be cognitively taxing. In accord with economic theory [16], the uncertainty that consumers incur when attempting to derive information about multi-categorical products engenders a value discount. This negative perspective on category spanning has long dominated organizational research. From the markets for creative goods [17] to those for technology [18], labor [19, 20] and capital [21], rms have been advised to focus themselves and their oerings in order to avoid being perceived as mists. Yet many studies oered examples of products, organizations, or job candidates that span category boundaries—and succeed [22–25]. How can this evidence be reconciled with a constraining view of category labels? To answer this question, some scholars invoked the moderating role of producer-level constructs, such as status [26], identity [27], and Motivation 75 tenure in the industry [20]. Others appealed to category properties like contrast [28], leniency [4], and similarity [29]. In this chapter, we examine an alternative explanation, namely that dierent classication systems can be simultaneously relevant to evaluators, so that products belonging to multiple categories according to one system can appear less ambiguous as a result of the information one derives from another. Such an instance of cross-classication [30] in multiple category sys- tems can easily occur in markets where products are categorized according 3 to the goals they fulll as well as to the prototypes they resemble [31]. For example, foods are normally classied in a type-based system that includes grains, meats, fruit, dairy, and vegetables, but they can also be categorized in a system that comprises such labels as functional foods [32], foods to eat while driving [33], and foods to serve at a party [34]. Similar considerations apply to the categorization of rms: high-technology ventures, for example, are commonly sorted by investors based on the type of inventions they patent, but also (and perhaps more importantly) based on whether they pursue commercial product applications vs. funda- mental science [35]. Consistency within a goal-based classication system can aect the evaluations received by category spanners: investors who are interested in high-tech start-ups, for instance, may not care that the technology is atypical as long as it can lead to marketable products. Unlike categories based on prototypes, categories based on goals are seldom institutionalized and can be dicult to observe empirically. Unsur- prisingly, most studies on category spanning so far only considered catego- rization from a prototype-centered view, which is not always adequate [36]. Organization scholars occasionally considered goal-based categories, but the few extant studies that reect this broader perspective are limited in that they only examined evaluations by specic audience members who could be assumed to have particular goals [23, 24]. In this chapter, we adopt a more general approach and analyze product evaluations in a setting where both type- and goal-based categories are explicit and observable. This allows us to examine how products’ positioning according to distinct classication systems aects their evaluations by a heterogeneous audi- ence. Our analysis contributes to the organizational literature by showing that these dierent systems jointly contribute to sensemaking and to the formulation of a value order. In particular, we show that multi-categorical products can be easier to make sense of and receive higher evaluations either when goal-based category labels indicate goal-consistency, or when the products span such a large number of type-based categories that only a few combinations of features are possible. 76 Prototypes, Goals, and Cross-Classification

To test our predictions, we analyze product ratings on AllMusic.com, a popular online platform where consumers can browse, sample, and evaluate thousands of music records. In this context, the type-based classication system corresponds to a well-established genre taxonomy where each membership implies conformity to a particular aesthetic canon [37]. The goal-based system, instead, revolves around the emotions or moods evoked by the music [38]. Mood categories have become increasingly important in 3 the music business after digitization [39–41], as consumers increasingly use them to search for new products [42] and streaming services use them to provide goal-directed playlists [43]. On AllMusic, each record can span genre and mood categories independently: some products, like Morrissey’s You Are the Quarry, have only two genre labels (pop/rock and R&B) but a large number of moods (including humorous, gloomy, intimate, and bittersweet); others, like Deep Forest’s Boheme, have only one mood (hypnotic) but span several genres (electronic, pop/rock, and new age). We exploit this variance to isolate the eects of spanning in one or both systems. This chapter is structed as follows: In Section 3.2, we review the cognitive- psychological literature that underpins our theory and present a geometric model of consumers’ cognition. We explain how type-based categories and goal-based categories map to structurally dierent regions of this cognitive space, and why such dierent internal structures lead to dierent eects on evaluations for products with multiple labels. In addition, we theorize the eects of spanning type- and goal-based categories simultaneously. Section 3.3 presents the setting of our observational study, as well as our data and statistical methods. Section 3.4 reports our empirical ndings, and Section 3.5 discusses their implications for organizational research.

3.2. Theory and Hypotheses 3.2.1. The Feature Space Although from a psychological viewpoint categories aect decision-making by ltering information and helping agents prevent an overwhelming bar- rage of stimuli [1], in economic contexts they often serve a dierent purpose, namely compensating for the paucity of information by encoding facts about products that would be costly to obtain through dierent means [e.g., 6]. By virtue of this role, organization theorists termed categories the “cognitive infrastructures” of markets [44, p. 255]: this denition is appropri- ate not only because it captures the profound inuence of categorization on human perception [45], but also because it (correctly) suggests that category labels hardly exist in a vacuum; instead, they arise in broader, Theory and Hypotheses 77 ordered structures, which we refer to as classication systems. Examples of such systems that are familiar to organization scholars include the R. G. Dun rating schema [2], the USPTO patent classes [18, 35, 46], the varietal cat- egories of wines [47, 48] and cuisines [49], feature lm genres [17, 20, 22, 27], and standard industrial codes [50]. Objects that belong to more than one category within the same classication system, as in the case of patents led under dierent USPTO classes [18] or movies assigned by critics to dierent genres [22], are termed category spanners. 3 Understanding the eects of category spanning is key to fully appraising the role that categories play in the ordering of markets. In a concerted eort to clarify these eects, previous studies analyzed the relationship between category spanning and consumers’ evaluations in a variety of empirical setting, reporting consistent evidence of a negative relationship [11]. Early studies explained this phenomenon by invoking the ecological principle of allocation: because producers have nite resources, trying to appeal to dierent audience groups causes them to have a lower t with the tastes of each group on average [17]. Subsequent research rened this explanation by acknowledging that perceptual eects are simultaneously at play [10, 12, 13]: from this angle, multiple category labels negatively aect evaluations because they induce ambiguity and increase audiences’ risk of forming erroneous expectations about products. If the audience members are risk-averse, as is normally the case of consumers [4], they are likely to react to this uncertainty by ascribing lower value to the products [16]. This perspective is consistent with seminal research on categories in cognitive psychology, according to which category labels function as cues that people can interpret to retrieve and possibly combine [51–53] informa- tion previously stored in their minds. Building directly on prototype theory [54] and the related literature on conceptual spaces [55, 56], sociologists recently proposed a geometric model of the market as a multidimensional space where each category maps to a particular region and where pro- totypes act as shared reference points [9]. The location of products in this cognitive space depends on their features, of which category labels can oer an inkling [57]. If there is only one (type-based) category label, meaning that the product is highly typical of a certain category, consumers can easily surmise its location thanks to their familiarity with the category prototype. The presence of multiple labels, however, tends to undermine this process by signaling that the product’s position is intermediate [29] and thus harder to predict [58]. This can vex evaluators, who need clear information about products’ features to estimate their worth. Clear-cut memberships help making these features known to consumers ex ante. 78 Prototypes, Goals, and Cross-Classification

This model of the market as a space of product features has illustrious precedents in economics [59–63] and can be considered akin to a Lancas- trian or locational-analog model of demand (cf. Chapter2). In this chapter, we aim to address one of the current limitations of this geometric approach: inasmuch as this model is only used to conceptualize categories that re- volve around prototypes, it can misrepresent real-world domains where other classication systems exist and are simultaneously relevant to con- 3 sumers [30]. In fact, the limitations of prototype theory and the relevance of categories that do not possess a prototypical structure have long been acknowledged in psychology [36, 64] and consumer research [65]: here, we consider the implications of categories based on consumers’ goals, which are common in many product markets [31, 32] but do not necessarily hinge on prototypical representations [66]. While type-based categories encode the products’ conformity to some cognitive reference point [14]—in other words, they signal proximity to a prototype in the feature space—goal-based categories express suitability for particular purposes and cannot be used to infer a distance. In point of fact, such categories dier from their type-based counterparts precisely because central tendency is not a good predictor of category membership [67]: the same label can be equally applicable to very dissimilar (i.e., distant) objects [68]. This major distinction notwithstanding, goal-based categories encode information that helps consumers determine a product’s utility [33], and because consumer behavior is often goal-directed [69], their impact on product evaluations is likely to be vast. At present, this impact is largely un- accounted for in organizational theories. We ll this gap in extant research by extending the spatial model above so as to account for goal-based categories in a geometric fashion. We use this amended model to derive testable predictions about the eects of spanning categories in a type- based classication system, a goal-based classication system, or both. Moreover, we highlight this model gives rise to an original parsimonious explanation as to why products that span type-based categories can receive better evaluations by audiences, all else being equal. Our arguments pivot on the elementary notion, widely acknowledged in economics since the work of Herbert Simon [70], that “anticipating future consequences of present decisions is often subject to substantial error” [71, p. 589]. In our case, the future consequences relate to the utility derived from a product that consumers know little about before consumption [72]. One of the main reasons why category labels inuence economic behavior is that they allow agents in a market to make better predictions vis-à-vis a product’s value: in the spatial model we co-opt from sociological research, Theory and Hypotheses 79 this prediction involves pinpointing the product’s location in feature space with reasonable accuracy. Compatibly with Akerlof’s classic argument about uncertainty in markets [16], some of the variance in consumers’ disposi- tion toward products depends on how easily they manage to solve this informational problem [6]: If the product has a single type-based category label, meaning that it closely resembles a specic prototype, the solu- tion to this problem is relatively straightforward because prototypes are already familiar to consumers. Yet, under particular circumstances it is 3 possible to derive the product’s location with reasonable accuracy even if the product has multiple type-based category labels, which may give rise to a nonmonotonic eect on evaluations. This eect is impossible in the case of goal-based categories because of fundamental dierences in these categories’ internal structures [cf. 67]. The objective of our study is to provide empirical evidence of this structural dierence. In the next section, we detail the cognitive mechanism through which consumers can infer a product’s features by combining information from multiple prototypes. While this mechanism is justied by the geometric nature of the feature space, it requires the following assumptions about evaluators’ knowledge and behavior: (a) Evaluators must know at least some of the prototypes associated with the products they evaluate. Notice that they are not required to know all the prototypes in the domain at hand, nor are they required to know all the prototypes associated with the product they consider. It suces to know only some. (b) Evaluators must able to tell how far a product can be from a prototype and still be considered a member of the corresponding type-based category. In other words, they must know the category’s contrast [11]. (c) Evaluators must be able and willing to combine information from multiple categories in order to make better inferences about products’ features. To this purpose, they must not resort to simpler strategies, like defaulting to the most salient category label. Assumptions (a) and (b) imply a minimal level of familiarity with the cog- nitive domain: in our case, the market where the products belong. In this sense, the mechanism we propose is only available to an audience that knows at least some of the categories. Assumption (c) implies that evalua- tors engage in multiple-category induction [73]. Extant research in cognitive psychology suggests that this assumption is reasonable, especially when the category labels are listed explicitly [74, 75]. 80 Prototypes, Goals, and Cross-Classification

It is worth remarking explicitly that the primary objective of this chapter is not to identify an empirical relationship that occurs in settings where these assumptions hold true. The assumptions above are restrictive and do not hold in most markets. Our aim is rather to oer empirical proof of a key distinction between type- and goal-based categories, which is expected to manifest itself through a dierential eect on consumers’ evaluations under the aforementioned conditions. The violation of these assumptions does 3 not imply that the distinction wanes or that its implications for evaluators’ uncertainty become negligible; only that there may not be a similar eect on the value consumers attribute to products.

3.2.2. Atypicality and its Consequences We rst consider the consequences of spanning categories that encode family resemblance. Consistently with previous research [e.g., 9], we charac- terize the market as a D-dimensional space where D is the total number of features f1 ... fD that distinguish the products in the eyes of consumers. The value of D represents the space’s dimensionality. Each product occu- pies a point or position in this feature space, which is uniquely identied ì by a vector f = hf1 ... fD i, termed feature prole [57]. Consumers do not necessarily know the feature proles of the products they evaluate, but they are required to infer them with reasonable accuracy in order to compute the products’ distance from their tastes [63]. For the sake of illustration, consider a simple market where products are distinguishable by as little as four features, as in the case of balloons that dier by the color of four quadrants on their surface (D = 4). Without loss of generality, suppose that the color of each quadrant can be either blue (f = 0) or orange (f = 1), thus giving rise to a space of 16 positions, as presented in Figure 3.1. Each node in this graph corresponds to a unique feature prole, and the edges connect proles that dier by only one value, meaning that they occupy adjacent positions in the feature space. Suppose that orange balloons were a relevant type-based category in ì this context, with prototype f1 = h1 1 1 1i. To emphasize the centrality of this prototype in category generation, we refer to orange balloons in notation ì as type(f1). If products were required to have exactly four orange quadrants ì in order to be considered members of type(f1), then the category would have very high contrast, meaning that it would be easy to tell whether a ì product belonged to the category or not [11]. If the members of type(f1) were allowed to have one blue quadrant, instead, contrast would be lower ì and the category would also encompass f2–5. As a result, its boundaries would be fuzzier [cf. 17] and membership would no longer be clear-cut. Theory and Hypotheses 81

f~1

f~2 f~3 f~4 f~5

3 f~6 f~7 f~8 f~9 f~10 f~11

f~12 f~13 f~14 f~15

f~16

Figure 3.1: A four-dimensional space of binary features

Irrespective of contrast, the positions encompassed by a type-based category are always clustered in a convex region of the feature space [76]. Convexity is a geometric property implying that, for any two non-adjacent points that belong to the category, there are positions in-between these two points that also belong to the category. In a Euclidean space, for exam- ple, shapes like circles, squares, and triangles are convex, but those that have indents or holes such as stars, lunes, and annuli are not. In practice, convexity implies that consumers can trace the location of any category member to a subset of adjacent and therefore similar feature proles. The geometric mean of these positions (i.e., the center of the region) corre- sponds to the category prototype. Familiarity with this geometric center is the reason why consumers can derive information about products with a single category membership with relative ease. This process is all the more accurate if the category has higher contrast, because high-contrast categories tolerate smaller deviations from their centroids. As a result, products (and organizations) in high-contrast categories tend to receive better evaluations by their audiences [77]. The risk of misjudging a product’s position can be higher if the product has multiple type-based category labels. This is because consumers infer 82 Prototypes, Goals, and Cross-Classification

that the product imperfectly resembles several prototype; however, they do not know the extent to which it resembles each—and even if they did, there could be many positions in the feature space that are equally consistent with this information. For example, suppose that a second type-based ì category existed in our market for balloons, with prototype f15 = h0 1 0 0i. ì ì If consumers knew that a product belonged to type(f1) and type(f15), they would not be able to locate its position with the same accuracy as the pre- vious scenario because multiple proles are compatible with these partial 3 ì ì cues. Indeed, f3–5 and f9–11 are all located between the two prototypes, hence they are equally plausible in the eyes of a rational agent [75]. Intu- ì ì itively, this means that the features associated with type(f1) and type(f15) can be combined in multiple ways that are equally compatible with both category labels, but in lack of any additional information, the evaluator is unable to tell which combination is correct. In this sense, our model agrees with extant research in organization theory [e.g., 10]: spanning type-based categories renders products’ features ambiguous. Our model diverges from conventional wisdom, however, in that this uncertainty-increasing eect is expected to reverse if the number of type- based category labels assigned to the product is suciently large. This is because more and more labels make fewer and fewer options appear plausi- ble in the eyes of consumers. For example, suppose that a third type-based ì category existed in our ctitious market, with prototype f6 = h1 0 0 1i. If ì ì consumers knew that the product in question belonged to type(f1), type(f15), ì and type(f6), the range of possible solutions would be restricted to only one ì ì feature prole, namely f3, because this is not only located between f1 and ì ì f15, but also closer to f6. The process whereby consumers selectively com- bine the information from dierent category labels follows the set-theoretic rules that Hampton [51] referred to as the “necessity” and “impossibility” of feature inheritance. These imply that a product belonging to multiple cate- gories must have some of the features of each category prototype, but this combination cannot be arbitrary because some of the features are incom- patible. Given a sucient amount of prototypes, consumers may eventually obtain a unique solution to this informational problem. As the number of prototypes (and hence of type-based category labels) approaches this critical threshold, evaluators’ uncertainty about the features a product actually possesses should decrease. From a geometric standpoint, this cognitive mechanism is comparable to the geometric process of trilateration of a variable point on a Euclidean plane: given the distance from two other xed points (the prototypes), multiple solutions are possible, but knowing the distance from another Theory and Hypotheses 83

xed point is sucient to single one out. In our model, type-based category labels convey information about distances: more specically, they encode that the distance between a product and the category prototype is smaller than some minimal threshold required for category membership. General- izing the trilateration example to a space of D > 2 dimensions, more than three distances can be required to derive the coordinates of a variable point; however—and this is the gist of our argument—the minimum number of labels required for a single solution to be determinable is constrained 3 by the space’s dimensionality. Formally, this is because the coordinates of a variable point in an D-dimensional Euclidean space can be determined by solving a system of D + 1 equations (assuming that the point exists):

d (x, y )2 = (f − f )2 + ... + (f − f )2  1 1x 1y1 Dx D y1 . . , (3.1)  2 2 2 d(x, yi ) = (f1x − f1yi ) + ... + (fDx − fD yi )  where d is a Euclidean distance, f1 ... fD are the coordinates (features) of a point in the space, x is the variable point (the product), yi is the i -th xed point (the i -th prototype), and i = 1 ... D +1. A cognate way to conceptualize this problem is by picturing the boundaries of each type-based category as a D-sphere centered on the prototype. The intersection of D − 1 D-spheres (e.g., two regular spheres in a three-dimensional space) is a circle. Hence, any product to which the D − 1 type-based category labels are applied ought to be in this circle. Given an additional D-sphere, the intersection reduces to two points, and given yet another, it reduces to only one point. Informally, this means that as consumers are provided with more and more type-based category labels they see fewer and fewer ways in which the features associated with membership in these categories can be combined in a way that remains consistent all the category labels. Our conjecture is that, if evaluators are given a number of labels that exceeds the space’s dimensionality, the feature prole of a product can be determined with the same level of accuracy as if there were only one label, although this requires some cognitive eort. Because human beings are unable to process spaces of much higher dimensionality than our ctitious example [78], we expect D + 1 to be a relatively low number. It follows that, in markets that are suciently diverse for products to span a large number of type-based categories, some products may span enough for this cognitive mechanism to produce its eect. Therefore, we should be able to witness a relative increase in the average evaluation as the number of type-based labels approaches D + 1. Notice that this does not mean that 84 Prototypes, Goals, and Cross-Classification

the evaluations received by very atypicals product will be ever as high as those received by very typical ones, because on the one hand it remains the case that category spanning decreases a product’s t with consumers’ tastes [17], and on the other hand, it is likely that consumers will devalue atypical products for the cognitive eort they require. Still, evaluators should incur lower uncertainty when they are able to derive the products’ features with a greater level of accuracy, and this should be sucient to 3 cause an inection in the relationship between the number of type-based categories spanned by a product and its average evaluation. We thus predict that the relationship between a product’s number of type-based category labels and its average evaluation by consumers is U-shaped. Thus, we expect products with a single type-based label to be evaluated best, but products with a suciently large number of labels should be evaluated better than those with an intermediate number. One can think of this relationship as the result of the sum of two cost functions [cf. 79]. First, evaluations reect the “actual” [13] cost of having a lower t with consumers’ preferences: this function is monotonic because the more a product deviates from the prototypes, the less its average t [17]. Second, evaluations reect the perceptual cost of ambiguity: this function is non-monotonic because ambiguity is minimal either when the number of categories is one or when it approaches D +1. The sum of the two functions results in a U-shaped eect on consumers’ evaluations. Therefore: Hypothesis 3.1. The relationship between a product’s number of type-based category labels and its average evaluation by consumers is U-shaped.

3.2.3. Suitability for Multiple Goals We now turn our attention to the consequences of spanning categories that revolve around goals as opposed to prototypes. As mentioned before, the crucial distinction between type- and goal-based categories is that the goodness of membership in a goal-based category does not increase with centrality or resemblance to a prototype [67]. Given a set of objects that belong to the same goal-based category, computing an average of the objects’ features will not necessarily result in an object that is consistent with membership in the goal-based category, let alone one that may be considered its best representative. This is because the suitability of an object for a particular purpose depends on its features but not on its distance from a specic reference point [cf. 14]: in Barsalou’s classic example [80], the category of things to take from one’s home during a fire can include as diverse members as dogs, children, and blankets. Likewise, the category of foods to eat while driving [33] can include relatively dissimilar Theory and Hypotheses 85 objects like apples and granola bars. It does not include oranges, though these are closer to apples in feature space than apples are to granola bars. Even if they are unrelated to prototypes, the members of a particular goal-based category are likely to have some features in common: the things one may salvage from home during a re, for instance, tend to be either moving or easily movable, and the things one nds convenient to eat at the wheel do not usually require peeling. Such regularities exist not because the category members resemble some central instantiation of a concept, 3 but because particular features or combinations thereof are instrumental to achieving certain goals. For this reason, two nearly identical objects will most likely belong to the same goal-based category, but objects in the same goal-based category are not necessarily similar to one another. This asymmetry has an important implication for categorization in markets: relatively dissimilar products can belong to the same goal-based category as long as they help consumers address the same need [65, 68]. They would remain members of this particular category even if they diered in every other respect. Arguably, this perspective explains demand elasticities that would hardly make sense from a prototype-centered view [81]. In our geometric model, this key distinction between type- and goal- based categories implies that goal-based categories do not necessarily map to convex regions of the feature space. To illustrate this argument, consider again the four-dimensional market presented in Figure 3.1. Suppose that a certain goal were relevant for consumers in this context that required products to fulll a specic condition, namely that the two left-hand quad- rants on its surface have exactly the same color (f2 = f3). Because it is a goal rather than a prototype serving as the engine of categorization, we 1 refer to the category in notation as goal(f2 = f3). Eight feature proles ì ì ì are consistent with membership in this category, including f1, f4–6, f11–13, ì and f16. However, these are not necessarily separated by proles that are themselves included in the category, as would be required by convexity. The absence of convexity aects the information consumers can derive from products’ category labels: if all they know about a product is that it is a member of goal(f2 = f3), then they are unable to pinpoint its location in the feature space as accurately as if they had a prototype. They can infer that the two left-hand quadrants have the same color, but they do not know which color it is—the actual coordinates thus remain uncertain. Unlike the case of type-based categories, this ambiguity is not neces- sarily reduced if the product’s number of goal-based category labels is

1Although this notation is slightly cumbersome, we nd it appropriate because it reects the notion that goal-based categories depend on features but not on a feature prole. 86 Prototypes, Goals, and Cross-Classification

suciently high. In fact, no minimum number of labels in a goal-based system guarantees that the product’s position is geometrically derivable due to the absence of reference points. For example, suppose that a sec- ond goal-based category goal(f1 = f4) were relevant for consumers, which required products to have the same color on the two right-hand quadrants. If a product were labeled both goal(f2 = f3) and goal(f1 = f4), its feature ì ì ì ì prole could be any of the following: f1, f6, f11, or f16. In fact, these are the 3 only proles that satisfy both goal-driven conditions. Once again, these positions are not adjacent in the feature space; in other words, they do constitute a convex set. Now suppose that a third goal-based category Í4 existed in this market, namely goal( i =1 fi = 2), which required products to have the same color on no more than two quadrants (whichever they are). Í4 A product labeled goal(f2 = f3), goal(f1 = f4), and goal( i =1 fi = 2) could be ì ì traced to either f6 or f11: again, these positions are not adjacent to each other, and while it is true that having additional labels allows consumers to rule out some options, the number of labels required to obtain a unique solution is not strictly constrained by the space’s dimensionality. Ultimately, this is because goal-based categories do not convey information about distances: therefore, the trilateration-like mechanism described in the previous section is not available to evaluators. With the exception of some limit cases,2 having a greater number of labels in a goal-based system does not decrease consumers’ uncertainty. This argument does not imply that no relationship exists between a product’s number of goal-based category labels and its average evaluations by consumers. As in the case of type-based categories, having multiple category memberships in a goal-based system results in a lower t with individual needs because “producers face technological barriers to serve multiple consumer goals optimally” [68, p. 242]. This is the reason why, for example, high-technology start-ups that pursue fundamental science are deemed unlikely to produce commercial applications [35], and foods considered appropriate for breakfast tend to be inopportune at a dinner party [34]. Consistently with the ecological principle of allocation, mul- tiple goal-based category labels imply that the product is less eective on average at each of the needs it purports to fulll. Because a greater number of labels does not simultaneously contribute to evaluations by way

2Two cases exist whereby consumers can locate a product thanks to a greater number of goal-based labels: (1) if one of the goals is so dicult to fulll that only one position in the feature space is suitable, and (2) if the goals are so dicult to fulll concurrently that only one position is even partly suitable. These are extreme cases and they are unlikely to occur in any complex real-world domain. Theory and Hypotheses 87 of uncertainty reduction, we expect a linear negative eect on consumers’ evaluations. This leads to our second hypothesis: Hypothesis 3.2. The relationship between a product’s number of goal-based category labels and its average evaluation by consumers is linearly negative.

3.2.4. The Effect of Spanning in Different Systems In the previous section, we explained why having multiple goal-based cat- egory labels does not normally contribute to uncertainty reduction. We 3 now turn to explaining why spanning goal-based categories can still have positive consequences for product evaluations. So far, we implicitly as- sumed evaluators to consider products’ category memberships according to either a type- or a goal-based system, but the cornerstone of this chapter is precisely that both classication systems jointly contribute to the valua- tion of products. This is because consumers can derive information from both prototypes and goals in order to pinpoint a product in the feature space. The way in which the information available from these two systems is combined by evaluators deserves additional consideration: the net eect of spanning in both systems is not merely the sum of the two eects. Consider again the four-dimensional example in Figure 3.1: suppose that ì ì a product were labeled type(f1), type(f15), goal(f2 = f3), and goal(f1 = f4). What consumers can infer from these category labels is that, on the one ì ì hand, the product must be located somewhere between f1 and f15, and on the other hand that the product’s features must fulll the conditions f2 = f3 and f1 = f4. Merging this information allows a single feature prole ì to emerge as plausible, namely f11, because this is the only option located between the two prototypes that is simultaneously consistent with the goal- driven requirements. The level of accuracy in prediction that consumers could achieve with three prototypes is now achievable with as little as two. The presence of multiple goal-based labels thus makes additional memberships in a type-based system no longer necessary to extrapolate the product’s coordinates, and the trilateration-like mechanism described in Section 3.2.2 allows evaluators to nd a solution with fewer type-based labels than would be required otherwise. Generalizing to spaces of D > 2 dimensions, the number of goal-based labels required for this mechanism to yield a unique solution is not necessarily as low; nevertheless, agents will be allowed to rule out some options with each additional label. To make a more realistic example, consider the case of a music product spanning the genres jazz and electronic. Suppose that the feature space of music were simple enough that only two features could not be predicted on the basis of these two type-based labels, namely the presence of vocals 88 Prototypes, Goals, and Cross-Classification

and the music’s tempo. Even in this simple case, there are several ways in which a product could span these two categories and still be consistent with the category labels, some of which include vocals whereas others do not, and some of which involve a fast tempo whereas others do not. Further assume for the sake of simplicity that these features are binary. In this case, there are exactly four options: (1) fast with vocals, (2) fast without vocals, (3) slow with vocals, and (4) slow without vocals. Having a third 3 type-based category label, such as ambient or rap, would be sucient to dispel uncertainty one way or another, because the prototype of rap is only consistent with (1), and the prototype of ambient is only consistent with (4).3 However, consumers do not have any additional type-based label: instead, they know that the product is suitable for a particular goal, i.e., studying. What they may infer from this goal-based label is that the product is unlikely to have features that interfere with concentration and other high- order cognitive skills. Vocals tend to be distracting [82], and hence options (1) and (3) appear less probable. The relation between concentration and tempo, however, is far less predictable, so that (2) and (4) are still equally plausible. Suppose that consumers also knew that the product is suitable for exercising. What they can infer from this additional goal-based label is that the product is unlikely to be slow-paced because such music is hardly appropriate to enhance physical exertion [83]. Given this extra piece of information, (2) appears decidedly more likely than any other option. Through this conceptual integration, having more category labels in a goal-based classication system can moderate the negative consequences of spanning type-based categories. Of the two cost functions underlying the U-shaped relationship predicted in Hypothesis 3.1, spanning goal-based categories should not aect the “actual” cost that is due to the allocation principle—this ensues regardless of the products’ number of goal-based category memberships—however, it should mitigate the perceptual cost of ambiguity. By facilitating the identication of a product’s feature prole among a set of alternatives that would be equally plausible in the presence of type-based category labels alone, having more goal-based labels can help consumers predict the features of atypical objects. This can induce a subtle but signicant change in the U-shaped relationship: the curvature should remain the same, meaning that the relative dierence between products with the same number of goal-based category labels but a dierent number of type-based ones is not expected to change; the slope of the

3From the liner notes of Brian Eno’s Ambient 1/Music for Airports 1978 US release: “Ambient music must be able to accomodate many levels of listening attention without enforcing one in particular; it must be as ignorable as it is interesting.” Methodology 89 curve, however, should turn from negative to positive at a comparatively low number of type-based category labels. In other words, the turning point of the U-shape is expected to shift toward the left [cf. 79]. This leads to our third and nal hypothesis:

Hypothesis 3.3. Having more goal-based category labels causes the turning point of the U-shaped relationship between a product’s number of type- based category labels and its average evaluation by consumers to shift toward the left. 3

3.3. Methodology 3.3.1. Empirical Setting We test our three hypotheses on data collected from AllMusic.com, a popu- lar online platform that provides editorial information, user ratings, and category memberships for thousands of music products. The market for recorded music, and especially popular music [84], makes a suitable setting for our empirical analysis because the endemic conditions of oversup- ply and uncertainty make category labels crucial to consumers’ decision- making. Moreover, this context is germane because dierent classication systems are simultaneously relevant to audiences. The most widely known system is the genre taxonomy [85]: as explained in Chapter2, these cate- gories encode resemblance to specic combinations of features [37]. Genre categories play a key role in organizing the music landscape both from a supply- [86] and a demand-side perspective [87]. Because music is con- sumed for the purpose of arming one’s social identity [88], audiences care about products’ conformity to certain aesthetic canons [89]. Consis- tently with this rule, prototypical music is generally found more appealing among non-expert audiences: for example, Smith and Melara [90] found that among novices, undergraduate students, and graduate students of music, all three groups were highly sensitive to atypicality but only graduate students preferred atypical compositions. Genres are not the only relevant classication system in this context, however. Beyond the categories and subcategories that comprise the genre taxonomy, music is widely cross-classied according to the emotions or moods it evokes [39–42]. “That music is an especially powerful stimulus for aecting moods is no revelation; it is attested to throughout history by poets, playwrights, composers, and, in the last two centuries, researchers” [38, p. 94]. The longstanding evidence of this connection that oered in cognitive psychology [91] is corroborated by many observational studies in marketing and consumer research [92–94]. Music moods can be considered 90 Prototypes, Goals, and Cross-Classification

goal-based categories because, like art in general, music is a hedonic good and its consumption is driven by emotional arousal [72]. A product’s capacity to evoke particular moods clearly depends on its features [95]; however, it does not normally depend on conformity to a particular genre. Mood categories like melancholy, energetic, and playful can encompass products from as diverse genre categories as rock, jazz, and Latin. AllMusic is a well-established source of category information within the 3 music business. It was originally launched in 1991 as a physical encyclope- dia, but its content soon acquired proportions unt for printing and the database became freely accessible online as early as 1995. At the end of 2007, both the website and the database were acquired by Rovi Corpora- tion. In 2013, ownership of the website was sold to a spin-o company, All Media Network, which is currently licensed by Rovi to use the data. The same license is granted to other companies, including Amazon, Microsoft, and Apple. The website currently receives more than eight million visits per month on average, making it the eighth music website worldwide by trac. Its editorial content is relayed by “virtually all digital music services,” [96] including iTunes, Napster, Pandora, Shazam, Slacker, and Spotify. At present, the website is maintained by some 900 critics, who assign category labels to products as part of their editorial routine. The reviewing process also involves rating the product on a scale from one-half to ve stars. As of 2013, registered users can also rate products on the website. Using this data for our empirical analysis entails some unavoidable limitations. One of these is that we analyze products released since 1995, the year the database was put online, but the genre taxonomy can change over time, as does the meaning of individual categories [97, 98]. These changes may have induced AllMusic editors to revisit the labels previously assigned to some products. To assess whether this is the case, we systematically cross-referenced our data with the 2001 paperback edition of The All Music Guide. For the purpose of this check, we randomly selected 200 records in our sample that were released before the guidebook’s publication: these include relatively popular products, like Eminem’s The Slim Shady LP (633 ratings), as well as products that were never rated by AllMusic users at all. In 93.4-percent of cases, the categories attributed to products on the website were found to be consistent with the print edition. Some categories were merged or split over time: for example, prog-rock/art rock is a single category in the guidebook, but prog-rock and art rock are separate categories on the website.4

4We account for this empirically by controlling for the categories’ similarity. Methodology 91

Another possible limitation is that AllMusic lists category labels only at the level of full-length albums, or shorter products like singles and EPs, but not individual tracks. This concerns our analysis because an can span categories either by having a tracklist of atypical songs or by having a tracklist of highly typical songs that belong to dierent categories. While it is impossible to account for this heterogeneity due to unavailability of data, we can at least control for the number of tracks included in each album and use an empirical strategy that accounts for artists’ idiosyncrasies. 3 Finally, our analysis can be aected by the fact that user ratings were only added to the website in 2013, one year before data collection. This is an advantage inasmuch as all the ratings we observe were cast by users who had access to the same information; however, it can be a drawback because users may have held prior beliefs about the value of products. We are unable to rule out this possibility by restricting our sample only to the most recent products because this would limit sample size and make it impossible to properly capture artist-level eects. Notwithstanding these limitations, AllMusic satises some important desiderata. First, it explicitly lists category labels in the form of tags, which is important because, as noted in Section 3.2.1, consumers may only engage in multiple-category induction if they are explicitly “reminded” of the dierent category labels [74]. A review describing the music in a narrative fashion may not serve this purpose equally well. Second, the category labels are listen in alphabetical order: therefore, the rst label is not necessarily the one where the product is more typical. This alleviates the concern that in the presence of a long list of labels consumers may default to the rst. Third, the AllMusic data is reliable in that the website employs human experts to assign mood category labels as opposed to automated methods [99]. For this reason, AllMusic moods are frequently used as a ground truth corpus in music information retrieval to train and benchmark various classier algorithms [e.g., 42]. Researchers in this nascent eld argue that category labels assigned to products by experts are more reliable and consistent than those assigned by consumers themselves [41].

3.3.2. Sample and Variables We focus our analysis on products released in the US during the years 1995– 2014. Although the AllMusic database includes hundreds of thousands of records, we restrict our preliminary sample to 135,536 unique releases that could be cross-referenced with other large databases, including Discogs and MusicBrainz (cf. Chapter2). This allows us to lter out inaccurate or duplicate entries as well as reissues, remastered versions, and limited, international, 92 Prototypes, Goals, and Cross-Classification

or deluxe editions. After dropping these observations, we are left with 80,917 original products. We further exclude 11,105 that are either anonymous or credited to more than one artist (or band), leaving us with 69,812 records. Most of these records were never rated by AllMusic users: only 4,708 or 6.7-percent have at least one rating. Because consumers’ evaluations are unavailable for non-rated products, we drop these observations from our sample. It is legitimate to wonder whether this leads to selection bias: we 3 perform t -tests to verify that the mean number of category labels in our sample does not signicantly change after non-rated products are dropped. The results of these tests show that the null hypothesis of equal variance cannot be rejected (p = 0.104 in the case of type-based category labels; p = 0.541 in the case of goal-based ones). We thus conclude that selection bias is not of substantive concern. As a last step, we drop 1,320 observations for which the category labels are missing. Our nal sample thus includes 3,388 products. These are authored by 1,665 distinct artists (or bands). Consistently with the industry’s reputation as a highly concentrated business [100], about half of the products in our sample were released by one of the major record companies, including Universal, Sony, Warner, PolyGram (until 1999), Bertelsmann (until 2004), and EMI (until 2012), or by any of their subsidiaries and imprints. All other products were released by independent rms or self-released by the artists.

Dependent variable. Our outcome of interest is the average evaluation of products by AllMusic users (MeanRating). Registration on the website is free and the ratings are expressed on a 10-point scale, with the average being automatically approximated to the nearest integer. The products in our sample have 577,922 ratings in total, with 83.5-percent having at least 10 ratings, 53.3-percent having at least 50, 36-percent having at least 100, and 9.3-percent having 500 or more. With a mean of 8.14, the distribution of ratings shows a tendency towards higher scores, which seems to be common in online evaluations of cultural products [cf. 101]; however, 88.9-percent of observations are within one std. dev. of the mean and no more than 5.6-percent have extreme values. Figure 3.2 plots the distribution of mean user rating against the natural logarithm of the number of ratings, with the local regression (LOESS) curve and its 95-percent condence interval. The scale of MeanRating is strictly discrete: the jitter along the horizontal axis in this gure is only added to enhance visualization.

Independent variables. Every product in the AllMusic database is asso- ciated with a variable number of genre, style, and mood category labels. Methodology 93

3

Figure 3.2: Distribution of the mean rating by number of ratings

Genres and styles represent two levels of a nested classication system (cf. Chapter2): in this study, we use styles rather than genres as the type- based classication system because these capture family resemblance at a much ner level of detail [37]; however, we control for systematic dier- ences in the evaluation of products that belong to dierent genres through binary variables. The count of products in each genre is visualized in Figure 3.3. A total of 21 genres, 509 styles, and 278 moods are represented in our sample. As a word of caution, we do not assume AllMusic users (or editors) to know all of these categories: our arguments only require each user to be familiar with some of those associated with the products they rate. Moreover, we do not assume the users to agree with the categorization chosen by the editors; only that they are able to interpret some of these category judgments. Both assumptions seem reasonable in our setting because the users ought to register on the website to rate the products, and this basic act of engagement suggests a minimal level of familiarity with the way AllMusic works. Table 3.1 reports the top 20 style and mood labels by frequency in our sample. 94 Prototypes, Goals, and Cross-Classification

3

Figure 3.3: Frequencies of products in each genre

Some pairs between these labels, like the styles alternative/indie rock and indie rock, or the moods rousing and energetic, are likely to have similar meanings. It is necessary to account for this overlap because a product with multiple labels is not much of a spanner if the categories map to the same positions in the feature space [19]. Consistently with previous research [e.g., 9, 29], we control for category similarity by calculating the pairwise Jaccard coecients between category labels. Table 3.2 reports some of the category pairs in each system, along with their similarity scores. In the case of styles, the coecients range between zero and one, whereas in the case of moods they only range between zero and 0.4, which implies lesser overlap on average. This makes sense given that there are nearly twice as many styles than moods in our sample. We use the coecients to compute two control variables (StyleSimilarity and MoodSimilarity). If a product spans exactly two categories, these variables are equal to the coecient of the two labels; if it spans more than two categories, they are equal to the mean coecient of all label pairs; if it does not span categories, the variables are set to one. Methodology 95

Table 3.1: Top style and mood category labels by frequency

Styles Moods Category label Freq. Category label Freq. Alternative/Indie Rock 1939 Rousing 1176 Alternative Pop/Rock 925 Energetic 1135 Indie Rock 579 Reective 1101 Heavy Metal 422 Stylish 941 3 Adult Alternative Pop/Rock 303 Earnest 915 Contemporary Pop/Rock 297 Playful 899 Club/Dance 279 Condent 893 Punk Revival 274 Aggressive 889 Alternative Metal 218 Passionate 842 Hard Rock 217 Dramatic 835 Punk-pop 177 Intense 813 Post-grunge 170 Brash 756 Dance-pop 169 Bittersweet 716 Indie Pop 160 Theatrical 709 Punk/New Wave 144 Freewheeling 682 Album Rock 139 Cathartic 675 Electronica 139 Intimate 666 Pop 133 Brooding 653 Indie Electronic 124 Literate 652 Contemporary R&B 118 Exuberant 649

In addition to similarity, it is necessary to account for the fact that some categories, like pop or intense, have relatively fuzzy meanings, whereas others, like third wave ska revival or celebratory, are relatively specic. Categories that have more more clearly dened meanings stand out more sharply from other categories in the same classication system—in eco- logical terms, they have greater contrast [11]. As explained above, contrast can aect the information consumers derive from products’ category mem- berships because higher-contrast categories encompass fewer positions in the feature space. We compute two separate variables (StyleContrast and MoodContrast) to control for the average contrast of the categories spanned by the products in each system. Each variable is calculated on the basis of products’ grades of membership, as is common in the ecological literature [28, 77]: to this purpose, we rst compute the mean grade of membership 96 Prototypes, Goals, and Cross-Classification

Table 3.2: Style and mood pairs by Jaccard similarity

Style A Style B Coef. Early American Blues Work Songs 1.00 Indian Classical Raga 0.92 Bulgarian Bulgarian Folk 0.80 Country Blues Acoustic Blues 0.71 3 Southern Gospel Traditional Gospel 0.60 Sound Art Sound Sculpture 0.50 Contemporary Reggae Reggae-pop 0.40 Uptown Soul Chicago Soul 0.30 East Coast Blues Vaudeville Blues 0.20 Mexican Traditions Madagascan 0.10 Mood A Mood B Coef. Demonic Macabre 0.40 Reective Intimate 0.36 Confrontational Aggressive 0.32 Laid Back/Mellow Soothing 0.28 Celebratory Exuberant 0.24 Brooding Bittersweet 0.20 Whimsical Playful 0.16 Harsh Paranoid 0.12 Theatrical Somber 0.08 Druggy Irreverent 0.04

of each product as the reciprocal of the number of categories to which the product belongs. Then, we calculate the contrast of each category by taking the mean grade of membership of category members (cf. Chapter3). Finally, for each product, we calculate the mean contrast of the categories spanned in each classication system. The resulting measure increases if the categories to which the product belongs tend to be more exclusive. We use the category labels assigned to each product to create two main predictors (NoStyles and NoMoods). These are count variables represent- ing the total number of categories spanned in each classication system. Though their distribution is skewed, we avoid transforming these variables (e.g., by computing the inverse, the square root, or the natural logarithm) because our hypotheses directly concern the number of category labels and keeping this value untransformed allows for a more adequate test. Methodology 97

3

Figure 3.4: Distribution of the number of moods by number of styles

Figure 3.4 presents the two variables’ distribution. In this gure, the size of each circle increases with the number of users who rated the products with a particular combination of values, and a darker hue indicates a greater concentration of products at particular values. Seven observations, marked by initials in the gure, stand out for their very high number of moods. Closer inspection of these data points reveals the reason for their anomaly: these are long, celebratory compilations by very well-established and often very eclectic artists. Namely, EJ stands for Elton John’s The Greatest Hits, 1970–2002 (34 tracks, 157 minutes); SW stands for Stevie Wonder’s At the Close of a Century (70 tracks, 312 minutes); DB represents David Bowie’s Bowie at the Beeb: The Best of the BBC Radio Sessions, 1968–1972 (53 tracks, 218 minutes); TB represents One by The Beatles (27 tracks, 79 minutes); BB stands for Sounds of Summer: The Very Best of (30 tracks, 76 minutes); JH1 represents The Jimi Hendrix Experience’s self-titled compilation (56 tracks, 259 minutes); and JH2 represents their Winterland box set (36 tracks, 278 minutes). Although these observations are far from the median of NoMoods, they do not have a high 98 Prototypes, Goals, and Cross-Classification

leverage on the regression lines and their exclusion does not substantively aect our estimates. We compute a binary variable (Compilation) to control for whether the focal product is a box set, an anthology, a collection of greatest hits. More- over, we control for dierences in the number of tracks by regressing on the natural logarithm of the size of the tracklist (NoTracks). We also compute a binary control variable to capture possible identity or authenticity eects 3 in the evaluation of products released by independent vs. major record companies (IndieRelease). We control for the products’ release year through a set of 19 dummy variables.5 Consistently with previous research [e.g., 17], we also control for dierences in the size of the audience by regressing on the natural logarithm of the number of ratings (NoRatings). Finally, we control for dierences in the products’ editorial evaluation (EditorRating): on the one hand, this is necessary because consumers’ evaluations can be inuenced by the editors’ judgment; on the other, it is useful to capture quality-related heterogeneity among products and decouple the perceptual from the “actual” eects of category spanning [13].

3.3.3. Estimation Procedure It is relevant to note that a considerable share of the variance we wish to explain in MeanRating can be due to unobserved artist-level characteristics, such as status [26] or tenure in the industry [20]. We must account for this heterogeneity because higher-status or longer-established artists may aord to span categories more than the average, and conversely, lower- status or younger artists may be subject to greater constraints. Hierarchical or multilevel models [102] are explicitly designed to handle this source of bias: using a restricted maximum likelihood (REML) estimator, these allow the isolation of variance components at dierent levels of analysis. In our study, we use a two-level, mixed model specication with xed eects at the level of products and random eects at the level of artists. This model accounts for the correlations among regression residuals for products “nested” within the same artist, thereby ruling out the eects of unmeasured artist-level determinants. To capture the hypothesized curvilinear eect of NoStyles on MeanRating, we use a quadratic polynomial for the independent variable [103]. To capture the interaction between NoStyles and NoMoods, both terms of the polynomial are multiplied with the moderating variable [79, 104]. The full model, which includes the quadratic and interaction terms, takes the following form:

5Including ReleaseYear as a continuous variable does not change our model results. Results 99

MeanRating = α + β1ReleaseYear2 ··· β19ReleaseYear20 + β20Genre1 ··· β40Genre21

+ β41Compilation + β42NoRatings + β43NoTracks + β44EditorRating

+ β45IndieRelease + β46StyleSimilarity + β47StyleContrast

+ β48MoodSimilarity + β49MoodContrast + β50NoStyles 2 + β51NoStyles + β52NoMoods + β53NoStyles × NoMoods 2 + β54NoStyles × NoMoods + ν + ε 3 (3.2) where ν is an artist-level error drawn from a distribution N (0, σν) and ε is a product-level error drawn from a distribution N (0, σε). The intra-class correlation (ICC) coecient suggests that this model is appropriate: about 32.5-percent of the variance in MeanRating is due to artist-level dierences and is thus captured by ν. Consistently with this, likelihood-ratio tests show that our mixed models t the data better than generalized linear models without the artist-level component (p < 0.001).

3.4. Results Table 3.3 reports the descriptive statistics for the main variables involved in our analysis. For brevity, we omit the 21 binary variables that capture genre memberships. Although our dataset includes a much greater number of styles than moods, the mean number of category labels for products in our sample is much larger in the goal-based system (p < 0.001). The contrast and the similarity of styles spanned by each products also tends to be higher than that of spanned moods (p < 0.001). Table 3.4 reports the pairwise correlations between these variables. Our two predictors of theoretical interest are positively but weakly corre- lated with one another (r = 0.15, p < 0.001), which suggests that, although atypical products tend to be suitable for a greater number of moods, the observations in our sample can span styles and moods independently. To assess the risk of multicollinearity, we perform conditioning diagnostics on the model matrix. The condition number of the matrix is 67.03, which is well above the recommended threshold of 30 [105], but standardizing the variables helps addressing this problem and reduces the number to a much more acceptable 3.45. Thus, we are reassured that the interdepen- dencies between our regressors do not aect our estimates. In the tables below, we report the estimated coecients for standardized variables: the intrepretation of these parameters is relatively straightforward because the dependent variable has a std. dev. of approximately one; therefore, any 100 Prototypes, Goals, and Cross-Classification

Table 3.3: Descriptive statistics

Mean Std. dev. Min. Max. MeanRating 8.14 1.01 1 10 ReleaseYear 2004.24 5.43 1995 2014 Compilation 0.10 0.30 0 1 NoRatings 4.00 1.61 0 8.42 3 NoTracks 2.60 0.38 0 5.59 EditorRating 6.32 1.36 2 9 IndieRelease 0.47 0.50 0 1 StyleSimilarity 0.15 0.16 0 1 StyleContrast 0.28 0.04 0.18 1 MoodSimilarity 0.12 0.09 0 1 MoodContrast 0.08 0.03 0.06 0.35 NoStyles 3.85 1.56 1 11 NoMoods 15.14 7.45 1 71

coecient can be interpreted as the eect of a one-std. dev. increase in the predictor on the original value of the dependent. The results our mixed models are reported in Tables 3.5 and 3.6. We begin our analysis in Model 1 by regressing MeanRating on all the non- category-related variables as well as the 19 dummies for ReleaseYear. The model results suggest that compilations tend to be evaluated better by users than ordinary albums, as the estimated rating is approximately 0.6- point higher (p < 0.001). This eect is unsurprising if we consider that compilations usually include an artist’s most successful tracks. Products rated by a greater number of users also tend to be rated better on average (βˆ = 0.148, p < 0.001), as do products released by independent record companies (βˆ = 0.076, p = 0.020). A higher evaluation by AllMusic editors is also associated with a higher mean rating (βˆ = 0.418, p < 0.001). In subsequent models, we add the category-related control variables to our list of regressors. This includes the 21 binary variables for genre categories (Model 2) as well as the four contrast and similarity variables (Model 3). The results of systematic likelihood-ratio tests suggest that the contrast and similarity variables do not signicantly aect MeanRating (Model 3 vs. Model 2: p = 0.507); the genre controls, however, decisively contribut to model t (Model 2 vs. Model 1: p < 0.001). All the estimates reported above remain consistent in size and direction throughout the fol- Results 101

Table 3.4: Pairwise correlations matrix

1 2 3 4 1 MeanRating 2 ReleaseYear −0.15••• 3 Compilation 0.23••• −0.10••• 4 NoRatings 0.14••• 0.08••• −0.23••• 5 NoTracks 0.13••• −0.05•• 0.41••• −0.08••• 3 6 EditorRating 0.51••• −0.04•• 0.16••• 0.20••• 7 IndieRelease 0.03 0.16••• −0.11••• −0.19••• 8 StyleSimilarity −0.01 −0.05•• −0.05•• −0.11••• 9 StyleContrast −0.06•• 0.04• −0.17••• −0.08••• 10 MoodSimilarity 0.00 −0.15••• −0.03 −0.10••• 11 MoodContrast 0.00 −0.15••• −0.05•• −0.16••• 12 NoStyles 0.07••• −0.22••• 0.22••• 0.06••• 13 NoMoods 0.01 0.22••• 0.21••• 0.36••• 5 6 7 8 6 EditorRating 0.12••• 7 IndieRelease −0.12••• 0.06••• 8 StyleSimilarity 0.04• −0.04• −0.05•• 9 StyleContrast 0.03 −0.01 −0.14••• 0.27••• 10 MoodSimilarity −0.01 −0.04• −0.03 0.04• 11 MoodContrast −0.02 −0.05•• −0.05•• 0.07••• 12 NoStyles 0.11••• 0.09••• −0.02 −0.40••• 13 NoMoods 0.13••• 0.15••• −0.04• −0.09••• 9 10 11 12 10 MoodSimilarity 0.01 11 MoodContrast 0.05•• 0.33••• 12 NoStyles −0.42••• −0.02 −0.04• 13 NoMoods −0.12••• −0.28••• −0.28••• 0.15•••

Note: • p < 0.05; •• p < 0.01; ••• p < 0.001 lowing models, and the eects of NoRatings and Compilation further increase in their level of statistical signicance. In Model 4, we add our predictors of theoretical interest, NoStyles and NoMoods. At this stage, we include only the rst-order variable of NoStyles. The goodness of t of our model signicantly increases (Model 4 vs. Model 3: 102 Prototypes, Goals, and Cross-Classification

Table 3.5: Mixed model results: Controls

Dependent variable: MeanRating Model 1 Model 2 Model 3 Intercept 0.17 • (0.07) 0.31 ••• (0.08) 0.31 ••• (0.08) 3 Compilation 0.59 ••• (0.06) 0.62 ••• (0.06) 0.63 ••• (0.06) NoRatings 0.15 ••• (0.02) 0.18 ••• (0.02) 0.18 ••• (0.02) NoTracks 0.02 (0.02) 0.02 (0.02) 0.02 (0.02) EditorRating 0.42 ••• (0.01) 0.41 ••• (0.01) 0.41 ••• (0.01) IndieRelease 0.08 • (0.03) 0.09 •• (0.03) 0.09 •• (0.03) StyleSimilarity −0.01 (0.01) StyleContrast −0.00 (0.02) MoodSimilarity −0.00 (0.01) MoodContrast 0.03 (0.01)

Intercept σν 0.51 0.48 0.48 Residual σν 0.67 0.68 0.68 Genre variables Not included Included Included Release year dummies Included Included Included No. products 3388 3388 3388 No. artists 1665 1665 1665 Log likelihood −4096.39 −4075.73 −4086.82

Note: • p < 0.05; •• p < 0.01; ••• p < 0.001; std. errors in parentheses

p < 0.001), and the estimates suggest a negative and highly signicant eect for NoMoods (βˆ = −0.116, p < 0.001) as well as a small but highly signicant negative eect for NoStyles (βˆ = −0.064, p < 0.001). We suspect that the small size of this eect is because a straight regression line does not adequately capture the relationship between a product’s number of styles and its average evaluation by AllMusic users. We continue our analysis in Model 5 by adding the quadratic transfor- mation of NoStyles to our list of regressors. The estimates from this model specication conrm the suspicion above, as the inclusion of NoStyles2 results in highly signicant coecients for both terms of the polynomial (βˆ = −0.111, p < 0.001, and βˆ = 0.040, p < 0.001, respectively) and a better overall t for our model (Model 5 vs. Model 4: p < 0.001). We formally test the signicance of the U-shaped relationship between NoStyles and Results 103

Table 3.6: Mixed model results: Main eects and interaction

Dependent variable: MeanRating Model 4 Model 5 Model 6 Intercept 0.23 •• (0.08) 0.20 • (0.08) 0.21 • (0.08) Compilation 0.75 ••• (0.06) 0.75 ••• (0.06) 0.74 ••• (0.06) 3 NoRatings 0.23 ••• (0.02) 0.24 ••• (0.02) 0.24 ••• (0.02) NoTracks 0.02 (0.02) 0.02 (0.02) 0.02 (0.02) EditorRating 0.41 ••• (0.01) 0.41 ••• (0.01) 0.41 ••• (0.01) IndieRelease 0.10 ••• (0.03) 0.10 ••• (0.03) 0.10 ••• (0.03) StyleSimilarity 0.02 (0.02) −0.05 •• (0.02) −0.05 ••• (0.02) StyleContrast −0.03 (0.02) −0.04 (0.02) −0.05 •• (0.02) MoodSimilarity −0.02 (0.01) −0.02 (0.01) −0.02 •• (0.01) MoodContrast 0.01 (0.01) 0.01 (0.01) 0.01 (0.01) NoStyles −0.06 ••• (0.02) −0.11 ••• (0.02) −0.11 ••• (0.22) NoStyles2 0.04 ••• (0.01) 0.03 •• (0.01) NoMoods −0.12 ••• (0.02) −0.12 ••• (0.02) −0.14 ••• (0.02) NoStyles 0.06 ••• (0.02) × NoMoods NoStyles2 −0.01 (0.01) × NoMoods

Intercept σν 0.46 0.46 0.46 Residual σν 0.68 0.68 0.68 Genre variables Included Included Included Release year dummies Included Included Included No. products 3388 3388 3388 No. artists 1665 1665 1665 Log likelihood −4062.90 −4050.44 −4039.83

Note: • p < 0.05; •• p < 0.01; ••• p < 0.001; std. errors in parentheses

MeanRating using the three-step procedure recommended by Lind and Mehlum [103]. According to this method, three conditions must hold for the curvilinear eect to be safely interpretable: rst, the coecients of NoStyles and NoStyles2 must be signicant and in the espected direction; second, the slope of the curve must be signicantly dierent from zero at both ends of the observed range of NoStyles; third, and related to the above, 104 Prototypes, Goals, and Cross-Classification

the turning point of the curve must be located within the observed range of NoStyles. The second and third conditions are important because, if they do not hold, there is not enough evidence to accept the hypothesis of a U-shaped relationship. The true relationship may actually be only half of a U-shape, which could be more parsimoniously tted through a logarithmic function of the independent variable. In our case, the relationship fullls the three aforementioned criteria. 3 The coecients are in the expected direction: negative for the rst-order term of the polynomial and positive for the second-order term. Two t -tests performed according to Lind and Mehlum’s procedure [103] suggest that the slope of the curve is signicantly smaller than zero when the variable is at its lowest (mbL = −0.257, p < 0.001) but signicantly greater than zero when the variable is at its highest (mbH = 0.255, p = 0.001). The point where the slope switches from negative to positive, i.e., the turning point of the U-shape, is located at 1.38 std. dev. above the mean of NoStyles, which corresponds to approximately six categories. Hence, the eect reversal occurs well within the variable’s range. This implies that MeanRating does not monotonically decrease when the value of NoStyles increases: it decreases at rst, but after a large enough number of styles the average evaluation begins to increase again. To verify that this non-monotonic eect is unique to type-based categories, as implied by our theory, we estimate additional models where a similar quadratic relationship is specied for NoMoods. In this case, we nd no evidence of a U-shape (p = 0.159). We report on these models in greater detail at the end of this section. We continue our analysis in Model 6 by estimating the interaction eect. Consistently with Aiken and West [104], we account for this eect by includ- ing the multiplicative terms NoStyles × NoMoods and NoStyles2 × NoMoods to the list of regressors. The model ts the data signicantly better than the one without interactions (Model 5 vs. Model 4: p < 0.001). The results of Lind and Mehlum’s test of the U-shaped relationship [103] continue to hold (p = 0.023). Model estimates indicate a highly signicant eect for the interaction with the rst-order variable (βˆ = 0.062, p < 0.001), but not for the one with the second-order variable (βˆ = −0.007, p = 0.286), which suggests that an increase in the value of NoMoods does not signicantly aect the curvature of the U-shaped relationship between NoStyles and MeanRating. In a linear model, in fact, the curvature of such a relationship is only aected by the moderating variable if the coecient of its interaction with the quadratic term is signicantly dierent from zero [79]. Yet it is not necessary for the curvature to change in order for a signicant moderation to occur: even if the slope stays the same, a shift in the curve’s turning Results 105

3

Figure 3.5: Eect of spanning in both classication systems point remains possible. We formally test this by estimating the derivative of the U-shape’s turning point with respect to the moderating variable. This partial derivative is computed as follows [79, p. 1187]: δX ∗ β β − β β = 50 53 51 52 , 2 (3.3) δZ 2 (β51 + β53Z ) where X ∗ represents the value of NoStyles where the U-shape turns upward, β50–53 are the coecients of NoStyles, NoMoods, NoStyles × NoMoods, and NoStyles2 × NoMoods, respectively (cf. Equation 3.2), and Z is an arbitrary but meaningful value within the observed range of NoMoods. We perform this test repeatedly by setting Z at dierent values located within one std. dev. of the mean of NoMoods. Throughout our tests, we obtain negative and signicant estimates (p < 0.05), which constitutes formal evidence that the turning point shifts to the left when the value of NoMoods increases. The interaction between these two variables is graphically presented in Figure 3.5. In this plot, the gradient represents the changing eect of NoStyles as the value of NoMoods increases from 10 to 25, 106 Prototypes, Goals, and Cross-Classification

which covers more than 70-percent of observations in our sample. Notice that the change in the curvature is not statistically signicant: however, the leftward shift of the turning point is.

Robustness tests. Post-estimation diagnostics (available upon request) suggest that our estimates are safely interpretable. The exclusion of outliers does not substantively aect the size or the signicance of our results. 3 As mentioned before, we perform additional tests to rule out the possi- bility that a U-shaped relationship exists between NoMoods and MeanRating, which would be incompatible with our theoretical argument that type- and goal-based categories have dierent internal structures. For the purpose of this validation, we estimate two additional models: one analogous to Model 4, where we also include NoMoods2, and one analogous to Model 5, where we also include NoMoods2 and its interactions with NoStyles and NoStyles2. In the rst model, i.e., the one without interactions, the U-shaped eect of NoStyles is signicant (mbL = −0.243, p < 0.001, and mbH = 0.226, p = 0.004) and the putative U-shaped eect of NoMoods seems to be signicant as well (mbL = −0.276, p < 0.001, and mbH = 0.283, p = 0.004). In the second model, i.e., the one with interactions, the U-shaped eect of NoStyles is still strongly signicant (mbL = −0.257, p < 0.001, and mbH = 0.215, p = 0.009) but the U-shaped eect of NoMoods is only marginally so (mbL = −0.257, p < 0.001, and mbH = 0.189, p = 0.069). Not only is this eect marginally signicant, but it appears to be entirely driven by outliers: re-estimating the model after dropping as little as three observations whose value of NoMoods is more than ve std. dev. away from the variable’s mean (denoted JH1, JH2, and BB in Figure 3.4) causes the curvilinear eect of NoMoods to be no longer signicant at all (mbL = −0.242, p < 0.001, and mbH = 0.147, p = 0.159). We thus conclude that there is no evidence of a U-shaped relationship between NoMoods and MeanRating. This is fully consistent with our expectations.

3.5. Discussion In this chapter, we proposed a conceptual model whereby consumers rely on dierent classication systems to localize products in a cognitive space. We explained the distinction between type- and goal-based categories, and illustrated how the fundamental dierence in their internal structure aects the information consumers are allowed to derive from products’ category labels. Building on these considerations, we predicted that the eect of spanning type-based categories on product evaluations follows a U-shape Discussion 107

(Hypothesis 3.1), whereas the eect of spanning goal-based categories is linearly negative (Hypothesis 3.2). Further, we predicted that the U-shaped eect of spanning type-based categories is moderated by the eect of spanning goal-based categories in such a way that the turning point of the U-shape moves leftward if a product spans in both systems (Hypothesis 3.3). Our empirical analysis of product ratings on an online music platform yielded sucient evidence to accept all hypotheses. With regard to Hypothesis 3.1, we found that having more type-based 3 category labels has a negative eect on consumers’ evaluations, but con- sistently with our prediction, this eect reverses if the product’s number of type-based labels is suciently large. We estimated this reversal to occur at approximately six labels on average. If we assumed that (a) AllMusic users were familiar with every prototype associated with the products they rate, and (b) that they did not devalue products that require cognitive eort, this particular estimate would suggest that consumers are unable to distinguish more than 12 features or dimensions in the feature space of popular music. This is because the average evaluation of a product with 13 (D + 1) type-based category labels is predicted to be equal to that of a product with only one type-based label, all else being equal. These assumptions are hardly realistic, however: the users may not know all the prototypes associated with a product according to AllMusic editors, and even if they did, they would still penalize records whose labels are dicult to understand [cf. 106]. In light of this, 12 is a rather generous estimate of the space’s dimensionality: the actual number of features that consumers can distinguish is probably lower. This is consistent with psychological research, according to which people tend to reduce even relatively complex domains to spaces of less than a dozen dimensions [78]. It is worth noting that the number of dimensions of the feature space has no direct relationship with the number of category labels the audience is familiar with. A space of as little as one dimension can be partitioned into an arbitrarily large number of categories if audience members are willing to make very ne-grained distinctions—in other words, the dimensionality of the feature space and that of category labels is not directly related [57]. The fact that AllMusic includes more than 500 styles does not necessarily mean that the feature space of popular music is particularly complex [cf. 107, 108], nor that consumers are able to make very ne-grained distinctions. Our ndings are compatible with previous research on the consequences of category spanning inasmuch as products partly conforming to dierent prototypes tend to receive lower evaluations by consumers, at least up to a threshold. This agrees with the categorical-imperative perspective in 108 Prototypes, Goals, and Cross-Classification

organization theory [50]. Our ndings partly diverge from this perspective, however, as we nd evidence that very atypical products receive better evaluations than moderately atypical ones. The explanation we proposed for this phenomenon hinges on the idea that type-based categories map to convex regions of the feature space [76]: because people are already familiar with these categories’ geometric centers (the prototypes), they can make relatively accurate inferences about products either when these are 3 highly typical or when they are so atypical that combining the concepts [51–53] yields a unique solution. As a result of this geometric process, consumers’ evaluations of products can be relatively high at both ends of the typicality distribution. Conforming to a specic prototype may still be the optimal strategy, however, because consumers are averse to mental eort and they are likely to penalize products if a great deal of cognitive eort is needed to pinpoint their location in the feature space. With regard to Hypothesis 3.2, we found that the eect of spanning goal-based categories is linearly negative. This implies that products are better o focusing on fewer goals, all else being equal. Our tests returned no evidence of a U-shaped relationship in this case, conrming that the information consumers can derive from type- and goal-based categories is radically dierent. This is consistent with Barsalou’s [66] research in cogni- tive psychology, according to which goal-based categories do not coalesce around some central instantiation of a concept [cf. 67] or reference point [14]. As a result, the region of feature space to which goal-based categories map is not necessarily convex. Because of non-convexity, audience mem- bers nd it harder to predict which feature proles are compatible with certain labels and cannot normally determine the coordinates of products that span goal-based categories by way of distance-based reasoning [cf. 29]. As there is ultimately no eect on uncertainty, the consequences of span- ning categories in a goal-based system entirely depend on producer-side constraints and the relationship with consumers’ evaluation is monotoni- cally negative. This agrees both with the organizational literature, which suggests that goal-consistency is desirable for evaluators [24, 31], and with marketing research, according to which products perceived to be suitable for multiple goals are perceived to be less suitable for each [68]. Finally, with regard to Hypothesis 3.3, we found that the eect of span- ning type-based categories shifts from negative to positive after fewer category memberships if the product also spans categories in a goal-based system. This supports our conjecture that evaluators tend to merge in- formation from dierent classication systems in order to make sense of products. It is easy to imagine what kind of considerations or thought Discussion 109 processes underpin this interaction eect: for example, consumers may think that addressing several needs justies deviance from prototypes. If a phone has a display wider than six inches, for instance, thereby deviating from the prototype of its category, then it better be useful for more than just calling and sending text messages. Likewise, if consumers know that a phone is good for more than calling and sending text messages, e.g., because it is advertised as suitable for watching videos or playing games, then they are likely to infer that its screen is wider than six inches and they 3 will be less inclined to devalue the product. The arguments presented in this chapter complement the various ex- planations for the positive consequences of atypicality already oered by organization scholars, yet they allow for a more parsimonious explanation because they do not rely on higher-level theoretical constructs like pro- ducer attributes [26, 27] or category properties [28]. Moreover, they do not require audience members to have particular goals in mind when evaluat- ing products [23, 24]: in this sense, our arguments are generalizable to an audience with heterogeneous needs and desires. Nevertheless, evaluators are required to possess some minimal level of experience with the domain at hand (i.e., they should know the positions of at least some prototypes) for our geometric arguments to hold, hence our ndings may not extend to settings where consumers are insuciently familiar with the objects on oer. In addition, the question remains of whether our arguments apply to the valuation of organizations as well as their products. While the con- sumers face a similar uncertainty problem, these two cases are dierent because products do not move around in the feature space: rms do, albeit with some diculties [109–111], and this may make locating their (current) position in the feature space a rather dierent exercise. The assumptions we made about consumer behavior limit the gener- alizability of our results. Though there is no shortage of contexts where consumers tend to anchor their decision on third-party information, in many of these contexts the category labels may be presented in a way that makes multiple-category induction less likely. The objective of our re- search, however, was not to argue that the dominant wisdom about category spanning in organization theory is faulty, but rather to oer proof of the dif- ferent information that type- and goal-based category labels convey to the audience. An implication of this structural dierence is that spanning type- based categories has a U-shaped eect on evaluations under the conditions we assumed (cf. Section 3.2.1); another is that this eect is moderated by the number of goal-based categories. These conditions do not necessarily hold in other contexts, so that a linear relationship between atypicality and eval- 110 Prototypes, Goals, and Cross-Classification

uations can still be perfectly adequate. Nonetheless, goal-based categories may still be relevant in these contexts and a prototype-centered perspec- tive on categorization is still likely to engender an incomplete account of the role of categories in the ordering of markets.

3.6. References 3 [1] E. H. Rosch, C. B. Mervis, W. D. Gray, D. M. Johnson, and P. Boyes-Braem, Basic objects in natural categories, Cognitive Psychology 8, 382 (1976). [2] M. Ruef and K. Patterson, Credit and classication: The impact of industry boundaries in nineteenth-century America, Administrative Science Quarterly 54, 486 (2009). [3] Ö. Koçak, M. T. Hannan, and G. Hsu, Emergence of market orders: Audience interaction and vanguard inuence, Organization Studies 35, 765 (2014). [4] E. G. Pontikes, Two sides of the same coin: How ambiguous classica- tion aects multiple audiences’ evaluations, Administrative Science Quarterly 57, 81 (2012). [5] G. Negro, M. T. Hannan, and M. Fassiotto, Category signaling and reputation, Organization Science 26, 584 (2015). [6] B. Kuijken, G. Gemser, and N. M. Wijnberg, Categorization and will- ingness to pay for new products: The role of category cues as value anchors, Journal of Product Innovation Management, 34, 757 (2017). [7] P. Nelson, Information and consumer behavior, Journal of Political Economy 78, 311 (1970). [8] G. Hsu, M. T. Hannan, and L. Pólos, Typecasting, legitimation, and form emergence: A formal theory, Sociological Theory 29, 97 (2011). [9] A. Goldberg, M. T. Hannan, and B. Kovács, What does it mean to span cultural boundaries? Variety and atypicality in cultural consumption, American Sociological Review 81, 215 (2016). [10] G. Hsu, M. T. Hannan, and Ö. Koçak, Multiple category memberships in markets: An integrative theory and two empirical tests, American Sociological Review 74, 150 (2009). [11] M. T. Hannan, Partiality of memberships in categories and audiences, Annual Review of Sociology 36, 159 (2010). [12] M. D. Leung and A. J. Sharkey, Out of sight, out of mind? Evidence of perceptual factors in the multiple-category discount, Organization Science 25, 171 (2013). [13] G. Negro and M. D. Leung, “Actual” and perceptual eects of category spanning, Organization Science 24, 684 (2013). References 111

[14] E. H. Rosch, Cognitive reference points, Cognitive Psychology 7, 532 (1975). [15] P. Winkielman, J. Halberstadt, T. Fazendeiro, and S. Catty, Prototypes are attractive because they are easy on the mind, Psychological Science 17, 799 (2006). [16] G. A. Akerlof, The market for “lemons:” Quality uncertainty and the market mechanism, Quarterly Journal of Economics 84, 488 (1970). [17] G. Hsu, Jacks of all trades and masters of none: Audiences’ reactions 3 to spanning genres in feature lm production, Administrative Science Quarterly 51, 420 (2006). [18] J.-P. Ferguson and G. Carnabuci, Risky recombinations: Institutional gatekeeping in the innovation process, Organization Science 28, 133 (2017). [19] M. D. Leung, Dilettante or Renaissance person? How the order of job experiences aects hiring in an external labor market, American Soci- ological Review 79, 136 (2014). [20] E. W. Zuckerman, T.-Y. Kim, K. Ukanwa, and J. von Rittmann, Robust identities or nonentities? Typecasting in the feature-lm labor market, American Journal of Sociology 108, 1018 (2003). [21] E. W. Zuckerman, Focusing the corporate product: Securities analysts and de-diversication, Administrative Science Quarterly 45, 591 (2000). [22] G. Hsu, G. Negro, and F. Perretti, Hybrids in Hollywood: A study of the production and performance of genre spanning lms, Industrial and Corporate Change 21, 1427 (2012). [23] T. Wry, M. Lounsbury, and P. D. Jennings, Hybrid vigor: Securing ven- ture capital by spanning categories in nanotechnology, Academy of Management Journal 57, 1309 (2014). [24] L. Paolella and R. Durand, Category spanning, evaluation, and perfor- mance: Revised theory and test on the corporate law market, Academy of Management Journal 59, 330 (2016). [25] J. Merluzzi and D. J. Phillips, The specialist discount: Negative returns for MBAs with focused proles in investment banking, Administrative Science Quarterly 61, 87 (2016). [26] B. Kovács and R. Johnson, Contrasting alternative explanations for the consequences of category spanning: A study of restaurant reviews and menus in San Francisco, Strategic Organization 12, 7 (2014). [27] M. Keuschnigg and T. Wimmer, Is category spanning truly disadvan- tageous? New evidence from primary and secondary movie markets, Social Forces 96, 449 (2017). [28] B. Kovács and M. T. Hannan, The consequences of category spanning 112 Prototypes, Goals, and Cross-Classification

depend on contrast, in Categories in Markets: Origins and Evolution, Research in the Sociology of Organizations, Vol. 31, edited by G. Hsu, G. Negro, and Ö. Koçak (Emerald Group, Bingley, United Kingdom, 2010) pp. 175–201. [29] B. Kovács and M. T. Hannan, Conceptual spaces and the consequences of category spanning, Sociological Science 2, 252 (2015). [30] B. H. Ross and G. L. Murphy, Food for thought: Cross-classication 3 and category organization in a complex real-world domain, Cognitive Psychology 38, 495 (1999). [31] R. Durand and L. Paolella, Category stretching: Reorienting research on categories in strategy, entrepreneurship, and organization theory, Journal of Management Studies 50, 1100 (2013). [32] N. Granqvist and T. Ritvala, Beyond prototypes: Drivers of market cate- gorization in functional foods and nanotechnology, Journal of Man- agement Studies 53, 210 (2016). [33] S. Ratneshwar, L. W. Barsalou, C. Pechmann, and M. Moore, Goal- derived categories: The role of personal and situational goals in cate- gory representations, Journal of Consumer Psychology 10, 147 (2001). [34] S. Ratneshwar and A. D. Shocker, Substitution in use and the role of usage context in product category structures, Journal of Marketing Research 28, 281 (1991). [35] T. Wry and M. Lounsbury, Contextualizing the categorical imperative: Category linkages, technology focus, and resource acquisition in nan- otechnology entrepreneurship, Journal of Business Venturing 28, 117 (2013). [36] D. N. Osherson and E. E. Smith, On the adequacy of prototype theory as a theory of concepts, Cognition 9, 35 (1981). [37] J. C. Lena and R. A. Peterson, Classication as culture: Types and trajec- tories of music genres, American Sociological Review 73, 697 (2008). [38] G. C. Bruner, Music, mood, and marketing, Journal of Marketing 54, 94 (1990). [39] M. A. Casey, R. Veltkamp, M. Goto, M. Leman, C. Rhodes, and M. Slaney, Content-based music information retrieval: Current directions and future challenges, Proceedings of the IEEE 96, 668 (2008). [40] P. Saari, T. Eerola, and O. Lartillot, Generalizability and simplicity as criteria in feature selection: Application to mood classication in music, IEEE Transactions on Audio, Speech, and Language Processing 19, 1802 (2011). [41] P. Saari, M. Barthet, G. Fazekas, T. Eerola, and M. Sandler, Semantic models of musical mood: Comparison between crowd-sourced and References 113

curated editorial tags, in IEEE International Conference on Multime- dia and Expo Workshops, https://doi.org/10.1109/ICMEW.2013.6618436 (2013), accessed June 18, 2017. [42] K. Bischo, C. S. Firan, R. Paiu, W. Nejdl, C. Laurier, and M. Sordo, Music mood and theme classication: A hybrid approach, in Tenth Interna- tional Society for Music Information Retrieval Conference, edited by K. Hirata, G. Tzanetakis, and K. Yoshii (International Society for Music Information Retrieval, Victoria, Canada, 2009) pp. 657–662. 3 [43] Billboard, Spotify sees a way forward—by becoming more than a mu- sic service, https://www.billboard.com/biz/articles/6568964/spotify- sees-a-way-forward-by-becoming-more-than-a-music-service (2015), accessed 6 April 2017. [44] M. Schneiberg and G. Berk, From categorical imperative to learning by categories: Cost accounting and new categorical practices in Ameri- can manufacturing, 1900–1930, in Categories in Markets: Origins and Evolution, Research in the Sociology of Organizations, Vol. 31, edited by G. Hsu, G. Negro, and Ö. Koçak (Emerald Group, Bingley, United Kingdom, 2010) pp. 255–292. [45] E. H. Rosch, Principles of categorization, in Cognition and Catego- rization, edited by E. H. Rosch and B. B. Lloyd (Lawrence Erlbaum Associates, Hillsdale, NJ, 1978) pp. 27–48. [46] G. Carnabuci, E. Operti, and B. Kovács, The categorical imperative and structural reproduction: Dynamics of technological entry in the semiconductor industry, Organization Science 26, 1734 (2015). [47] G. Negro, M. T. Hannan, and H. Rao, Category reinterpretation and de- fection: Modernism and tradition in Italian winemaking, Organization Science 22, 1449 (2011). [48] G. Hsu, P. W. Roberts, and A. Swaminathan, Evaluative schemas and the mediating role of critics, Organization Science 23, 83 (2012). [49] H. Rao, P. Monin, and R. Durand, Border crossing: Bricolage and the erosion of categorical boundaries in French gastronomy, American Sociological Review 70, 968 (2005). [50] E. W. Zuckerman, The categorical imperative: Securities analysts and the illegitimacy discount, American Journal of Sociology 104, 1398 (1999). [51] J. A. Hampton, Inheritance of attributes in natural concept conjunctions, Memory & Cognition 15, 55 (1987). [52] E. E. Smith, D. N. Osherson, L. J. Rips, and M. Keane, Combining proto- types: A selective modication model, Cognitive Science 12, 485 (1988). [53] J. A. Hampton, Conceptual combination, in Knowledge, Concepts, and 114 Prototypes, Goals, and Cross-Classification

Categories, edited by K. Lambert and D. Shanks (MIT Press, Cambridge, MA, 1997) pp. 133–159. [54] E. H. Rosch and C. B. Mervis, Family resemblances: Studies in the internal structure of categories, Cognitive Psychology 7, 573 (1975). [55] P. Gärdenfors, Conceptual Spaces: The Geometry of Thought (MIT Press, Cambridge, MA, 2000). [56] P. Gärdenfors, The Geometry of Meaning: Semantics Based on Concep- 3 tual Spaces (MIT Press, Cambridge, MA, 2014). [57] E. G. Pontikes and M. T. Hannan, An ecology of social categories, Socio- logical Science 1, 311 (2014). [58] M. T. Hannan, G. Le Mens, G. Hsu, B. Kovács, G. Negro, L. Pólos, E. G. Pontikes, and A. J. Sharkey, Concepts and Categories: Foundations for Sociological Analysis, unpublished manuscript (2017). [59] K. J. Lancaster, A new approach to consumer theory, Journal of Political Economy 74, 132 (1966). [60] W. J. Baumol, Calculation of optimal product and retailer characteris- tics: The abstract product approach, Journal of Political Economy 75, 674 (1967). [61] G. C. Archibald and G. Rosenbluth, The “new” theory of consumer de- mand and monopolistic competition, Quarterly Journal of Economics 89, 569 (1975). [62] R. W. Shaw, Product proliferation in characteristics space: The U.K. fertiliser industry, Journal of Industrial Economics 31, 69 (1982). [63] K. J. Lancaster, The economics of product variety: A survey, Marketing Science 9, 189 (1990). [64] G. L. Murphy and D. L. Medin, The role of theories in conceptual coher- ence, Psychological Review 92, 289 (1985). [65] B. Loken and J. Ward, Alternative approaches to understanding the determinants of typicality, Journal of Consumer Research 17, 111 (1990). [66] L. W. Barsalou, Deriving categories to achieve goals, in The Psychology of Learning and Motivation: Advances in Research and Theory, Vol. 27, edited by G. H. Bower (Academic Press, Cambridge, MA, 1991) pp. 1–64. [67] L. W. Barsalou, Ideals, central tendency, and frequency of instanti- ation as determinants of graded structure in categories, Journal of Experimental Psychology: Learning, Memory, and Cognition 11, 629 (1985). [68] S. Ratneshwar, C. Pechmann, and A. D. Shocker, Goal-derived cate- gories and the antecedents of across-category consideration, Journal of Consumer Research 23, 240 (1996). [69] R. P. Bagozzi and U. M. Dholakia, Goal setting and goal striving in References 115

consumer behavior, Journal of Marketing 63, 19 (1999). [70] H. A. Simon, A behavioral model of rational choice, Quarterly Journal of Economics 69, 99 (1955). [71] J. G. March, Bounded rationality, ambiguity, and the engineering of choice, Bell Journal of Economics 9, 587 (1978). [72] E. C. Hirschman and M. B. Holbrook, Hedonic consumption: Emerg- ing concepts, methods, and propositions, Journal of Marketing 46, 92 (1982). 3 [73] G. L. Murphy and B. H. Ross, Use of single or multiple categories in category-based induction, in Inductive Reasoning: Experimental, De- velopmental, and Computational Approaches, edited by A. Feeney and E. Heit (Cambridge University Press, New York, NY, 2007) pp. 205–225. [74] G. L. Murphy and B. H. Ross, Uncertainty in category-based induction: When do people integrate across categories? Journal of Experimental Psychology: Learning, Memory, and Cognition 36, 263 (2010). [75] E. Konovalova and G. Le Mens, Feature inference with uncertain cat- egorization: Re-assessing Anderson’s rational model, Psychonomic Bulletin & Review, in press (2017). [76] P. Gärdenfors and F. Zenker, Conceptual spaces at work, in Applications of Conceptual Spaces: The Case for Geometric Knowledge Representa- tion, edited by P. Gärdenfors and F. Zenker (Springer, Berlin/Heidelberg, Germany, 2015) pp. 3–13. [77] G. Negro, M. T. Hannan, and H. Rao, Categorical contrast and audience appeal: Niche width and critical success in winemaking, Industrial and Corporate Change 19, 1397 (2010). [78] S. Verheyen, E. Ameel, and G. Storms, Determining the dimensionality in spatial representations of semantic concepts, Behavior Research Methods 39, 427 (2007). [79] R. F. J. Haans, C. Pieters, and Z.-L. He, Thinking about U: Theorizing and testing U- and inverted U-shaped relationships in strategy research, Strategic Management Journal 37, 1177 (2015). [80] L. W. Barsalou, Ad hoc categories, Memory & Cognition 11, 211 (1983). [81] G. Cattani, J. F. Porac, and H. Thomas, Categories and competition, Strategic Management Journal 38, 64 (2017). [82] Y.-N. Shih, R.-H. Huang, and H.-Y. Chiang, Background music: Eects on attention performance, Work: A Journal of Prevention, Assessment, & Rehabilitation 42, 573 (2012). [83] J. Edworthy and H. Waring, The eects of music tempo and loudness level on treadmill exercise, Ergonomics 49, 1597 (2007). [84] N. Anand and R. A. Peterson, When market information constitutes 116 Prototypes, Goals, and Cross-Classification

elds: Sensemaking of markets in the commercial music industry, Organization Science 11, 270 (2000). [85] P. J. DiMaggio, Classication in art, American Sociological Review 52, 440 (1987). [86] M. Montauti and F. C. Wezel, Charting the territory: Recombination as a source of uncertainty for potential entrants, Organization Science 27, 954 (2016). 3 [87] N. P. Mark, Culture and competition: Homophily and distancing ex- planations for cultural niches, American Sociological Review 68, 319 (2003). [88] N. P. Mark, Birds of a feather sing together, Social Forces 77, 453 (1998). [89] M. A. Glynn and M. Lounsbury, From the critics’ corner: Logic blending, discursive change, and authenticity in a cultural production system, Journal of Management Studies 42, 1031 (2005). [90] J. D. Smith and R. J. Melara, Aesthetic preference and syntactic proto- typicality in music: ’Tis the gift to be simple, Cognition 34, 279 (1990). [91] K. Hevner, Experimental studies of the elements of expression in music, American Journal of Psychology 48, 246 (1936). [92] R. E. Milliman, Using background music to aect the behavior of su- permarket shoppers, Journal of Marketing 46, 86 (1982). [93] R. E. Milliman, The inuence of background music on the behavior of restaurant patrons, Journal of Consumer Research 13, 286 (1986). [94] R. Yalch and E. Spangenberg, Eects of store music on shopping be- havior, Journal of Consumer Marketing 7, 55 (1990). [95] P. N. Juslin and J. A. Sloboda, Music and Emotion: Theory and Research (Oxford University Press, Oxford, United Kingdom, 2001). [96] Billboard, AllMusic.com folding into AllRovi.com for one-stop entertain- ment shop, https://www.billboard.com/biz/articles/news/1179058/ allmusiccom-folding-into-allrovicom-for-one-stop-entertainment- shop (2011), accessed 6 April 2017. [97] A. van Venrooij, The aesthetic discourse space of popular music: 1985– 86 and 2004–05, Poetics: Journal of Empirical Research on Culture, the Media, and the Arts 37, 315 (2009). [98] A. van Venrooij, A community ecology of genres: Explaining the emer- gence of new genres in the U.K. eld of electronic/dance music, 1985– 1999, Poetics: Journal of Empirical Research on Culture, the Media, and the Arts 52, 104 (2015). [99] D. Yang and W.-S. Lee, Music emotion identication from lyrics, in IEEE International Symposium on Multimedia, https://doi.org/10.1109/ISM. 2009.123 (2009), accessed June 18, 2017. References 117

[100] R. A. Peterson and D. G. Berger, Measuring industry concentration, diversity, and innovation in popular music, American Sociological Review 61, 175 (1996). [101] B. Kovács, G. R. Carroll, and D. W. Lehman, Authenticity and consumer value ratings: Empirical tests from the restaurant domain, Organization Science 25, 458 (2013). [102] S. W. Raudenbush and A. S. Bryk, Hierarchical Linear Models: Applica- tions and Data Analysis Methods (SAGE Publications, Thousand Oaks, 3 CA, 1992). [103] J. T. Lind and H. Mehlum, With or without U? The appropriate test for a U-shaped relationship, Oxford Bulletin of Economics and Statistics 72, 109 (2010). [104] L. S. Aiken and S. G. West, Multiple Regression: Testing and Interpreting Interactions (SAGE Publications, Thousand Oaks, CA, 1991). [105] D. A. Belsley, E. Kuh, and R. E. Welsch, Regression Diagnostics: Identi- fying Inuential Data and Sources of Collinearity (John Wiley & Sons, Hoboken, NJ, 1980). [106] S. M. Shugan, The cost of thinking, Journal of Consumer Research 7, 99 (1980). [107] P. Gärdenfors, Semantics, conceptual spaces, and the dimensions of music, in Essays on the Philosophy of Music, edited by V. Rantala, L. E. Rowell, and E. Tarasti (Philosophical Society of Finland, Helsinki, Finland, 1988) pp. 9–27. [108] N. Askin and M. Mauskapf, What makes popular music popular? Prod- uct features and optimal dierentiation in music, American Sociologi- cal Review 82, 910 (2017). [109] M. T. Hannan and J. Freeman, Structural inertia and organizational change, American Sociological Review 49, 149 (1984). [110] W. P. Barnett and E. G. Pontikes, The Red Queen, success bias, and organizational inertia, Management Science 54, 1237 (2008). [111] G. Le Mens, M. T. Hannan, and L. Pólos, Age-related structural inertia: A distance-based approach, Organization Science 26, 756 (2015).

II Logical Formalizations

119

4 Classification Systems as Concept Lattices

This chapter is based on a research paper written in collaboration with Willem Conradie, Sabine Frittella, Alessandra Palmigiano, Apostolos Tzimoulis, and Nachoem M. Wijnberg. An earlier version of this paper was presented at the 2016 Workshop on Logic, Language, Infor- mation, and Computation (Meritorious Autonomous University of Puebla) and published in Springer’s series Lecture Notes in Computer Science.

121 122 Classification Systems as Concept Lattices

4.1. Motivation Though the use of logical methods is perceived to be somewhat exotic in the social sciences [1], their application to organizational research is neither recent [e.g., 2] nor conned to the periphery of the discipline [3]. Many studies deployed logical formalization to access, repair, and improve the content of organizational theories: examples include Kamps and Pólos’ [4] reconstruction of Thompson’s classic propositions from Organizations in Action [5], and works by Hannan [6, 7], Péli [8, 9], and colleagues [10–12] in the context of organizational ecology. With respect to the research on categories, modal logic has been especially appreciated for its power to express the agents’ beliefs as well as their factual knowledge [13]. From 4 a purely mathematical perspective, much of the interdisciplinary success of modal logic is due to to the well-known and highly celebrated theory of correspondence [14, 15]. Indeed, the foundational results of this theory underpin the diusion of modalities in elds as diverse as computer science [16], articial intelligence [17], game theory [18, 19], and—most important to the aims of this chapter—economic sociology [20–22]. Correspondence theory originates from the observation that relational structures known as Kripke frames, which consist of tuples of sets and relations, serve as models for both rst-order sentences and modal for- mulas. A modal and a rst-order formula are said to correspond if they are valid in exactly the same class of Kripke frames. The correspondence theory developed by Sahlqvist [14] oers a method for computing the rst- order correspondent of certain special formulas, i.e., Sahlqvist formulas, and makes it possible to understand the “meaning” of a modal axiom in terms of the condition expressed by its rst-order correspondent. This is precisely the key that made modal logic an exceptionally intuitive tool: for instance, by allowing A → A to be understood as the reexivity axiom, and A → A as the transitivity axiom. In recent years, an encompassing perspective has emerged that, building on duality-theoretic insights [23], made it possible to export the state-of- the-art in Sahlqvist theory from the original context of modal logic to a wide spectrum of logics associated with algebras known as normal (dis- tributive) lattice expansions.1 In addition to intuitionistic and distributive lattice-based (normal modal) logics [24], this also extends to non-normal

1A normal (distributive) lattice expansion is a bounded (distributive) lattice endowed with operations of nite arity, where each coordinate is either positive, i.e., order-preserving, or negative, i.e., order-reversing. These operations are either nitely join-preserving (resp. meet-reversing) in their positive (resp. negative) coordinates, or nitely meet-preserving (resp. join-reversing) in their positive (resp. negative) coordinates. Motivation 123

(regular) modal logics [25], substructural logics [26], hybrid logics [27], and modal mu-calculi [28, 29]. Many applications were stimulated by this com- prehensive research program, some of which relate to the basic concerns of Sahlqvist theory, like the understanding of relationships between dier- ent methods for obtaining canonicity results [30–35]. Other applications concern the theory of nite lattices in universal algebra [36] and the theory of analytic calculi in structural proof theory [37, 38]. These results gave rise to a coherent framework termed unied correspondence [39]. The cornerstone of unied correspondence is the realization that the mechanisms underlying Sahlqvist’s correspondence [14] are algebraic, order- theoretic, and duality-theoretic. Just like the Sahlqvist theory for modal logic builds on the duality between Kripke frames and their associated 4 algebras, correspondence theory for lattice-based modal logic builds on the duality between perfect lattices and RS-polarities, rst suggested by Birkho [40] and later discussed by Gehrke [41]. Recent research [26] has shown that this duality can be expanded so as to add normal modal opera- tors on the side of the algebras and relations on the side of the polarities, thereby obtaining what we refer to as RS-frames. While this theory works excellently from a mathematical perpective, the resulting correspondences have proven dicult to understand intuitively. In this chapter, we propose a novel and intuitive interpretation of these mathematical results using the notion of categories or concepts as they are studied in psychology [42] and organization theory [13]. Our interpretation pivots on the dual view of categories as sets of objects and sets of features proposed by Ganter and Wille [43, 44] within the framework of Formal Concept Analysis (FCA). From this original perspective, the normal modal operators acquire a nat- ural epistemic interpretation that captures important properties like the factivity and the positive introspection of knowledge. The link between RS-frames and FCA is given by the fact that RS-frames arise from polarities [41], i.e., tuples (A, X , I ) such that A and X are sets, and I ⊆ A×X , and that polarities can be interpreted as formal contexts [44], which consist of objects a ∈ A, features x ∈ X , and a relation I connecting every object with the features it possesses. As noted by Birkho [40], any polarity induces a Galois connection between the powersets of A and X , the stable sets of which form a complete lattice. Indeed, by Birkho’s representation theorem, any complete lattice is isomorphic to one arising from some polarity. This representation theory for general lattices provides the polarity-to-lattice direction of the duality discussed in this chapter, and it is also at the heart of FCA, because the Galois-stable sets arising from formal contexts can be interpreted as formal concepts or categories. One 124 Classification Systems as Concept Lattices

of the most felicitous insights of FCA is that formal concepts are endowed by construction with a double interpretation: an extensional one, specied by the objects that are instances of the concept, and an intensional one, specied by the features shared by these objects. While the logical contribution of our proposal is to attain an intuitive grasp of a class of mysterious models with excellect mathematical proper- ties, the conceptual connection we establish can be used to better under- stand certain aspects of categorization in markets. In economic sociology, categories constitute collective identities for groups of individuals or orga- nizations, such as comedy actors [45] or nouvelle cuisine restaurants [46], as well as sets of products with some distinguishing characteristics, like 4 biodynamic wines [47] and light cigarettes [48]. The labels associated with these categories [49] are essential to economic decision-making because they induce expectations and default beliefs about the features of objects [13], on which boundedly rational agents like investors, consumers, and rm managers base their choices. This view is consistent with cognitive- psychological research on categorization (cf. Chapter3), according to which categories are cognitive sieves that preserve relevant distinctions and sup- press whatever information is deemed inconsequential or redundant [50]. Taken together, categories form classication systems, which consist of multiple levels of abstraction [51]. This agrees with the FCA treatment, ac- cording to which concepts arise embedded in their concept lattice. The extensional and intensional interpretations of concepts in FCA nd a very suitable ground for application because categories can be equivalently dened as groups of objects, which represent the category members, or as lists of features, which represent the membership requirements. As the organizational research on categories matured and progressed, organization scholars became increasingly interested in the more dynamic aspects of categorization [52]. Considerable attention has been devoted to the processes whereby new market categories are born, either ex ni- hilo or through the recombination of existing features [53], and to the role played by agents’ communication in these dynamic processes [54]. This is important because, even though market categories arise from factual information about products or organizations, a critical component of their nature cannot be reduced to sheer truth. Like all categories, market cate- gories are ultimately social constructs [55–57], and reasoning about them requires a peculiar combination of factual truth, subjective perception, and social interaction. This tripartition is the centerpiece of our formal proposal, which contributes to the eld of organization theory precisely by formalizing the objective, subjective, and social aspects of categorization. Motivation 125

In this chapter, we develop a formal theory where the agents are allowed to entertain idiosyncratic beliefs about the objects’ category memberships [cf. 58, 59]. We accomplish this by associating each agent with a binary relation R ⊆ A × X on the polarity (A, X , I ). Intuitively, the incidence relation I of the polarity is taken to represent factual relationships between objects and features, whereas the agent-specic relation R represents the subjective perception of this information, which can be partial or even grossly mistaken. For every object a ∈ A and every feature x ∈ X , we read aRx as “object a has feature x according to the agent.” By general order- theoretic facts, these relations induce normal modal operators that have a natural epistemic interpretation: namely, ϕ denotes the category ϕ as this is understood or perceived by the agent. In this new logical language, it is 4 easy to distinguish between factual information, encoded by the formulas of the modal-free fragment of the logic, and its subjective interpretation, encoded by the formulas where the modal operators occur. This language is expressive enough to capture agents’ beliefs, but also their beliefs about the beliefs of other agents, and so forth. We dene xed points of these iterations in a way similar to how common knowledge is de- ned in classical epistemic logic [60]. In our theory, these points represent convergence in a process of social interaction: for example, the consensus reached by a group of agents vis-à-vis the objects that have certain features and thus belong to a particular category. The addition of new objects or new features to the market can destabilize this consensus, triggering a new round of interaction that ultimately begets a new equilibrium. Hence, we trace the origin of classication systems both to factual information about the objects on the market and the features they possess, which can be updated via the addition of new elements to A and X , and to the agents’ idiosyncratic apprehension of this information, which can change even if the elements of A and X remain the same. Our exposition of this formal theory is structured as follows: In Section 4.2, we recall some preliminaries about perfect lattices, RS-polarities, gener- alized Kripke frames, and Formal Concept Analysis. In addition, we provide a brief technical explanation of the dual interpretation of RS-semantics. We assume familiarity with the basics of lattice theory [61]. In Section 4.3, we discuss how our algebraic semantic structure can be understood in terms of categories and classication systems. Further, we show that the normal modal operators on lattices can support an epistemic interpretation. We build on this interpretation to introduce a common knowledge-type construction that accounts for the emergence of categories by way of social interaction. In Section4.4, we discuss the strengths of our logical framework, 126 Classification Systems as Concept Lattices

propose its application to other elds of research, and describe possible extensions to be pursued in future study.

4.2. Preliminaries 4.2.1. Perfect Lattices and Birkhoff’s Theorem A bounded lattice Ì = (L, ∧, ∨, 0, 1) is complete if all subsets S ⊆ L have both a supremum Ô S and an inmum Ó S. An element a in Ì is completely join-irreducible if, for any S ⊆ Ì, a = Ô S implies a ∈ S. Complete meet- irreducibility is dened order-dually. The sets of completely join- and meet- irreducible elements of Ì are denoted by J ∞(Ì) and M ∞(Ì), respectively. 4 A complete lattice Ì is perfect if it is join-generated by its completely join-irreducibles, and meet-generated by its completely meet-irreducibles. That is, Ì is perfect if for any u ∈ Ì we have:

Ü ∞ Û ∞ {j ∈ J (Ì) | j 6 u} = u = {m ∈ M (Ì) | u 6 m} . (4.1)

Denition 4.2.1. A polarity is a triple Ð = (A, X , I ) where A and X are sets, and I ⊆ A × X is a relation. For every polarity Ð, we dene the functions (·)↑ (upper) and (·)↓ (lower)2 between the posets (P(A), ⊆) and (P(X ), ⊆) as:

↑ for U ∈ P(A) let U B {x ∈ X | [a (a ∈ U → aI x)} , (4.2) ↓ for V ∈ P(X ) let V B {a ∈ A | [x (x ∈ V → aI x)} . (4.3)

The two maps (·)↑ and (·)↓ form a Galois connection between (P(A), ⊆) and (P(X ), ⊆), i.e., V ⊆ U ↑ i U ⊆ V ↓ for all U ∈ P(A) and V ∈ P(X ). This connection has important and well-known consequences, including:

(a) The composition maps (·)↑↓ B (·)↓ ◦(·)↑ and (·)↓↑ B (·)↑ ◦(·)↓ are closure operators on (P(A), ⊆) and (P(X ), ⊆), respectively.3

(b) The set of all Galois-stable subsets of A, i.e., those U ∈ P(A) such that U ↑↓ = U , forms a complete sub-semilattice of (P(A), Ñ). Likewise, the set of all Galois-stable subsets of X , i.e., those V ∈ P(X ) such that V ↓↑ = V , forms a complete sub-semilattice of (P(X ), Ñ). We denote this semilattice by Ð+.

↑ 2In what follows, we simplify notation wherever possible and write a ↑ for {a} and x ↓ for ↓ {x } for every a ∈ A and x ∈ X . 3Recall that a closure operator on a poset (S, 6) is a map f : S → S, which is extensive ([a ∈ S[a 6 f (a)]), monotone ([a, b ∈ S[a 6 b ⇒ f (a) 6 f (b)]), and idempotent ([a ∈ S[f (a) = f (f (a))]). Preliminaries 127

(c) Because it is complete, the semilattice Ð+ is in fact a lattice, where meet is the set-theoretic intersection and join is the closure of the set-theoretic union.

Birkho [40] showed that every complete lattice is isomorphic to Ð+ for some polarity Ð. In the terminology of FCA, this complete lattice represents the concept lattice arising from Ð, i.e., all the tuples (C, D) such that C ⊆ A, D ⊆ X , D ↓ = C , and C ↑ = D. The concepts, i.e., the Galois-stable subsets of X and A, can be characterized as (members of) tuples (U ↑↓,U ↑) and (V ↓,V ↓↑) for any U ⊆ A and V ⊆ X . The sets C and D are respectively referred to as the “extension” and the “intension” of a concept. A polarity (A, X , I ) induces specialization pre-orders on A and X dened 4 as follows: x 6 y i [a(aI x → aI y ) for all x, y ∈ X , and a 6 b i [x(bI x → aI x) for all a, b ∈ A. Clearly, 6 ◦ I ◦ 6 ⊆ I . For every b ∈ A and z ∈ X , let z↑ B {x | z 6 x }, and b↓ B {a | a 6 b}. Lemma 4.2.2. z↑ and b↓ are Galois-stable for all b ∈ A and z ∈ X .

Proof. As the two parts of the proof are symmetric, we only prove the part ↓↑ concerning z. Let x ∈ z↑ and let us show that z 6 x. That is, let us x a such that aI z and show that aI x. Because I ◦ 6 ⊆ I , from aI z it follows ↓ ↓↑ that [y (z 6 y → aI y ), which means that a ∈ z↑ . Because x ∈ z↑ by assumption, this implies aI x, as required.  Corollary 4.2.3. z ↓↑ = z↑ and b ↑↓ = b↓ for all b ∈ A and z ∈ X .

Proof. Because z↑ is Galois-stable and contains z, and z ↓↑ is the smallest such set by denition, it follows that z ↓↑ ⊆ z↑. For the converse inclusion, let z 6 y and aI z. Because I ◦ 6 ⊆ I , this implies that aI y and thus that y ∈ z ↓↑, as required. 

In summary, the formal concepts generated by each a ∈ A and x ∈ X are dened as (a↓, a ↑) and (x ↓, x↑), respectively.

4.2.2. Duality with RS-Polarities As mentioned above, every complete lattice is isomorphic to Ð+ for some polarity Ð. When specializing to distributive lattices and Boolean algebras, well-known dualities exist between set-theoretic structures and perfect algebras. In particular, perfect distributive lattices are dual to posets, and perfect (i.e., complete and atomic) Boolean algebras are dual to sets. The question arises of which polarities are dual to perfect lattices. Gehrke [41] oered an answer in the form of reduced and separated polarities, or 128 Classification Systems as Concept Lattices

RS-polarities, by rephrasing the duality for perfect lattices [62] in a model- theoretic fashion. In this section, we recall what it means for a polarity to be reduced and separated, and briey explain how these two properties guarantee the perfection of the dual lattice. The route from perfect lattices to polarities is given by the following: Denition 4.2.4. For every perfect lattice Ì, the polarity associated with Ì is ∞ ∞ the triple Ì+ B (J (Ì), M (Ì), I+) where I+ is the lattice order 6Ì restricted to J ∞(Ì) × M ∞(Ì).

Denition 4.2.5. [cf. 41, Denitions 2.3 and 2.12] A polarity Ð = (A, X , I ) is

4 1. separated, if the following conditions are satised:

(S1) for all a, b ∈ A, if a , b then a ↑ , b ↑, (S2) for all x, y ∈ Y , if x , y then x ↓ , y ↓; 2. reduced, if the following conditions are satised:

(R1) for every a ∈ A, some x ∈ X exists such that a is 6-minimal in {b ∈ A | (b, x) < I }, (R2) for every x ∈ X , some a ∈ A exists such that x is 6-maximal in {y ∈ X | (a, x) < I }; 3. an RS-polarity, if it is reduced and separated.4

Let us denote S B {b | b ∈ A and b < a} = a↓ \ {a} for each a ∈ A. If Ð is + Ô separated, then a↓ is completely join-irreducible in Ð i b ∈S b↓ a↓ i ↑ Ñ ↑ a b ∈S b , that is, i some x ∈ X exists such that bI x for all b ∈ S and (a, x) < I . This corresponds to the rst reducing condition (R1). Similarly, the second reducing condition (R2) dually characterizes the notion that, for every x ∈ X , the subset x↑ is completely meet-irreducible in Ð+, represented as a sub meet-semilattice of P(X ). Proposition 4.2.6. [cf. 62, Proposition 4.7 and Corollary 4.9] For every perfect lattice Ì and RS-polarity Ð:

+ 1. Ì+ is an RS-polarity and (Ì+)  Ì,

+ + 2. Ð is a perfect lattice and (Ð )+  Ð. 4Gehrke [41] refers to RS-polarities as RS-frames. In this chapter, we distinguish between these two terms and reserve RS-frames for RS-polarities endowed with additional relations used to interpret operations on the lattice expansion. Preliminaries 129

This duality serves to generalize the Kripke semantics of modal logic to logics with possibly non-distributive propositional base. As in the dual correspondence between Kripke frames and complete atomic Boolean algebras with operators (BAOs), one would want a dual correspondence between perfect normal lattice expansions and RS-polarities endowed with additional relations. In previous work, Conradie and Palmigiano [26, Section 2] introduced a method for computing the denition of relations dually corresponding to normal modal operators for a certain modal signature consisting of unary and binary modal operators. We apply this method here to obtain an expansion L of the basic lattice language with a unary normal box-type connective , which is canonically interpreted on lattices endowed with a completely meet-preserving operation. 4 Taking the connection between the satisfaction relation in Kripke frames and the interpretation of modal formulas in BAOs as our guideline, let ¿ = (W , R) be a Kripke frame. From the satisfaction relation ⊆ W ×L be- tween states of ¿ and formulas, we dene an interpretation v : L → ¿+ into the complex algebra of ¿. This is an L-homomorphism, and it is obtained as the unique homomorphic extension of the equivalent functional represen- tation of the relation as a map v : Prop → ¿+, dened as v(p) = −1[p].5 This makes it possible to derive interpretations from satisfaction relations, so that for any a ∈ J ∞(¿+) and any formula ϕ,

a ϕ i a 6 v (ϕ) , (4.4) where, on the left-hand side of the condition, a ∈ J ∞(¿+) is identied with + a state of ¿ via the isomorphism ¿  (¿ )+. Conversely, consider a perfect lattice with completely meet-preserving operation ¼ = (Ì, ) and a homomorphic assignment v : L → ¼. Recall that the complete lattice Ì can be identied with the lattice Ð+ arising from some RS-polarity Ð = (A, X , I ). We want to dene a suitable relation R = R and satisfaction relation v that fullls Condition 4.4. Our method hinges on the dual characterization of v as a pair of relations ( v , v ) such ∞ ∞ that v ⊆ J (Ì) × L  A × L and v ⊆ M (Ì) × L  X × L. This dual characterization is established by induction on formulas. The base of the

5Notice that, in order for this equivalent functional representation to be well dened, we are required to assume that the relation is ¿+-compatible, i.e., that −1[p] ∈ ¿+ for every p ∈ Prop. In the Boolean case, every relation from W to LML is clearly ¿+-compatible, but this is not necessarily so in the distributive case because −1[p] needs to be an upward- or downward-closed subset of ¿. This gives rise to the persistency condition, e.g., in the relational semantics of intuitionistic logic. 130 Classification Systems as Concept Lattices

induction is clear: for every a ∈ J ∞(Ð+) and every p ∈ Prop ∪ {0, 1}, we have

a v p i a 6 v (p) . (4.5)

Let us now turn to the inductive step for the box. Because v : L → Ð+ + is a homomorphism, v(ϕ) = Ð v(ϕ). Suppose that Condition 4.4 holds for ϕ. Because Ð+ is perfect, v(ϕ) = Ó{x ∈ M ∞(Ì) | v(ϕ) ≤ x }. Therefore,

+ a 6 v (ϕ) i a 6 Ð v (ϕ) (4.6) Ð+ Û  ∞ + i a 6  x ∈ M Ð | v (ϕ) ≤ x (4.7) Û n Ð+ ∞ + o 4 i a 6  x | x ∈ M Ð and v (ϕ) 6 x (4.8) h ∞ + i i [x (x ∈ M (Ì) and v (ϕ) 6 x) → a 6 Ð x . (4.9)

At the end of this chain, we have equivalently reduced the whole information + on  to the information of whether a 6 Ð x for each a and x. Consequently, this can be taken as the denition of the relation R ⊆ A × X : we let aRx i + a 6 Ð x. To turn the last clause above into a satisfaction clause for , we rst ∞ + replace M (Ì) with X , which we identify via the isomorphism Ð  (Ð )+. Then, we recall the second relation v between elements of X and formulas obeying the following condition, which is to be dened by induction on the structure of the formulas, analogously to Condition 4.4:

x ϕ i v (ϕ) 6 x. (4.10)

These considerations produce the following satisfaction clause for :

a v ϕ i a 6 v (ϕ) (4.11)

i [x [(x ∈ X and x ϕ) → aRx] . (4.12)

The co-satisfaction relation deserves some additional comments. In the Boolean and distributive settings, is completely determined by and hence it is not mentioned explicitly. In the non-distributive setting, however, the relation needs to be dened along with . Condition 4.10 determines the base case:

y v (p) i v (p) 6 y . (4.13)

If we specialize the clause above to powerset algebras P(W ), then we have y V p i V (p) 6 y i V (p) ⊆ W /{x } for some x ∈ W i {x } * V (p) i Preliminaries 131 x < V (p) i x 1 p. This goes to show that the relation can be regarded as an upside-down description of the satisfaction relation , which we refer to as a co-satisfaction or refutation. The inductive step for the derivation of the co-satisfaction clause for  is: Ü ∞ v (ϕ) 6 x i {a ∈ J (Ì) | a 6 v (ϕ)} 6 x (4.14) ∞ i [a [(a ∈ J (Ì) and a 6 v (ϕ)) → a 6 x] (4.15) i [a [(a ∈ A and a ϕ) → aI x] . (4.16)

The last line follows from Condition 4.4 for ϕ, from the identication of ∞ + J (Ì) with A via the isomorphism Ð  (Ð )+, and from the identication of the lattice order 6 restricted to J ∞(Ì) × M ∞(Ì) with the relation I . 4

4.2.3. RS-Frames and Models In addition to the semantics for  discussed above, we dene relational semantics for a further expansion of L with a unary normal diamond-type connective _ and with two special sorts of variables i, j, termed nominals, and m, n, termed co-nominals. These semantics are the outcome of a dual characterization similar to the one used for . Denition 4.2.7. An RS-frame for an expansion L of the basic lattice lan- guage is a structure ¿ = (Ð, R), where Ð = (A, X , I ) is an RS-polarity and R ⊆ A × X so that the (pre-)images of singletons under R are Galois-closed, i.e., for every x ∈ X , R −1[x]↑↓ ⊆ R −1[x], where R −1[x] B {a | aRx }, and for every a ∈ A, R[a]↓↑ ⊆ R[a], where R[a] B {x | aRx }. The relations R satisfying this condition are termed RS-compatible.

The additional conditions on R are compatibility conditions, which guarantee that the following assignments respectively dene the operations  and _ associated with R on the lattice Ð+. Thus, for every U ∈ Ð+, we have: Ù − ↓ U B {R 1[x] | U ⊆ x }, (4.17) Ü ↑↓ _U B {R[a] | a ⊆ U }. (4.18)

Denition 4.2.8. For every RS-frame ¿ = (Ð, R), its complex algebra is the lattice expansion ¿+ B (Ð+, ) where  is dened as above. Lemma 4.2.9. 6 ◦ R ◦ 6 ⊆ R for every RS-frame ¿ = (Ð, R).

Proof. Assume that aR z and z 6 y . To show that y ∈ R[a], by the second compatibility condition it is enough to show that y ∈ R[a]↓↑. That is, let 132 Classification Systems as Concept Lattices

Table 4.1: Satisfaction and co-satisfaction relations on Í

Í, a 0 never Í, x 0 always Í, a 1 always Í, x 1 never

Í, a p i a ∈ V1(p) Í, x p i x ∈ V2(p) Í, a i i a ∈ V1(i) Í, x i i x ∈ V2(i) 4 Í, a m i a ∈ V1(m) Í, x m i x ∈ V2(m) Í, a ϕ ∧ ψ i Í, a ϕ and Í, a ψ Í, x ϕ ∧ ψ i for all a ∈ A, if Í, a ϕ ∧ ψ, then aI x Í, a ϕ ∨ ψ i for all x ∈ X , if Í, x ϕ ∨ ψ, then aI x Í, x ϕ ∨ ψ i Í, x ϕ and Í, x ψ Í, a ϕ i for all x ∈ X , if Í, x ϕ, then aRx Í, x ϕ i for all a ∈ A, if Í, a ϕ, then aI x Í, a _ϕ i for all x ∈ X , if Í, x _ϕ, then aI x Í, x _ϕ i for all a ∈ A, if Í, a ϕ, then aRx

us x b ∈ R[a]↓ and show that bI y . From b ∈ R[a]↓ and aR z it follows that bI z. Given that I ◦ 6 ⊆ I , bI z and z 6 y imply that bI y . The rest is proven in a similar fashion. 

An RS-model for L on ¿ is a structure Í = (¿, v) such that ¿ is an RS- frame for L and v is a variable assignment mapping each p ∈ Prop to a pair (V1(p),V2(p)) of Galois-stable sets in P(A) and P(X ), respectively. Given a model for the expanded language with _, nominals, and co-nominals, the variable assignments also map nominals j to (j ↑↓, j ↑) for some j in A, and co-nominals m to (m ↓, m ↓↑) for some m in X . Table 4.1 presents the recursive denitions of the satisfaction and co-satisfaction relations on Í. The following lemma is easily proven by simultaneous induction on ϕ and ψ using these denitions. The base cases for 0 and 1 use conditions (R1) and (R2), whereas those for proposition letters, nominals, and co-nominals follow from the way valuations are dened.

Lemma 4.2.10. For all formulas ϕ and ψ, it holds that: Preliminaries 133

1. Í, a ϕ i for all x ∈ X , if Í, x ϕ then aI x,

2. Í, x ψ i for all a ∈ A, if Í, a ψ then aI x.

An inequality ϕ 6 ψ is true in Í, denoted by Í ϕ 6 ψ, i, for all a ∈ A and all x ∈ X , if Í, a ϕ and Í, x ψ then aI x. Remark 4.2.11. It follows from Lemma 4.2.10 that Í ϕ 6 ψ i, for all a ∈ A, if Í, a ϕ then Í, a ψ. It also follows that Í ϕ 6 ψ i, for all x ∈ X , if Í, x ψ then Í, x ϕ. We will nd these equivalent characterizations of truth in Í useful when discussing examples.

4.2.4. Standard Translation 4 As in the Boolean case, each RS-model Í for L can be viewed as a rst- order structure, albeit two-sorted. Accordingly, we dene correspondence languages. Let L1 be the two-sorted rst-order language with equality built over the denumerable and disjoint sets of individual variables A and X, with the binary relation symbol I , R, and two unary predicate symbols P1, P2 for each p ∈ Prop. The intended interpretation links P1 and P2 in the way suggested by the denition of L-valuations. Every p ∈ Prop maps to a pair (V1(p),V2(p)) of Galois-stable sets, as explained in Section 4.2.3. The interpretation of pairs (P1, P2) of predicate symbols is thus restricted to pairs of Galois-stable sets, and the interpretation of universal second-order quantication is also restricted to range over such sets. We assume that L1 contains denumerably many individual variables i, j ,..., which correspond to the nominals i, j,... ∈ Nom, and denumerably many individual variables n, m,..., which correspond to the co-nominals n, m,... ∈ CoNom. Let L0 be the sub-language that does not contain the unary predicate symbols corresponding to the propositional variables. Ta- ble 4.2 presents the recursive denitions of the standard translation of L+ into L1. In reading the table, recall that a 6 j abbreviates [x(j I x → aI x) and m 6 x abbreviates [a(aI m → aI x). Lemma 4.2.12. [cf. 26, Lemma 2.5] For any L-model Í and any L+-inequality ϕ 6 ψ, it holds that

Í ϕ 6 ψ i Í |= [a[x [STa (ϕ) ∧ STx (ψ) → aI x] (4.19) i Í |= [a [STa (ϕ) → STa (ψ)] (4.20) i Í |= [x [STx (ψ) → STx (ϕ)] . (4.21)

By virtue of this translation, RS-frames provide an algebraically motivated generalization of correspondence theory. As anticipated at the beginning 134 Classification Systems as Concept Lattices

Table 4.2: Standard translation on RS-frames

STa (0) B a , a STx (0) B x = x STa (1) B a = a STx (1) B x , x

STa (p) B P1(a) STx (p) B P2(x) STa (j) B a 6 j STx (j) B j I x 4 STa (m) B aI m STx (m) B m 6 x

STa (ϕ ∧ ψ) B STa (ϕ) ∧ STa (ψ) STx (ϕ ∧ ψ) B [a[STa (ϕ ∧ ψ) → aI x] STa (ϕ ∨ ψ) B [x[STx (ϕ ∨ ψ) → aI x] STx (ϕ ∨ ψ) B STx (ϕ) ∧ STx (ψ)

STa (ϕ) B [x[STx (ϕ) → aRx] STx (ϕ) B [a[STa (ϕ) → aI x] STa (_ϕ) B [x[STx (_ϕ) → aI x] STx (_ϕ) B [a[STa (ϕ) → aRx]

of this chapter, one of the objectives of our study is to understand whether such generalized environment retains some of the intuition that made Kripke semantics and modal logic so appealing and well-suited for a num- ber of interdisciplinary applications, including the study of categories in organization theory [13]. Let us start with the inequality 0 6 0, which corresponds on Kripke frames to the condition that every state has a successor.

0 6 0 i [a [STa (0) → [x (STx (0) → aI x)] (4.22) i [a [[y (y = y → aR y ) → [x (x = x → aI x)] (4.23) i [a [[y (aR y ) → [x (aI x)] (4.24) i [a\y (¬ (aR y )) . (4.25)

To justify the last equivalence recall that, by denition, no object a veries [x(aI x) in an RS-polarity (cf. Denition 4.2.5). Therefore, the condition in the penultimate line is true precisely when the premise of the implication, namely [y (aR y ), is false. This means that every state is not R-related to Preliminaries 135 some co-state. The condition on Kripke frames is recognizable given the suitable insertion of negations. Next, let us consider the inequality p 6 p, which corresponds on Kripke frames to the condition that R is reexive.

[p (p 6 p) i [m (m 6 m) (4.26) i [a[m [STa (m) → STa (m)] (4.27) i [a[m (aRm → aI m) . (4.28)

By denition, STa (m) = aI m and STa (m) = [y (m 6 y → aR y ) can be rewritten as m↑ ⊆ R[a], which is equivalent to aRm because R ◦ 6 ⊆ R 4 (cf. Lemma 4.2.9). To recognize the connection with the usual reexivity condition, observe that [a[m(aRm → aI m) is equivalent to R ⊆ I , and the reexivity of a relation R ⊆ A × A can be written as Id B ⊆ R, where for c c every A, IdA B {(a, a) | a ∈ A}, which is equivalent to R ⊆ Id . Clearly, p 6 p implies p 6 p. Let us then consider the converse inequality, which in the classical setting corresponds to transitivity:

[p (p 6 p) i [m (m 6 m) (4.29) i [a[m (STa (m) → STa (m)) (4.30)  − ↑  i [a[m aRm → R 1 [m] ⊆ R [a] . (4.31) where   STa (m) = [y STy (m) → aR y (4.32) = [y [[b (STb (m) → bI y ) → aR y ] (4.33) = [y [[b (bRm → bI y ) → aR y ] (4.34) − ↑ = R 1 [m] ⊆ R [a] . (4.35)

Although it is possible to retrieve the transitivity condition in this new inter- pretation, already with a relatively simple inequality like p 6 p this is hardly useful to gain a better understanding of this semantics, because the accessibility relation on states is encoded here into a “non-inaccessibility” relation between states and co-states. As a result, the condition quickly becomes awkward and unintuitive. In the next section, we propose a con- ceptual interpretation grounded in organization theory and show that better results can be achieved by taking this condition as primitive rather than as the generalization of some other semantics. 136 Classification Systems as Concept Lattices

4.3. Application to Organization Theory 4.3.1. Categorization via RS-Semantics At rst sight, the mathematical framework presented above may seem far too abstract to capture notions of practical relevance to the sociology of markets: however, this framework can be easily understood by organization scholars as the order-theoretic representation of a classication system. The value of this abstraction goes beyond the mere appreciation of formal theory—though this too can sometimes be helpful [1, 3]—because thinking of classication systems as RS-frames allows one to better understand how market categories arise from a mix of factual information, subjective beliefs, and interaction the agents that populate the market. 4 The cornerstone of this sociological application of RS-frames is the notion, also core to FCA, that polarities (A, X , I ) can be taken to represent databases where A is a set of objects, such as products or organizations in a market, X is a set of features that the agents nd relevant to categorization, and I encodes whether a particular object possesses a particular feature. The specialization pre-order on objects a 6 b can thus be interpreted as “a is at least as specied as b,” meaning that a has at least all the feature of b. The pre-order on features x 6 y , instead, can be interpreted as “y is at least as generic as x,” meaning that any object with x is bound to have y as well. With this interpretation in mind, the reducing and separating conditions (cf. Denition 4.2.5) that Ð must fulll in order to dually correspond to a perfect lattice Ð+ can be rephrased as follows:

(S1) any two objects can be told apart by some feature;

(S2) for any two features, there is an object having one but not the other;

(R1) for any object a, if there are strictly more specied objects than a (i.e., objects that have all the features of a but also some more), then all of these objects share some feature x which a does not have;

(R2) for any feature x, if there are strictly more generic features than x (i.e., features shared by all the objects with x but also by others), then some object a exists that has all of these features but not x.

Conditions (S1) and (S2) are intuitive and do not require much expla- nation. They will be satised as long as no two objects have exactly the same features and no two features are shared by exactly the same objects. Condition (R1) can be enforced by adding ad hoc features to the database. For example, consider a market for songs with the following three features: Application to Organization Theory 137 x B “composed in D minor,” y B “sung in mezzo-soprano,” and z B “lyrics with political undertones.” Suppose that three songs a, b, c share feature x but b also has y and c also has z. Such a database violates (R1), because b and c are strictly more specied than a but they do not have features in common that a does not also possess. This can be remedied by adding a feature w B “cover art dierent from a”. Although this feature may be utterly irrelevant to the agents, it does encode a factual truth about the objects and thus it can be safely added to the database. In other words, its relationships with the objects are encoded by I but they need not be encoded by R. Finally, (R2) can be enforced by removing particular features when they are the intersection of two, more generic features: for example, if the database included x B “composed in D minor,” y B “sung in mezzo- 4 soprano,” and z B “composed in D minor and sung in mezzo-soprano.” In this case, (R2) would be violated because no object exists that has both x and y while not having z. This can be easily remedied by excluding z from the database. Removing such features can always be done without loss of descriptive power: in point of fact, we can always enforce the separating and reducing conditions because the nite polarities we consider are a subclass of doubly founded polarities, for which this operation is always possible [43]. In Chapter5, we will remove the RS-conditions so as to ac- commodate any polarity, but for the moment we restrict our consideration to those where these conditions are enforced. Figure 4.1 provides a visual example of a small database that abides by the RS-conditions, as well as the corresponding RS-polarity and perfect lattice. We propose to understand the lattice Ð+ generated by the RS-polarity Ð as the partially ordered collection of all the candidate categories in the market. Formally, each element of Ð+ is a set of objects that is completely identied by a set of features: any object with these features is a member of the candidate category. We refer to these categories as “candidate” because they are purely implicit in the database and do not necessarily enjoy social recognition. Only a very limited subset of candidate categories will support the interpretation of real categories, i.e., those that are actually deemed meaningful by agents.6 Only these categories, which of course are much fewer than the set of all candidate categories the agents could use to sort objects and features, will be assigned a category label.

6Notice that our denition of “real” categories is not the same as the one given by Hsu and Hannan [55, p. 478] for “sociologically real categories.” Sociologically real categories are those where membership has tangible consequences in terms of competitive outcomes [e.g., 47]. The categories we dene as “real” are merely those whose intension and extension are agreed upon by the agents. In Chapter5, we will further rene this denition. 138 Classification Systems as Concept Lattices

x y z a • b •• c • d ••

4 x y z

a b d c

({a, b, c, d}, ∅)

({a, b}, {x}) ({c, d}, {z})

({b, d}, {y }) ({b}, {x, y }) ({d}, {y, z})

(∅, {x, y, z})

Figure 4.1: Database, RS-polarity, and perfect lattice Application to Organization Theory 139

The labels of real categories can be attached to candidate categories by means of an assignment v that links each atomic category label p ∈ Prop to a category viewed both extensionally as V1(p) ⊆ X and intensionally as 7 V2(p) ⊆ A. Notice the perfect match between the encoding of the meaning of atomic propositions on Kripke models and that of atomic category labels on RS-models: on Kripke models, the meaning of atomic proposition p is given as the set of states at which p holds true; on RS-models, the meaning of atomic category label p is given as the set of objects that constitute the extension of p, or equivalently, as the set of features that constitute the intension of p. For convenience, we refer to a category’s intension as its description, and we say that a feature describes a category if it belongs to the category’s description. Given this assignment, the database is endowed 4 with the structure of an L-model Í in such a way that, for any formula or category label ϕ ∈ L, any object a ∈ A, and any feature x ∈ X , the expressions Í, a ϕ and Í, x ϕ respectively read as “object a is a member of category ϕ” and “feature x describes category ϕ.” One advantage of this conceptualization is that it provides an intu- itive way to understand from rst principles rather than as the negative counterpart of . Another advantage concerns the understanding of the connectives ∧ and ∨ in the general lattice environment. These operators are useful to identify categories that result from conceptual combinations [63–66]: there is a problem, however, in that their standard interpretation as conjunction and disjunction is unintuitive because distributivity seems to be hardwired in the way we understand “and” and “or” in natural language. In our framework, the satisfaction and co-satisfaction clauses for ∧ and ∨ formulas are as follows (cf. Table 4.1):

Í, a ϕ ∧ ψ i Í, a ϕ and Í, a ψ (4.36) Í, x ϕ ∧ ψ i for all a ∈ A, if Í, a ϕ ∧ ψ then aI x (4.37) Í, a ϕ ∨ ψ i for all x ∈ X , if Í, x ϕ ∨ ψ then aI x (4.38) Í, x ϕ ∨ ψ i Í, x ϕ and Í, x ψ. (4.39)

These denitions imply that the category ϕ ∧ ψ is the one whose exten- sion corresponds to the intersection of the extensions of ϕ and of ψ. In other words, the members of ϕ ∧ ψ are the objects that belong both to ϕ and to ψ. These satisfy at least the descriptions of ϕ and of ψ, and therefore the description of ϕ ∧ ψ contains at least the union of the two descriptions, but it will usually contain additional features. For example, the category black cars ∧ luxury cars includes all the cars that qualify as black and

7 ↓ ↑ Recall that, for such an assignment, V1(p) = V2(p) and V2(p) = V1(p) . 140 Classification Systems as Concept Lattices

luxury. However, suppose that every member of black cars ∧ luxury cars has leather seats, which is neither the case for every black car nor for every luxury car: in this case, having leather seats will be part of the description of black cars ∧ luxury cars and hence this description is strictly larger than the union of the descriptions of black and of luxury. Conversely, the category ϕ ∨ ψ is the one whose description is equal to the intersection of the descriptions of ϕ and of ψ. Because in this case the objects are only required to possess the features common to ϕ and to ψ, the extension of ϕ ∨ ψ will include at least the union of the two extensions, but it will usually contain additional objects. For example, the category black cars ∨ luxury cars includes cars of other colors in addition 4 to black, because color is not a dening feature of luxury cars, and of other vehicle classes in addition to luxury (e.g., sedan, economy, o-road), because pertaining to a particular vehicle class is not a dening feature of black cars. That is, this category includes virtually every car. This interpretation of the connectives ∧ and ∨ makes it extremely easy to understand why distributivity fails. Suppose that cars were divided into three colors: black, red, and white. A member of the composite category black cars ∨ (white cars ∧ red cars) must have all the features in the de- scription of black cars, because white cars ∧ red cars is so general that it does not add anything to this list. However, this is not the same as having membership in (black cars ∨ white cars) ∧ (black cars ∨ red cars), because both black cars ∨ white cars and black cars ∨ white cars include every car and thus their composition via ∧ does not restrict the extension of the resulting category to cars that are black. With this working understanding of and , we can recognize the nor- mal box-type operator on Ð+ as capturing the beliefs of individual agents vis-à-vis the objects or the features that belong to a given category. Consis- tently with this, we read Í, a ϕ as “a belongs to ϕ according to the agent,” and Í, x ϕ as “x describes ϕ according to the agent.” The normality conditions > = > and (ϕ ∧ψ) = ϕ ∧ψ can be understood as rationality requirements, i.e., the agent correctly recognizes the (uninformative) cate- gory > as such, and her understanding of the greatest common subcategory of any two categories ϕ and ψ is the greatest common subcategory of the categories she understands as ϕ and ψ. On the database side, the agent’s subjective perception of the incidence between objects and features is modeled by a relation R ⊆ A × X , so that aRx intuitively reads “object a has feature x according to the agent.” Unsurprisingly, the additional properties of R (cf. Lemma 4.2.9) can also be viewed as rationality requirements: if aRx then aR y for every y > x, Application to Organization Theory 141 which means that, if the agent attributes feature x to object a then she also attributes to a all the features that follow from x. Likewise, if aRx then bRx for every b 6 a, which means that if the agent attributes x to a then she also attributes x to all the objects that possess at least the same features as a. Analogously to the classical case, two modal operators  and _ are associated with the same relation R, but unlike the classical case these operations are not dual to one another in the sense of, e.g., _B ¬¬; instead, they are adjoints, which means that, for all u, v ∈ Ð+, _u 6 v i u 6 v. Therefore, rather than encoding the dual perspective on the subjective information encoded by , _ encodes exactly the same perspective as as , except that _ is geared towards the objects whereas  is geared towards the features. If we denote by j and m the categories 4 generated by object j and feature m, respectively, for every j , m we have:

_j 6 m i j Rm i j 6 m. (4.40)

As a result, the information j Rm, which reads “the agent attributes feature m to object j ,” is equivalently encoded on the side of categories by saying that m describes the category _j, i.e., the one the agent understands as the category generated by j , and by saying that j is a member of the category m, i.e., the one the agent understands as the category generated by m. With regard to the clauses of the recursive denition of and , Í, a ϕ is the case i, for all the features x ∈ X , if Í, x ϕ then aRx. That is, object a is recognized by the agent as a member of ϕ i the agent attributes to a all the features that belong to the description of ϕ. Similarly, Í, x ϕ is the case i, for all the objects a ∈ A, if Í, a ϕ then aI x. That is, feature x pertains to the description of ϕ according to the agent i x is shared by every object the agent recognizes as a member of ϕ. It is worth considering how two modal axioms that are relatively common in epistemic logic, namely those referred to at the beginning of this chapter as reexivity (p 6 p) and transitivity (p 6 p), can be interpreted in our logical framework. The axiom p 6 p is interpreted epistemically as the factivity of knowledge, which means that “if the agent knows p then p is true.” The rst-order correspondent of the factivity axiom on RS- frames is [a[x(aRx → aI x). This expresses a form of factivity because it requires that, whenever the agent attributes feature x to object a, then it is indeed the case that x is a feature of a. The axiom p 6 p is interpreted epistemically as the positive introspection of knowledge: “if the agent knows p then she knows that she knows p.” The rst-order correspondent of the positive introspection axiom on RS-frames is [a[m(aRm → R −1[m]↑ ⊆ 142 Classification Systems as Concept Lattices

R[a]).8 This means that, if an agent attributes feature m to object a, then she also attributes to a all the features shared by the objects to which she attributes m. To better understand the link between this condition and the positive introspection of knowledge, consider the category m, i.e., the category the agent understands as generated by feature m.9 This category can be identied with the tuple (R −1[m], R −1[m]↑): that is, the members of m are the objects to which the agent attributes m10 and the description of m is the set of the features that the objects in R −1[m] have in common. By denition, bI z for every b ∈ R −1[m] and z ∈ R −1[m]↑. The rst-order corre- spondent of p 6 p requires that bR z for such b and z. Therefore, while 4 factivity corresponds to R ⊆ I , i.e., what the agent believes is objectively the case, positive introspection yields the reverse inclusion restricted to objects and features within “boxed categories,” i.e., the agent is aware of the features shared by the objects in the categories she knows. Whether factivity and positive introspection are incorporated into this framework or not, the result of our characterization is a formal language that can describe categorization in real-world contexts while fully preserving the distinction between categories that are only possible (i.e., candidate) vs. those that are actually meaningful (i.e., real). Further, this system accounts for conceptual combination through dierent mechanisms, ∧ and ∨, which respectively return the greatest common subordinate category and the smallest common superordinate category of any two given categories. All of this is achieved without imputing excessive computational ability to the agents, who can be aware of only a small number of objects and features and even be wrong in attributing certain features to objects. Although dierent agents are not required to agree as to which objects or features constitute a category, it is reasonable to presume that they derive utility from coordination [cf. 54, 67] and thus seek to reach some sort of consensus with regard to their category denitions. To this end, they can interact with one another and learn about each other’s beliefs. We now turn to characterizing this interaction, which culminates in the identication of a subset of categories whose intension and extension are agreed upon by the agents. These categories are assigned a label [cf. 49] and become the build- ing blocks of discourse. Thus emerges a classication system from three fundamental determinants: factual information, subjective perception, and social interaction.

8Recall that R −1[x] B {a | aRx } (cf. Denition 4.2.7). 9The same argument would hold more generally for any category ϕ. 10Recall that R −1[m] is a Galois-stable set (cf. Denition 4.2.7). Application to Organization Theory 143

4.3.2. Category Emergence In the organizational literature, category emergence refers to the sociocog- nitive mechanism whereby groups of objects (or lists of features) turn from being merely possible to being actually meaningful to agents in a market [22]. Researchers’ interest in the processes through which new market categories are born [e.g., 68] has grown substantially in recent years, as it has become increasingly important to explain how classication systems change over time. As a rst step toward this purpose, we oer an account of category emergence as the outcome of social interaction [54, 69]. For simplicity we consider a setting with only two agents, but the theory can be extended to include any nite number. In this minimal setting, we introduce a bimodal logic L that extends the 4 basic normal lattice expansion logic with two unary normal box-type modal operators 1 and 2 as well as the axioms i p 6 p and i p 6 ii p for 1 6 i 6 2. Models for this logic are structures (Ð, R1, R2, v) such that

(a) Ð = (A, X , I ) is an RS-polarity;

(b) Ri ⊆ A × X for 1 6 i 6 2 such that the following conditions hold:

−1 ↑↓ −1 1. [x(Ri [x] ⊆ Ri [x]), ↓↑ 2. [a(Ri [a] ⊆ Ri [a]),

3. Ri ⊆ I , −1 ↑ 4. [a[x(aRi x → Ri [x] ⊆ Ri [a]); (c) and v is an assignment that associates each p ∈ Prop to an element + of Ð , viewed both extensionally as V1(p) ⊆ A and intensionally as ↓ ↑ V2(p) ⊆ X , in such a way that V1(p) = V2(p) and V2(p) = V1(p) .

We proceed by implementing a construction reminiscent of what classical epistemic logic refers to as “common knowledge” [60]. When applied to our framework, this construction gives an expansion LC of the bimodal lattice expansion logic above with a normal box-type operator C . The interpretation of this modal operator on Ð+, given the additional axioms, is as follows: for any u ∈ Ð+, Û C (u) B su, (4.41) s ∈S where S is the set of all compound modalities of the forms (ij )n and i (j i )n , for 1 6 i , j 6 2 and for some n ∈ Î. 144 Classification Systems as Concept Lattices

Lemma 4.3.1. C (u) 6 u and C (u) 6 C (C (u)) for any u ∈ Ð+.

Proof. Clearly, C (u) 6 1u 6 u, which proves the rst inequality. ! Û Û Û Û Û Û 0 C (C (u)) = sC (u) = s tu = stu > s u = C (u) (4.42) s ∈S s ∈S t ∈S s ∈S t ∈S s0 ∈S 

Let RC , Rs ⊆ A × X for any s ∈ S be dened as follows: aRs x i a 6 sx Ñ and aRC x i a 6 C (x). Clearly, RC = s ∈S Rs . In the standard setting of epistemic logic, the accessibility relations associated with the agents do 4 not encode their knowledge but rather their uncertainty: therefore, the relation associated with the common knowledge operator is dened as the reexive transitive closure of the union of the relations associated with individual agents. Normally, this is bigger than any relation associated with an individual agent. In our setting, however, the relations associated with each agent encode what the agents positively know, rather than what they are uncertain about. For this reason, the common knowledge relation RC is the intersection of the relations Rs encoding the nite iterations, which is normally smaller than any individual agent’s relation. Because C and every s ∈ S are compositions of normal box-operators, they are themselves normal box-operators. Consequently, the relations RC and Rs to which they give rise are RS-compatible (cf. Denition 4.2.7). The correspondence reductions discussed in Section 4.2.4 can thus be applied to C and RC , yielding the following:

Lemma 4.3.2. The relation RC dened above veries the following condi- −1 ↑ tions: RC ⊆ I , and [a[x(aRC x → RC [x] ⊆ RC [a]). + For any category label ϕ, the category C (ϕ) = Ó{C (m) | ϕÐ 6 m}. In what follows, we restrict our attention to categories C (m) for some feature m ∈ X . The extension of C (m) comprises the objects belonging to −1 Ñ −1 RC [m] = ( s ∈S Rs ) [m], whereas the description of C (m) comprises the −1 ↑ Ñ −1 ↑ features belonging to RC [m] = (( s ∈S Rs ) [m]) . The category C (m) can be understood as a socially constructed category, for which the category members (i.e., the extension) and the membership requirements (i.e., the intension or description) are agreed upon by the agents. Clearly, there are fewer such categories than there are candidate categories in the database. This is consistent with the intuition that not all the possible categories whereby the objects or the features could be sorted are actually meaningful to decision-makers in a market [e.g., 70]. Discussion 145

We conclude this section with an important consideration about our interpretation of C . Although we describe category emergence as a so- ciocognitive process [cf. 69], our formalization does not (yet) fully capture epistemic dynamicity. In fact, while we characterize agents as inclined to social interaction, we do not (yet) account for the fact that interaction can amount to more than a simple exchange of beliefs. Through communica- tion, the agents may also end up inuencing each other’s perspectives: for example, by convincing their peers that certain beliefs are more accurate or simply more useful [cf. 54]. Hence, their epistemic iterations may not only lead to the identication of categories whose meaning is shared but also to updates in beliefs. In our discussion, we will explain how these dynamics can be incorporated in our theory. 4

4.4. Discussion In this chapter, we presented a formal theory of classication systems that extends the basic framework of Formal Concept Analysis [44] building on the idea that a database with information about objects and features can be “coerced” into an RS-polarity and hence an RS-frame by enforcing the RS-conditions [cf. 41], which allows for an algebraic representation of clas- sication systems as perfect lattices [40]. As in FCA, this lattice represents the hierarchy of formal concepts generated by objects and features in the database; unlike FCA, however, our theory accounts for the possibility that the agents may have incomplete perceptions of objects, features, and their incidence relation. As a result, they may have idiosyncratic views of the perfect lattice. Moreover, we fashioned an epistemic framework in which some of the concepts or categories are recognized by every agent and through social interaction they are found to be consensual, but some are recognized only by a few agents and some are not recognized by at all, and thus they remain fully implicit in the system. By allowing for such dierent levels of acknowledgment, our theory captures the distinction between categories that are meaningful and those that are merely possible. The contribution of this chapter is twofold: To logicians, we oer an original interpretation of RS-semantics in terms of agents’ reasoning about objects, about features, and about categories induced by their accompany- ing relation. We showed that RS-semantics lend themselves well to such an epistemic interpretation, and that this interpretation, in turn, enables a much more intuitive understanding of RS-semantics. To the organiza- tion theorists, instead, we oer a logical framework capable of capturing key aspects of categorization in markets, such as the subjective nature of 146 Classification Systems as Concept Lattices

category representations [59], the role played by audience interaction [54], and the sociocognitive mechanism underlying category emergence [69]. As this framework allows one to compute the consequences of adding or re- moving objects and features, whether these are new or already associated with some other category, it allows for a very ne-grained analysis of the changes induced by dierent kinds of innovation [cf. 71, 72]. Although our proposal has a distinctly epistemic character, it diers from standard epistemic logic in at least two respects. First, the relations used to interpret the epistemic operators are intended to capture positive knowledge rather than uncertainty. Second, these relations link objects to features and vice versa rather than possible worlds to one another. We 4 considered two classical principles of epistemic logic, namely factivity and positive introspection of knowledge, and using correspondence theory [26], we computed the relational properties that correspond to these principles. These are necessary and sucient conditions on the agents’ subjective perception of the incidence relation between objects and features that ensure the agents’ understanding of categories veries these common epistemic principles. These conditions may or may not be added to the system, depending on the computational ability and degree of access to reality one wishes to impute to the agents. Many research directions connected to epistemic logic are available for further study: in addition to most standard logical questions concerning axiomatization, proof system, decidability, and complexity, one can ask what is the meaning of other classical epistemic principles, like negative introspection, in our original setting. It is also reasonable to ask whether additional principles exist that should be included in a minimal logic of categorization. Because this chapter is a foray into the use of RS-frames to model agents’ reasoning about categories in real-world contexts, it remains quite general in its assumptions. To be of greater practical relevance, our formalism should be specialized to individual elds of research where categorization is important. These are not necessarily related to the study of organizations or economic activity: for example, RS-frames can also be useful in the study of natural language semantics. This is because the assignments of RS- models support a notion of meaning dierent from the one normally used in classical modal logic, but this is arguably closer to what categories represent in natural language, that is, semantic groups that can be specied both intensionally and extensionally. In natural language semantics, linguistic utterances are assigned a meaning in the same spirit, generalizing the truth- based semantics of sentences. Categorization is key to the construction of meaning because each word is associated with a category: for this reason, References 147 exploring systematic connections between categories and natural language can be a very fruitful endeavor. Categories are also central to the study of knowledge representation. Description logics [73] are the dominant paradigm for logical reasoning in this context, but our formalism can provide a complementary perspective on the formal ontologies, classication systems, and taxonomies studied in this eld. In particular, the non-distributive nature of category combi- nation thorugh ∧ and ∨ and the double interpretation of categories as sets of objects and set of features are foreign to description logics, but they can enhance our understanding of formal ontologies. The question remains of whether some of the expressive features of description logic, e.g., uniqueness quantication and qualied cardinality restrictions, can 4 be accommodated in our framework. In conclusion, we suggest two possible extensions that concern the application of our framework to the study of categories in organization theory. First, it is necessary to account for the fact that the membership of products and organizations in particular categories is not necessarily crisp: instead, it is often a matter of degree [74]. This calls for quantitative, possibly many-valued generalizations of our semantics. Second, agents’ beliefs are not necessarily static: they continuously change as new objects, new features, and new combinations of existing features appear on the market. While our theory accounts for category emergence given a set of agents with certain beliefs, it does not (yet) account for belief updates. Dynamic versions of our formalism ought to be developed in order to properly deal with such inherently changeable systems. In recent work [75, 76], logicians proposed a methodology for developing dynamic versions of nonclassical epistemic logics and successfully applied this to settings in which the agents’ beliefs are probabilistic [77]. We plan to apply this methodology to more fully account for category dynamics and track changes in the way the agents perceive them.

4.5. References [1] K. Healy, Fuck nuance, Sociological Theory 35, 118 (2017). [2] P. M. Blau, A formal theory of dierentiation in organizations, American Sociological Review 35, 201 (1970). [3] R. Adner, L. Pólos, M. Ryall, and O. J. Sorenson, The case for formal theory, Academy of Management Review 34, 201 (2009). [4] J. Kamps and L. Pólos, Reducing uncertainty: A formal theory of organi- zations in action, American Journal of Sociology 104, 1776 (1999). 148 Classification Systems as Concept Lattices

[5] J. D. Thompson, Organizations in Action: Social Science Bases of Ad- ministrative Theory (Transaction, New Brunswick, NJ, 1967). [6] M. T. Hannan, On logical formalization of theories from organizational ecology, Sociological Methodology 27, 145 (1997). [7] M. T. Hannan, Rethinking age dependence in organizational mortality: Logical formalizations, American Journal of Sociology 104, 126 (1998). [8] G. L. Péli, The niche hiker’s guide to population ecology: A logical reconstruction of organization ecology’s niche theory, Sociological Methodology 27, 1 (1997). [9] G. L. Péli, Fit by founding, t by adaptation: Reconciling conicting orga- nization theories with logical formalization, Academy of Management 4 Review 34, 343 (2009). [10] G. L. Péli, J. Bruggeman, M. Masuch, and B. Ó Nualláin, A logical ap- proach to formalizing organizational ecology, American Sociological Review 59, 571 (1994). [11] G. L. Péli and M. Masuch, The logic of propagation strategies: Axioma- tizing a fragment of organizational ecology in rst-order logic, Organi- zation Science 8, 310 (1997). [12] J. Bruggeman, Niche width theory reappraised, Journal of Mathematical Sociology 22, 201 (1997). [13] M. T. Hannan, L. Pólos, and G. R. Carroll, Logics of Organization Theory: Audiences, Codes, and Ecologies (Princeton University Press, Princeton, NJ, 2007). [14] H. Sahlqvist, Completeness and correspondence in the rst and second order semantics for modal logic, in Proceedings of the Third Scan- dinavian Logic Symposium, Studies in Logic and the Foundations of Mathematics, Vol. 82, edited by S. Kanger (Elsevier, Amsterdam, Nether- lands, 1975) pp. 110–143. [15] J. F. A. K. Van Benthem, Correspondence theory, in Handbook of Philo- sophical Logic, Volume II: Extensions of Classical Logic, edited by D. M. Gabbay and F. Guenthner (D. Reidel, Dordrecht, Netherlands, 1984) pp. 167–247. [16] P. Blackburn, M. de Rijke, and Y. Venema, Modal Logic, Cambridge Tracts in Theoretical Computer Science, Vol. 53 (Cambridge University Press, New York, NY, 2002). [17] C.-J. Liau, Belief, information acquisition, and trust in multi-agent systems—A modal logic formulation, Articial Intelligence 149, 31 (2003). [18] P. Battigalli and G. Bonanno, Recent results on belief, knowledge, and the epistemic foundations of game theory, Research in Economics 53, 149 (1999). References 149

[19] G. Bonanno, Modal logic and game theory: Two alternative approaches, Risk, Decision, and Policy 7, 309 (2002). [20] L. Pólos, M. T. Hannan, and G. R. Carroll, Foundations of a theory of social forms, Industrial and Corporate Change 11, 85 (2002). [21] L. Pólos, M. T. Hannan, and G. Hsu, Modalities in sociological arguments, Journal of Mathematical Sociology 34, 201 (2010). [22] G. Hsu, M. T. Hannan, and L. Pólos, Typecasting, legitimation, and form emergence: A formal theory, Sociological Theory 29, 97 (2011). [23] W. Conradie, A. Palmigiano, and S. Sourabh, Algebraic modal corre- spondence: Sahlqvist and beyond, Journal of Logical and Algebraic Methods in Programming 91, 60 (2016). [24] W. Conradie and A. Palmigiano, Algorithmic correspondence and canon- 4 icity for distributive modal logic, Annals of Pure and Applied Logic 163, 338 (2012). [25] A. Palmigiano, S. Sourabh, and Z. Zhao, Sahlqvist theory for impossible worlds, Journal of Logic and Computation 27, 775 (2017). [26] W. Conradie and A. Palmigiano, Algorithmic correspondence and canon- icity for non-distributive logics, https://arxiv.org/abs/1603.08515 (2016), accessed June 18, 2017. [27] W. Conradie and C. Robinson, On Sahlqvist theory for hybrid logics, Journal of Logic and Computation 27, 867 (2015). [28] W. Conradie and A. Craig, Canonicity results for mu-calculi: An algorith- mic approach, Journal of Logic and Computation 27, 705 (2017). [29] W. Conradie, A. Craig, A. Palmigiano, and Z. Zhao, Constructive canonic- ity for lattice-based xed point logics, in Logic, Language, Information, and Computation, Lecture Notes in Computer Science, Vol. 10388, edited by J. Kennedy and R. J. G. B. de Queiroz (Springer, Berlin/Heidelberg, Germany, 2017) pp. 145–164. [30] W. Conradie and A. Palmigiano, Constructive canonicity of inductive inequalities, https://arxiv.org/abs/1603.08341 (2016), accessed June 18, 2017. [31] W. Conradie, A. Palmigiano, and Z. Zhao, Sahlqvist via translation, https://arxiv.org/abs/1603.08220 (2016), accessed June 18, 2017. [32] W. Conradie, A. Palmigiano, S. Sourabh, and Z. Zhao, Canonicity and relativized canonicity via pseudo-correspondence: An application of ALBA, https://arxiv.org/abs/1511.04271 (2016), accessed June 18, 2017. [33] A. Palmigiano, S. Sourabh, and Z. Zhao, Jónsson-style canonicity for ALBA-inequalities, Journal of Logic and Computation 27, 817 (2017). [34] Z. Zhao, Algorithmic correspondence and canonicity for possibility semantics, https://arxiv.org/abs/1612.04957 (2016), accessed June 18, 150 Classification Systems as Concept Lattices

2017. [35] Z. Zhao, Algorithmic Sahlqvist preservation for modal compact Haus- dor spaces, in Logic, Language, Information, and Computation, Lec- ture Notes in Computer Science, Vol. 10388, edited by J. Kennedy and R. J. G. B. de Queiroz (Springer, Berlin/Heidelberg, Germany, 2017) pp. 387–400. [36] S. Frittella, A. Palmigiano, and L. Santocanale, Dual characterizations for nite lattices via correspondence theory for monotone modal logic, Journal of Logic and Computation 27, 639 (2017). [37] G. Greco, M. Ma, A. Palmigiano, A. Tzimoulis, and Z. Zhao, Unied corre- spondence as a proof-theoretic tool, Journal of Logic and Computation, 4 in press (2016). [38] M. Ma and Z. Zhao, Unied correspondence and proof theory for strict implication, Journal of Logic and Computation 27, 921 (2017). [39] W. Conradie, S. Ghilardi, and A. Palmigiano, Unied correspondence, in Johan van Benthem on Logic and Information Dynamics, Outstanding Contributions to Logic, Vol. 5, edited by A. Baltag and S. Smets (Springer, Berlin/Heidelberg, Germany, 2014) pp. 933–975. [40] G. Birkho, Lattice Theory (American Mathematical Society, New York, NY, 1940). [41] M. Gehrke, Generalized Kripke frames, Studia Logica 84, 241 (2006). [42] G. L. Murphy, The Big Book of Concepts (MIT Press, Cambridge, MA, 2002). [43] B. Ganter and R. Wille, Applied lattice theory: Formal Concept Analy- sis, in General Lattice Theory, edited by G. Grätzer (Birkhäuser, Basel, Switzerland, 1997) pp. 591–606. [44] B. Ganter and R. Wille, Formal Concept Analysis: Mathematical Founda- tions (Springer, Berlin/Heidelberg, Germany, 1999). [45] E. W. Zuckerman, T.-Y. Kim, K. Ukanwa, and J. von Rittmann, Robust identities or nonentities? Typecasting in the feature-lm labor market, American Journal of Sociology 108, 1018 (2003). [46] H. Rao, P. Monin, and R. Durand, Border crossing: Bricolage and the erosion of categorical boundaries in French gastronomy, American Sociological Review 70, 968 (2005). [47] G. Negro, M. T. Hannan, and M. Fassiotto, Category signaling and repu- tation, Organization Science 26, 584 (2015). [48] G. Hsu and S. Grodal, Category taken-for-grantedness as a strategic opportunity: The case of light cigarettes, 1964 to 1993, American Socio- logical Review 80, 28 (2015). [49] E. G. Pontikes and M. T. Hannan, An ecology of social categories, Socio- References 151

logical Science 1, 311 (2014). [50] E. H. Rosch, Principles of categorization, in Cognition and Categoriza- tion, edited by E. H. Rosch and B. B. Lloyd (Lawrence Erlbaum Associates, Hillsdale, NJ, 1978) pp. 27–48. [51] E. H. Rosch, C. B. Mervis, W. D. Gray, D. M. Johnson, and P. Boyes-Braem, Basic objects in natural categories, Cognitive Psychology 8, 382 (1976). [52] J.-P. Vergne and T. Wry, Categorizing categorization research: Review, integration, and future directions, Journal of Management Studies 51, 56 (2014). [53] R. Durand and M. Khaire, Where do market categories come from and how? Distinguishing category creation from category emergence, Jour- nal of Management 43, 87 (2017). 4 [54] Ö. Koçak, M. T. Hannan, and G. Hsu, Emergence of market orders: Audience interaction and vanguard inuence, Organization Studies 35, 765 (2014). [55] G. Hsu and M. T. Hannan, Identities, genres, and organizational forms, Organization Science 16, 474 (2005). [56] M. T. Kennedy, Y.-C. Lo, and M. Lounsbury, Category currency: The changing value of conformity as a function of ongoing meaning con- struction, in Categories in Markets: Origins and Evolution, Research in the Sociology of Organizations, Vol. 31, edited by G. Hsu, G. Negro, and Ö. Koçak (Emerald Group, Bingley, United Kingdom, 2010) pp. 369–397. [57] M. Khaire and R. D. Wadhwani, Changing landscapes: The construction of meaning and value in a new market category—Modern Indian art, Academy of Management Journal 53, 1281 (2010). [58] G. L. Murphy and D. L. Medin, The role of theories in conceptual coher- ence, Psychological Review 92, 289 (1985). [59] G. Cattani, J. F. Porac, and H. Thomas, Categories and competition, Strategic Management Journal 38, 64 (2017). [60] R. Fagin, J. Y. Halpern, Y. Moses, and M. Y. Vardi, Reasoning About Knowledge (MIT Press, Cambridge, MA, 2003). [61] B. A. Davey and H. A. Priestley, Introduction to Lattices and Order (Cam- bridge University Press, New York, NY, 2002). [62] J. M. Dunn, M. Gehrke, and A. Palmigiano, Canonical extensions and re- lational completeness of some substructural logics, Journal of Symbolic Logic 70, 714 (2005). [63] J. A. Hampton, Inheritance of attributes in natural concept conjunctions, Memory & Cognition 15, 55 (1987). [64] E. E. Smith, D. N. Osherson, L. J. Rips, and M. Keane, Combining proto- types: A selective modication model, Cognitive Science 12, 485 (1988). 152 Classification Systems as Concept Lattices

[65] J. A. Hampton, Conceptual combination, in Knowledge, Concepts, and Categories, edited by K. Lambert and D. Shanks (MIT Press, Cambridge, MA, 1997) pp. 133–159. [66] J. Feldman, Minimization of Boolean complexity in human concept learning, Nature 407, 630 (2000). [67] N. P. Mark, Birds of a feather sing together, Social Forces 77, 453 (1998). [68] C. Navis and M. A. Glynn, How new market categories emerge: Temporal dynamics of legitimacy, identity, and entrepreneurship in satellite radio, 1990–2005, Administrative Science Quarterly 55, 439 (2010). [69] J. A. Rosa, J. F. Porac, J. Runser-Spanjol, and M. S. Saxon, Sociocognitive dynamics in a product market, Journal of Marketing 63, 64 (1999). 4 [70] C. Navis, G. Fisher, R. Raaelli, M. A. Glynn, and L. Watkiss, The market that wasn’t: The non-emergence of the online grocery category, in Pro- ceedings of the New Frontiers in Management and Organizational Cog- nition Conference, https://eprints.maynoothuniversity.ie/4055/ (2012), accessed April 6, 2017. [71] N. M. Wijnberg, Innovation and organization: Value and competition in selection systems, Organization Studies 25, 1413 (2004). [72] M. Eisenman, Understanding aesthetic innovation in the context of technological evolution, Academy of Management Review 38, 332 (2013). [73] F. Baader, D. Calvanese, D. L. McGuinness, D. Nardi, and P. F. Patel- Schneider, The Description Logic Handbook: Theory, Implementation, and Applications (Cambridge University Press, New York, NY, 2010). [74] M. T. Hannan, Partiality of memberships in categories and audiences, Annual Review of Sociology 36, 159 (2010). [75] A. A. Kurz and A. Palmigiano, Epistemic updates on algebras, Logical Methods in Computer Science 9, 1 (2013). [76] M. Ma, A. Palmigiano, and M. Sadrzadeh, Algebraic semantics and model completeness for intuitionistic public announcement logic, Annals of Pure and Applied Logic 165, 963 (2014). [77] W. Conradie, S. Frittella, A. Palmigiano, and A. Tzimoulis, Probabilistic epistemic updates on algebras, in Logic, Rationality, and Interaction, Lecture Notes in Computer Science, Vol. 9394, edited by W. van der Hoek, W. H. Holliday, and W.-F. Wang (Springer, Berlin/Heidelberg, Germany, 2015) pp. 64–76. 5 Toward an Epistemic Logic of Categories

This chapter is based on a research paper written in collaboration with Willem Conradie, Sabine Frittella, Alessandra Palmigiano, Apostolos Tzimoulis, and Nachoem M. Wijnberg. An earlier version of this paper was presented at the 2017 Conference on Theoretical Aspects of Rationality and Knowledge (University of Liverpool), and published in the Open Publishing Association’s series Electronic Proceedings in Theoretical Computer Science.

153 154 Toward an Epistemic Logic of Categories

5.1. Motivation In the previous chapter, we introduced a formal theory of classication systems that builds on the fundamental mathematical notion of order [1]. As discussed before, this order-theoretic representation can be of great value to organizational research, where the need for an “ontological turn” [2] in the study of categories has been pointed out with increasing frequency. Yet this framework can also be valuable to other domains of scholarship as the literature on categories is expanding rapidly in many disciplines, motivated by (and in connection with) theories and methodologies that span the social and the exact sciences. In linguistics, for example, categories are traditionally examined for their relationship with grammar [3]; in psychology, they have been analyzed extensively for their role in learning and induction [4–6]; in articial intelligence, categorization is recognized as key to pattern recognition [7], text mining [8], and knowledge representation in databases 5 [9]; in management science, categories are known to serve as cognitive infrastructures for consumers [10] and managers alike [11, 12], and thus exert an enormous inuence on competition [13] This range of interdisciplinary applications calls for logical formalisms that achieve generality without losing any of their intuitiveness. The frame- work presented in Chapter4 may not entirely fulll these desiderata be- cause it requires technical restrictions on Kripke-style models, i.e., the RS-conditions, which limit its scope to those settings where such condi- tions can be reasonably enforced. This chapter lays the groundwork for broader applications of our formal theory by considering a simpler and more general class of models for the logic dened in Chapter4. Because it is free from the RS-conditions, this class of models more naturally allows for the representation of objects, features, and concepts as these occur in the disciplines above. The outcome of our study is a new and improved logical framework that can synthesize various perspectives on categorization and facilitate the transfer of results across settings. We demonstrate the utility of this rened machinery by formalizing certain theoretical constructs from the organizational literature on categories—typicality [14], similarity [15, 16], contrast [17–19], and leniency [20, 21]—that are considered important both in sociology [22] and in management science [23]. The structure of this chapter is as follows: In Section 5.2, we review the foundational insights about categories in cognitive psychology and examine their links with the formal approaches currently available to researchers. In Section 5.3 we dene our epistemic logic of categories, for which we introduce a rened Kripke-style semantics and an axiomatization as well as two language enrichments, including a common knowledge-type construc- Theoretical Foundations 155 tion (cf. Chapter4), and hybrid-style nominal and co-nominal variables. In Section 5.4 we prove the soundness and completeness of this logic. In Section 5.5, we propose formalizations for the aforementioned theoretical constructs. Finally, in Section 5.6, we discuss additional applications of our formalism and identify directions for further research.

5.2. Theoretical Foundations 5.2.1. Cognitive Perspectives on Categorization The literature on categories in cognitive psychology provides a multitude of denitions, theories, and models, each of which appears best suited to capture a facet of the complex epistemic mechanisms underlying the formation of categories. The oldest perspective, known as the classical view [24], traces its roots back to Aristotle. Although this is arguably the more restrictive approach, it has been very inuential in computer science [25]. 5 This view hinges on the idea that all the members of a category have some features in common: hence, categorization is viewed as a deductive process geared toward the satisfaction of necessary and sucient conditions. This engenders categories with crisp boundaries, where no partial memberships are possible and where all the members are considered to be equally representative. Clearly, this view fails to account for the fact that people can assign objects to particular categories even if they do not consider them to be highly representative category members. For example, most people agree that robins and penguins are members of the category birds, but robins are more typical birds than penguins and thus they are considered “better” category members [4]. Instead of being crisp, category memberships tend to be a matter of degree [cf. 14]. Not only is this view oblivious to the graded structure of categories [26], but it is also restrictive because it requires the agents to know ex- haustive lists of features in order to determine category memberships. In most real-world settings, this assumption is unwarranted: human beings must cope with considerable cognitive limitations and they are unable to recall long lists of feature requirements. Such considerations motivated the development of prototype theory, which is especially represented by the work of Rosch [27]. According to this perspective, categorization is an inductive process that requires nding the closest match between an object and a cognitive reference point stored in the agent’s memory [28, 29]. Many limitations of the classical view are addressed by relaxing the requirement that category memberships be decided via the satisfaction of necessary and sucient conditions: as a result, prototype theory allows for partial 156 Toward an Epistemic Logic of Categories

and ambiguous cases. Nevertheless, this theory has been found wanting because it requires prototypes to be dened ex ante and does not fully explain how these reference points are acquired by the agents. To explain how categorization occurs in the absence of prototypes, cognitive psycholo- gists proposed the exemplar view [30]. In this view, category memberships are determined by comparing newly encountered objects with instances of concepts that accumulate in the agent’s memory by way of experience. Prototypes for concepts may emerge over time as the agent’s understanding of the membership requirements consolidates [31]. Although the exemplar view does not necessitate ex ante reference points to explain how people sort objects, it is still grounded on the notion of similarity. As noted by Douglas Medin [32], similarity-based accounts of categorization have the power to explain how categories are internally structured but do not fully explain why “we have the categories we have,” 5 or why some categories seem to be more cogent and coherent than others. Moreover, both prototype and exemplar theory assume similarity to be a determinant of conceptual coherence rather than its consequence: in some cases, however, the similarity between objects may be imposed rather than discovered [e.g., 33]. For this reason, it cannot be regarded as the sole criterion for determining category memberships [34]. Pivoting on the notion of imposed coherence, the theory-based view [35] maintains that categories arise in connection with theories (broadly understood so as to include informal explanations). The coherence of a category in the eyes of an agent follows from the coherence of the theory whereon the category is constructed. This allows the agent to sort together objects that would be considered too distant from a similarity-based per- spective: for example, an agent may lump together objects like a gold watch, a family portrait, and a deed to a piece of land into the category of things she wishes her child to inherit. Goal-based categories (cf. Chapter3) can be considered a specic kind of categories that arise from theories, namely the agents’ theories of what objects are suited to particular goals [36, 37]. Although this view allows for considerable freedom in determining category memberships, it can lead to a circularity problem because the categories themselves can serve as building blocks in theory formation. The theory of what constitutes a heirloom, for example, may itself depend on categories like family and private property. In summary, the extant theoretical insights on categorization in cognitive psychology focus on dierent aspects of this complex phenomenon and they are dicult to reconcile into a satisfactory, overarching perspective. This chapter represents an early step within a broader research program Theoretical Foundations 157 aimed at clarifying the notions related to categorization and developing a unied theory. An important step toward this goal is dening an adequate mathematical representation of categories or concepts, upon which an adequate semantic framework can be built that is capable of describing categories regardless of how they are formed. In the next section, we discuss two of the most promising mathematical approaches.

5.2.2. Extant Formal Approaches The formal method for category representation that is perhaps most widely adopted in sociology [16, 22, 38–41] is the one developed by Peter Gär- denfors [42, 43] and based on the notion of conceptual spaces. These are multidimensional metric spaces where each axis or dimension represents a feature along which the objects in a particular domain, like items of clothing [44], software companies [38], or pieces of music [45], can meaningfully dier according to the agents. Concepts, or formal categories, are modeled 5 according to the rules of similarity: the smaller the distance between any two objects, the greater the likelihood that the objects are considered instances of the same concept. From this perspective, categories can be viewed as convex subsets or regions of the conceptual space [46].1 The geometric center of any such region corresponds to the category prototype, and if an object is closer to this geometric average, it tends to be perceived as a more typical (representative) member of the category. Another approach, still relatively foreign to sociologists, is the one pioneered by Bernhard Ganter and Rudolf Wille [47] and termed Formal Concept Analysis (FCA). This builds directly on Birkho’s theory of complete lattices [48]: in FCA, databases are viewed as formal contexts, i.e., structures (A, X , I ) such that A and X are sets and I ⊆ A × X . Intuitively, A can be interpreted as a set of objects, X can be interpreted as a set of features, and for any object a ∈ A and feature x ∈ X , the tuple (a, x) ∈ I exactly when object a has feature x. Every formal context is associated with a collection of formal concepts (or categories): as explained in Chapter4, these are tuples (B,Y ) such that B ⊆ A, Y ⊆ X , and B × Y is a maximal rectangle included in I .2 The set B is commonly referred to as the extension of the formal concept, whereas Y is referred to as its intension or description.3 Because of maximality, the extension of a formal concept uniquely identies

1Recall that a region of space is convex if it includes the segments between any two of its points (cf. Chapter3). In a Euclidean plane, squares are convex but stars are not. 2That is, one cannot enlarge B to B 0 or Y to Y 0 in such a way that B 0 × Y ∈ I or B × Y 0 ∈ I . 3This is an informal way to account for the genesis of Galois-stable sets (cf. Chapter4, Denition 4.2.1). 158 Toward an Epistemic Logic of Categories

and is identied by its description. We say that an object is a member of a category if it belongs to the category’s extension and that a feature describes a category if it belongs to the category’s description. Compared to conceptual spaces, the FCA framework is more general in nature and more strongly oriented towards qualitative comparisons between objects or between categories. The reason why FCA generalizes conceptual spaces can be explained in precise mathematical terms. Formal concepts in conceptual spaces are convex closures of regions: therefore, they represent the xed points of a special closure operator (the convex closure) that arises from the metric of the space. As discussed in Chapter 4, the formation of concepts in FCA is also modeled via a closure operator (the Galois closure); however, this is more general and it is dened in purely order-theoretic terms. As to why FCA provides a more natural support for qualitative comparisons, notice that formal concepts are by nature partially 5 ordered: namely, (B,Y ) is a subconcept of (C, Z ) exactly when B ⊆ C , or equivalently, when Z ⊆ Y .4 Moreover, objects can be ordered in terms of the features they have, whereas features can be ordered in terms of the objects that share them. An object is regarded as more specic than another if it possesses all of its features, as well as some others, while a feature is regarded as more generic than another if it is shared by all of the objects that share the other, as well as some more. This (partial) ordering makes FCA better equipped than conceptual spaces to examine categories as elements of a multilevel classication system [cf. 49]. In Chapter4, we developed a formal connection between FCA and modal logic based on the idea that formal contexts enriched with additional re- lations can be taken as models of an epistemic modal logic of categories. The formulas of this logic are constructed out of a set of atomic variables using the standard positive propositional connectives ∧, ∨, >, ⊥ and modal operators i associated with each agent i ∈ Ag. The formulas thusly gener- ated do not denote states of aairs to which a truth value can be assigned, but rather categories or concepts. In this modal language, it is easy to dis- tinguish between the objective or factual information agents have access to, which is encoded by the formulas of the modal-free fragment of the language, and their subjective interpretation of this information, which is encoded by the formulas with modal operators. One can easily describe an agent’s beliefs about the category that another agent believes to be the category that yet another agent believes to be the category of, e.g., classical

4In this sense, the concepts of a formal context represent a complete lattice, meaning that any collection of formal concepts has a least upper bound and a greatest lower bound. By Birkho’s theorem [48], every complete lattice is isomorphic to some concept lattice. Building an Epistemic-Logical Language 159 music. We dened xed points of these iterations in similar way as common knowledge is dened in classical epistemic logic [50], and proposed an interpretation of this common knowledge-type construction as the end point in a process of social interaction; for example, the consensus reached by a group of agents about the intension and extension of classical music. In this chapter, we will further elaborate on this interpretation and connect it to the notion of typicality.

5.3. Building an Epistemic-Logical Language 5.3.1. Basic Logic and Intended Meaning Let Prop be a nite set of atomic propositions and Ag be a nite set of agents. The basic language Ł of our epistemic logic of categories is:

ϕ B ⊥ | > | p | ϕ ∧ ϕ | ϕ ∨ ϕ | i ϕ, (5.1) where p ∈ Prop. As mentioned above, the formulas in this language denote 5 categories or concepts. While the atomic propositions provide a vocabulary of category labels, such as classical music, compound formulas ϕ ∧ ϕ and ϕ ∨ ψ respectively denote the greatest common subordinate category and the smallest common superordinate category of ϕ and ψ. For any agent i ∈ Ag, the formula i ϕ denotes what is category ϕ according to i . At this stage, we are deliberately vague as to the precise meaning of “according to.” Depending on the properties of i , the formula i ϕ can denote the category i believes or perceives to be ϕ. The selection of a particular meaning can be left up to the specics of the context to which our logic is deployed: in sociological applications, agents’ beliefs are especially important [cf. 51], but with regard to psychology and cognition, the notion of perception may be decidedly more relevant [cf. 52]. The basic or minimal normal L-logic is a set L of sequents ϕ ` ψ (which reads “ϕ is a subordinate category of ψ”) with ϕ, ψ ∈ L, containing (a) the following sequents for propositional connectives:

p ` p, (5.2) ⊥ ` p, (5.3) p ` >, (5.4) p ` p ∨ q, (5.5) q ` p ∨ q, (5.6) p ∧ q ` p, (5.7) p ∧ q ` q; (5.8) 160 Toward an Epistemic Logic of Categories

(b) the following sequents for modal operators:

> ` i >, (5.9) i p ∧ i q ` i (p ∧ q) ; (5.10)

and closed under the following inference rules: ϕ ` χ χ ` ψ , (5.11) ϕ ` ψ ϕ ` ψ , (5.12) ϕ (χ/p) ` ψ (χ/p) χ ` ϕ χ ` ψ , (5.13) χ ` ϕ ∧ ψ ϕ ` χ ψ ` χ , (5.14) 5 ϕ ∨ ψ ` χ ϕ ` ψ . (5.15) i ϕ ` i ψ The modal fragment of L incorporates agents’ beliefs (or perceptions) into the syllogistic reasoning supported by the propositional fragment of L. By an L-logic, we refer to any extension of L with L-axioms ϕ ` ψ.

5.3.2. Interpretation in Enriched Formal Contexts We now turn to discussing the semantic structures that, in our formal theory, play the role of Kripke frames. An enriched formal context is dened as a tuple ¿ = (Ð, {Ri | i ∈ Ag}) such that Ð = (A, X , I ) is a formal context and Ri ⊆ A × X for every i ∈ Ag satisfying certain additional properties, which guarantee their associated modal operators are well dened (cf. Denition 5.4.6). As mentioned above, formal contexts represent databases that contain information about objects a ∈ A and features x ∈ X , as well as an incidence relation I ⊆ A × X . Intuitively, aI x reads “object a has feature x.” In addition to this factual relation, enriched formal contexts contain information about the epistemic attitudes of individual agents. Thus, aRi x reads “object a has feature x according to agent i .” We dene valuation on ¿ as a map V : Prop → P(A) × P(B), with the restriction that V (p) is a formal concept of Ð = (A, X , I ). This means that every p ∈ Prop maps to V (p) = (B,Y ) in such a way that B ⊆ A, Y ⊆ X , and B × Y is a maximal rectangle contained in I . For example, if p is a category label denoting classical music and Ð is a database of musical tracks (stored in A) and musical features (stored in X ), then V interprets the category Building an Epistemic-Logical Language 161 label p in the model Í = (¿,V ) as the formal concept V (p) = (B,Y ) that is equivalently specied by the set of tracks B, i.e., all the classical tracks in the database, and by the set of features Y , i.e., all the features in the database shared by classical tracks. The elements of B are the members of p in Í, whereas the elements of Y describe p in Í. The set B (resp. Y ) is the extension (resp. intension or description) of p in Í, and we denote it [[p]]Í (resp. {[p]}Í) or simply [[p]] (resp. {[p]}). Alternatively, we write:

Í, a p i a ∈ [[p]]Í, (5.16) Í, x p i x ∈ {[p]}Í, (5.17) and read Í, a p as “object a is a member of category p,” and Í, x p as “feature x describes category p.” The interpretation of atomic propositions can be extended to proposi- tional L-formulas as follows (cf. Table 4.1): 5 Í, a > always, (5.18) Í, x > i aI x for all a ∈ A, (5.19) Í, x ⊥ always, (5.20) Í, a ⊥ i aI x for all x ∈ X , (5.21) Í, a ϕ ∧ ψ i Í, a ϕ and Í, a ψ, (5.22) Í, x ϕ ∧ ψ i for all a ∈ A, if Í, a ϕ ∧ ψ then aI x, (5.23) Í, x ϕ ∨ ψ i Í, x ϕ and Í, x ψ, (5.24) Í, a ϕ ∨ ψ i for all x ∈ X , if Í, x ϕ ∨ ψ then aI x. (5.25)

Therefore, in each model Í, > is interpreted as the category generated by the set of all the objects, i.e., the broadest possible category or the one with the laxest (possibly empty) description, whereas ⊥ is interpreted as the category generated by the set of all the features, i.e., the most specic (possibly empty) category or the one with the most restrictive description. Further, ϕ ∧ψ is interpreted as the category generated by the intersection of the extensions of ϕ and ψ; hence, the description of ϕ ∧ψ certainly includes {[ϕ]} ∪ {[ψ]}, but it can also be larger. Conversely, ϕ ∨ ψ is interpreted as the category generated by the intersection of the intensions of ϕ and ψ; hence, the objects in [[ϕ]]∪ [[ψ]] are certainly members of ϕ ∨ ψ, but there can also be others. As to the interpretation of modal formulas:

Í, a i ϕ i for all x ∈ X , if Í, x ϕ then aRi x, (5.26) Í, x i ϕ i for all a ∈ A, if Í, a ϕ then aI x. (5.27) 162 Toward an Epistemic Logic of Categories

Therefore, in each model Í, i ϕ is interpreted as the category whose members are the objects to which agent i attributes every feature in the description of ϕ. Finally, as to the interpretation of sequents:

Í |= ϕ ` ψ i for all a ∈ A, if Í, a ϕ then Í, a ψ. (5.28)

5.3.3. Introducing Common Knowledge In Chapter4, we observed that the logical framework we presented is well- equipped to capture not only the factual information and the epistemic attitudes of individual agents, but also the outcome of social interaction. To this end, we introduced an expansion LC of L with a common knowledge- type operator C . Given Prop and Ag as above, the language LC of the epistemic logic of categories with “common knowledge” is:

5 ϕ B ⊥ | > | p | ϕ ∧ ϕ | ϕ ∨ ϕ | i ϕ | C (ϕ) . (5.29)

and C -formulas are interpreted in each model Í as follows:

Í, a C (ϕ) i for all x ∈ X , if Í, x ϕ then aRC x, (5.30) Í, x C (ϕ) i for all a ∈ A, if Í, a C (ϕ) then aI x, (5.31)

Ñ where RC ⊆ A × X is dened as RC = s ∈S Rs , and Rs ⊆ A × X is the

relation associated with the modal operator s B i1 ... in for any element s = i1 ... in in the set S of nite sequences of elements of Ag. The basic logic of categories with “common knowledge” is a set LC of sequents ϕ ` ψ, with ϕ, ψ ∈ LC , which contains the axioms and is closed under the rules of L, and further contains the following axioms:

> ` C (>) , (5.32) C (p) ∧ C (q) ` C (p ∧ q) , (5.33) Û C (p) ` {i p ∧ i C (p) | i ∈ Ag} ; (5.34)

and is closed under the following rules: ϕ ` ψ , (5.35) C (ϕ) ` C (ψ)

Ó χ ` i ∈Ag i ϕ {χ ` i χ | i ∈ Ag} . (5.36) χ ` C (ϕ) Soundness and Completeness 163

5.3.4. Hybrid Expansions of the Basic Language In many settings, it is convenient to be able to reason about sets of objects or sets of features even if these do not constitute real categories in the sense described in Chapter4 and are not assigned any particular label. For this reason, the languages L and LC can be further enriched with dedicated sets of variables in the style of hybrid logic. As before, let Prop be a nite set of atomic propositions and Ag be a nite set of agents. Given Prop, Ag, and the nite sets Nom and CoNom of nominals and co-nominals, respectively, the language LH of the hybrid logic of categories is:

ϕ B ⊥ | > | p | a | x | ϕ ∧ ϕ | ϕ ∨ ϕ | i ϕ, (5.37) where i ∈ Ag, p ∈ Prop, a ∈ Nom, and x ∈ CoNom. A hybrid valuation on an enriched formal concept ¿ maps atomic propo- sitions to formal concepts, nominal variables to the formal concepts gen- erated by individual elements of the object domain A, and co-nominal 5 variables to the formal concepts generated by individual elements of the feature domain X . Denoting V (a) as the category generated by a ∈ A, and V (x) as the category generated by x ∈ X , nominals and co-nominals are interpreted as follows:

Í, y a i aI y, (5.38) Í, b a i for all y ∈ X , if aI y then bI y, (5.39) Í, y x i for all b ∈ A, if bI x then bI y, (5.40) Í, b x i bI x. (5.41)

5.4. Soundness and Completeness 5.4.1. Definition of I -Compatible Relations Throughout this section, we x two sets A and X , and use a, b (resp. x, y ) for elements of A (resp. X ), and B, C, Aj (resp. Y ,W , Xj ) for subsets of A (resp. X ). For any relation S ⊆ A × X , let

↑ S [B] B {x | [a (a ∈ B ⇒ aSx)} , (5.42) ↓ S [Y ] B {a | [x (x ∈ Y ⇒ aSx)} . (5.43)

The following lemma recaps some well-known properties of this construc- tion [cf.1, Sections 7.22–7.29]: Lemma 5.4.1.

1. B ⊆ C implies S ↑ [C ] ⊆ S ↑ [B], and Y ⊆ W implies S ↓ [W ] ⊆ S ↓ [Y ]. 164 Toward an Epistemic Logic of Categories

2. B ⊆ S ↓ S ↑ [B] and Y ⊆ S ↑ S ↓ [Y ].

3. S ↑ [B] = S ↑ S ↓ S ↑ [B] and S ↓ [Y ] = S ↓ S ↑ S ↓ [Y ]. ↓ Ð Ñ ↓ ↑ Ð Ñ ↑ 4. S [ Y] = Y ∈Y S [Y ] and S [ B] = B ∈B S [B]. For any formal context Ð = (A, X , I ), we sometimes use B ↑ for I ↑[B] and Y ↓ for I ↓[Y ], and say that B (resp. Y ) is Galois-stable if B = B ↑↓ (resp. Y = Y ↓↑). When B = {a} (resp. Y = {x }), we write a ↑↓ for {a}↑↓ (resp. x ↓↑ for {x }↓↑). The Galois-stable sets represent projections of some maximal rectangle (i.e., formal concept) of Ð. The lemma below reports additional facts: Lemma 5.4.2.

1. B ↑ and Y ↓ are Galois-stable. Ð ↑↓ Ð ↓↑ 5 2. B = a ∈B a and Y = y ∈Y y for any Galois-stable B and Y . 3. Galois-stable sets are closed under arbitrary intersections.

Proof. With regard to the second item, because a ↑↓ ⊇ {a}, we have that Ð ↑↓ ↑↓ ↑↓ B ⊆ a ∈B a . For the other direction, if {a} ⊆ B then a ⊆ B . Because B is Galois-stable, we have that B = B ↑↓. Hence, a ↑↓ ⊆ B for any a ∈ B, which Ð ↑↓ implies that a ∈B a ⊆ B. The proof for Y is analogous. 

Denition 5.4.3. For any Ð = (A, X , I ), any R ⊆ A × X is I -compatible if R ↓[x] and R ↑[a] are Galois-stable for all x and all a.

By Lemma 5.4.1 (3), I is an I -compatible relation. Lemma 5.4.4. If R ⊆ A × X is I -compatible, then R ↓[Y ] = R ↓[Y ↓↑] and R ↑[B] = R ↑[B ↑↓].

Proof. By Lemma 5.4.1 (2) we have Y ⊆ Y ↓↑, which implies by Lemma 5.4.1 (1) that R ↓[Y ↓↑] ⊆ R ↓[Y ]. Conversely, if a ∈ R ↓[Y ], i.e., Y ⊆ R ↑[a], then Y ↓↑ ⊆ (R ↑[a])↓↑ = R ↑[a], with the last identity holding because R is I -compatible. Thus, a ∈ R ↓[Y ↓↑]. The proof of the second identity is analogous.  Lemma 5.4.5. If R is I -compatible and Y is Galois-stable, then R ↓[Y ] is Galois-stable. Ð Proof. Because Y = y ∈Y {y }, by Lemma 5.4.1 (4),

  ↓ ↓ Ø  Ù ↓ Ù ↓ R [Y ] = R  {y } = R [{y }] = R [y ] . (5.44)   y ∈Y  y ∈Y y ∈Y   Soundness and Completeness 165

By the I -compatibility of R, the last term is an intersection of Galois-stable sets, which is itself Galois-stable (cf. Lemma 5.4.2, 3). 

The lemma above ensures that the interpretation of L-formulas on enriched formal contexts denes a compositional semantics on formal concepts if the relations Ri are I -compatible. Indeed, for every enriched formal context ¿ = (Ð, {Ri | i ∈ Ag}), every valuation V on ¿ extends to an interpretation map of L-formulas dened as follows:

V (p) = ([[p]], {[p]}) , (5.45)  ↑ V (>) = A, A , (5.46)

 ↓  V (⊥) = X , X , (5.47)

 ↑ V (ϕ ∧ ψ) = [[ϕ]]∩ [[ψ]], ([[ϕ]]∩ [[ψ]]) , (5.48)

 ↓  5 V (ϕ ∨ ψ) = ({[ϕ]} ∩ {[ψ]}) , {[ϕ]} ∩ {[ψ]} , (5.49)    ↑ ( ) ↓ [{[ ]}] ↓ [{[ ]}] V i ϕ = Ri ϕ , Ri ϕ . (5.50)

By Lemma 5.4.5, if V (ϕ) is a formal concept, then so is V (i ϕ).

Denition 5.4.6. An enriched formal context ¿ = (Ð, {Ri | i ∈ Ag}) is compo- sitional if Ri is I -compatible (cf. Denition 5.4.3) for every i ∈ Ag. A model Í = (¿,V ) is compositional if so is ¿.

5.4.2. Interpretation of C For any formal context Ð = (A, X , I ) the I -product of the relations Rs, Rt ⊆ A × X is the relation Rst ⊆ A × X dened as follows:

↓ ↓ h ↑ h ↓ h ↓↑iii a ∈ Rst [x] i a ∈ Rs I Rt x . (5.51)

Lemma 5.4.7. If Rs and Rt are I -compatible, then Rst is I -compatible. ↓ Proof. That Rst [x] is Galois-stable follows from the denition of Rst , from ↑ Lemma 5.4.5, and from the I -compatibility of Rs and Rt . To show that Rst [a] ↑ ↓↑ ↑ is Galois-stable, i.e., that (Rst [a]) ⊆ Rst [a], by Lemma 5.4.2 (2) it suces ↑ ↓↑ ↑ ↑ ↓ to show that if y ∈ Rst [a] then y ⊆ Rst [a]. Let y ∈ Rst [a], i.e., a ∈ Rst [y ] = ↓ ↑ ↓ ↓↑ ↓↑ ↓↑ ↓↑ ↓ Rs [I [Rt [y ]]]. If x ∈ y , then x ⊆ y , which by the antitonicity of Rs , ↑ ↓ ↓ ↑ ↓ ↓↑ ↓ ↑ ↓ ↓↑ I , and Rt (cf. Lemma 5.4.1, 1) implies that Rs [I [Rt [y ]]] ⊆ Rs [I [Rt [x ]]]. ↓ ↑ Therefore, a ∈ Rst [x], i.e., x ∈ Rst [a], as required.  166 Toward an Epistemic Logic of Categories

The denition of I -product serves to semantically characterize the

relation associated with the modal operators s B i1 ... in for any nite non-empty sequence s B i1 ... in ∈ S of elements of Ag in terms of the relations associated with each primitive modal operator. For any such s, let Rs be dened recursively as follows: if s = i , then Rs = Ri , and if s = i t , ↓[ ] ↓[ ↑[ ↓[ ↓↑]]] then Rs x = Ri I Rt x . Lemma 5.4.7 immediately implies:

Corollary 5.4.8. For every s ∈ S, the relation Rs is I -compatible.

↓ Lemma 5.4.9. IfY is Galois-stable and Rs, Rt are I -compatible, then Rst [Y ] = ↓ ↑ ↓ Rs [I [Rt [Y ]]]. Proof. " " " ### ↓ h ↑ h ↓ ii ↓ ↑ ↓ Ø ↓↑ Rs I Rt [Y ] = Rs I Rt x Lemma 5.4.2 (2) (5.52) 5 x ∈Y " " ## ↓ ↑ Ù ↓ h ↓↑i = Rs I Rt x Lemma 5.4.1 (4) (5.53) x ∈Y " " ## ↓ ↑ Ù ↓ h ↑ h ↓ h ↓↑iii = Rs I I I Rt x (5.54) x ∈Y " " " ### ↓ ↑ ↓ Ø ↑ h ↓ h ↓↑ii = Rs I I I Rt x Lemma 5.4.1 (4) (5.55) x ∈Y " " " ### ↓ ↑ ↓ Ø ↑ h ↓ h ↓↑ii = Rs I I I Rt x Lemma 5.4.1 (4) (5.56) x ∈Y " # ↓ Ø ↑ h ↓ h ↓↑ii = Rs I Rt x Lemma 5.4.4 (5.57) x ∈Y Ù ↓ h ↑ h ↓ h ↓↑iii = Rs I Rt x Lemma 5.4.1 (4) (5.58) x ∈Y Ù ↓ = Rst [x] Denition of Rst (5.59) x ∈Y " # ↓ Ø = Rst x Lemma 5.4.1 (4) (5.60) x ∈Y ↓ Ø = Rst [Y ] . Y = x (5.61) x ∈Y

↓ ↓↑ Identity 5.54 follows from the fact that Rt [x ] is Galois-stable. 

Lemma 5.4.10. If Rs, Rt , Rw are I -compatible, then Rs(tw) = R(st )w . Soundness and Completeness 167

Proof. For any x, h h h iii ↓ [ ] ↓ ↑ ↓ ↓↑ Rs(tw) x = Rs I Rtw x Equation 5.51 (5.62) ↓ h ↑ h ↓ h ↑ h ↓ h ↓↑iiiii = Rs I Rt I Rw x Lemma 5.4.9 (5.63)

↓ h ↑ h ↓ h ↓↑iii = Rst I Rw x Lemma 5.4.9 (5.64) ↓ [ ] = R(st )w x . Equation 5.51 (5.65)



Let s = i1 ... in ∈ S and s B i1 ... in .

Lemma 5.4.11. For any model Í = (¿,V ), 5

Í, a sϕ i for all x ∈ X , if Í, x ϕ, then aRs x, (5.66) Í, x sϕ i for all a ∈ A, if Í, a sϕ, then aI x. (5.67)

Proof. By induction on the length of s. The base case of the induction is im- [[ ]] ↓[{[ ]}] ↓[ ↑[[[ ]]]] mediate: let s = i t ; it follows that i t ϕ = Ri t ϕ = Ri I t ϕ = ↓[ ↑[ ↓[{[ ]}]]] ↓[{[ ]}] Ri I Rt ϕ = Rs ϕ . The last equality holds by Lemma 5.4.9. The second item of the lemma is trivially true. 

Lemma 5.4.12. For any family R of I -compatible relations,

1. Ñ R is an I -compatible relation.

Ñ ↓ Ñ ↓ 2. ( R) [Y ] = T ∈R T [Y ] for any Y ⊆ X .

Ñ ↓ Ñ ↓ Proof. As to the rst item, let R = R. It follows that R [x] = T ∈R T [x] ↑ Ñ ↑ and R [a] = T ∈R T [a]. Thus, the statement follows from Lemma 5.4.2 (3). As to the second item,

  Ù ↓ Ù ↓ Ø  Ø T [Y ] = T  y  Y = y (5.68)   T ∈R T ∈R y ∈Y  y ∈Y   Ù Ù ↓ = T [y ] Lemma 5.4.1 (4) (5.69) T ∈R y ∈Y 168 Toward an Epistemic Logic of Categories

Ù Ù ↓ = T [y ] (5.70) y ∈Y T ∈R ↓ Ù Ù  ↓ = R [y ] Denition of (·) (5.71) y ∈Y   Ù  ↓ Ø  = R  y  Lemma 5.4.1 (4) (5.72)   y ∈Y    Ù  ↓ Ø = R [Y ] . Y = y (5.73) y ∈Y

Identity 5.71 follows from the associativity and commutativity of Ñ. 

5 The lemmas above ensure that, in enriched formal contexts where the Ñ relations Ri are I -compatible, the relation RC B s ∈S Rs is likewise I - compatible. Therefore, the interpretation of LC -formulas on the model based on these enriched formal contexts denes a compositional seman- tics on formal concepts. Indeed, for every such enriched formal context ¿ = (Ð, {Ri | i ∈ Ag}), every valuation V on ¿ extends to the following interpretation map of C -formulas:    ↑ ( ( )) ↓ [{[ ]}] ↓ [{[ ]}] V C ϕ = RC ϕ , RC ϕ , (5.74)

so that, if V (ϕ) is a formal concept, then so is V (i ϕ). In addition, the following identity is semantically supported: Û C (ϕ) = sϕ, (5.75) s ∈S

where s B i1 ... in is any nite non-empty string of elements of Ag, and

s B i1 ... in .

5.4.3. Soundness Proposition 5.4.13. For any compositional model Í and any i ∈ Ag,

1. if Í |= ϕ ` ψ, then Í |= i ϕ ` i ψ;

2. Í |= > ` i >;

3. Í |= i ϕ ∧ i ψ ` i (ϕ ∧ ψ). Soundness and Completeness 169

Proof. As to the rst item, by Lemma 5.4.1 (1), if [[ϕ]] ⊆ [[ψ]] then h i h i [[ ]] ↓ ↑ [[[ ]]] ⊆ ↓ ↑ [[[ ]]] [[ ]] i ϕ = Ri I ϕ Ri I ψ = i ψ , (5.76)

As to the second item, it suces to show that [[i >]] = A. By denition, h i [[ >]] ↓ [{[>]}] ↓ ↑ i = Ri = Ri A . (5.77)

↓[ ↑] For this reason, it is enough to show that Ri A = A. The assumption of I - ↑[ ] ∈ compatibility implies that Ri a is Galois-stable for every a A. Therefore, ↑ ⊆ ↑[ ] ∈ ↓[ ↑] ∈ A Ri a . Thus, by adjunction, a Ri A for every a A, which implies ↓[ ↑] that Ri A = A, as required. Finally, with regard to the third item, 5 ↓ ↓ [[ (ϕ) ∧  (ψ)]] = R [{[ϕ]}] ∩ R [{[ψ]}] Denition of [[·]] (5.78) ↓ = R [{[ϕ]} ∪ {[ψ]}] Lemma 5.4.1 (4) (5.79)

↓ h ↑ h ↓ ii = R I I [{[ϕ]} ∪ {[ψ]}] Lemma 5.4.4 (5.80)

↓ h ↑ h ↓ ↓ ii = R I I [{[ϕ]}] ∩ I [{[ψ]}] Lemma 5.4.1 (4) (5.81)

↓ h ↑ i = R I [[[ϕ]]∩ [[ψ]]] (5.82)

↓ h ↑ i = R I [[[ϕ ∧ ψ]]] Denition of [[·]] (5.83) = [[ (ϕ ∧ ψ)]]. Denition of [[·]] (5.84)

Identity 5.82 follows from the fact that V (ϕ),V (ψ) are formal concepts. 

Proposition 5.4.14 (Soundness). For any compositional model Í, Ó 1. Í |= C (ϕ) ` {i ϕ ∧ i C (ϕ) | i ∈ Ag};

Ó Ó 2. if Í |= χ ` i ∈Ag i ϕ and Í |= χ ` i ∈Ag i χ, then Í |= χ ` C (ϕ).

Proof. With regard to the rst item, by denition and by Lemma 5.4.12 (2) ↓ Ñ ↓ Ñ ↓ it holds that [[C (ϕ)]] = R [{[ϕ]}] = s ∈S Rs [{[ϕ]}] ⊆ i ∈Ag R [{[ϕ]}], which Ó C i proves Í |= C (ϕ) ` {i ϕ | i ∈ Ag}. Let i ∈ Ag. The following chain of 170 Toward an Epistemic Logic of Categories

(in)equalities completes the proof: h h ii [[ ( )]] ↓ ↑ ↓ [{[ ]}] [[·]] i C ϕ = Ri I RC ϕ Denition of (5.85) " " ## Ù ↓ ↑ ↓ [{[ ]}] = Ri I Rs ϕ Lemma 5.4.12 (2) (5.86) s ∈S " " ## Ù h h ii ↓ ↑ ↓ ↑ ↓ [{[ ]}] = Ri I I I Rs ϕ (5.87) s ∈S " " " ### Ø h i ↓ ↑ ↓ ↑ ↓ [{[ ]}] = Ri I I I Rs ϕ Lemma 5.4.1 (4) (5.88) s ∈S " # Ø h i ↓ ↑ ↓ [{[ ]}] = Ri I Rs ϕ Lemma 5.4.4 (5.89) s ∈S Ù h h ii 5 ↓ ↑ ↓ [{[ ]}] = Ri I Rs ϕ Lemma 5.4.1 (4) (5.90) s ∈S Ù ↓ [{[ ]}] = Ri s ϕ Lemma 5.4.9 (5.91) s ∈S Ù ↓ ⊇ Rs [{[ϕ]}]{i s | s ∈ S } ⊆ S (5.92) s ∈S = [[C (ϕ)]]. Lemma 5.4.12 (2) (5.93)

↓ Identity 5.87 follows from the fact that Rs [{[ϕ]}] is Galois-stable. With regard to the second item, using Proposition 5.4.13 (1) and the assumptions, one can show that Í |= χ ` sϕ for every s ∈ S. Therefore, [[ ]] ⊆ Ñ ↓[{[ ]}] ↓ [{[ ]}] [[ ( )]] χ s ∈S Rs ϕ = RC ϕ = C ϕ , as required. 

5.4.4. Completeness The completeness of L is proven via a standard canonical model construc- tion. For any lattice Ì with normal operators i , let ¿Ì = (ÐÌ, {Ri | i ∈ Ag}) be dened as follows: ÐÌ = (A, X , I ) where A (resp. X ) is the set of lat- tice lters (resp. ideals) of Ì, and aI x i a ∩ x , ∅. For every i ∈ Ag, let Ri ⊆ A × X be dened by aRi x i i u ∈ a for some u ∈ Ì such that u ∈ x. In what follows, for any a ∈ A and any x ∈ X , we let i x B {i u ∈ Ì | u ∈ x } −1 { ∈ | ∈ } ↓[ ] { | ∩ } and i a B u Ì i u a . Hence, by denition, Ri x = a a i x , ∅ ∈ ↑[ ] { | ∩ −1 } ∈ for any x X and Ri a = x x i a , ∅ for any a A. Notice also that −1 i > = > implies that i a , ∅ for every a ∈ A.

Lemma 5.4.15. For ¿Ì dened as above and any a ∈ A, x ∈ X and i ∈ Ag, Soundness and Completeness 171

h i ↑ ↓ [ ] { ∈ | ⊆ } 1. I Ri x = y X i x y ;

↓  −1 2. I [Ri [a]] = b ∈ A | i a ⊆ b ;

h h ii ↓ ↑ ↓ [ ] { ∈ | ∩ } ↓ [ ] 3. I I Ri x = b A i x b , ∅ = Ri x ;

h h ii ↑ ↓ ↑ [ ]  ∈ | −1 ∩ ↑ [ ] 4. I I Ri a = y X i a y , ∅ = Ri a .

Proof. The rst and second items immediately follow from the denitions of −1 i x and i a. As to the third and fourth items, from the previous two items it ↓[ ↑[ ↓[ ]]] { ∈ | d e ∩ } ↑[ ↓[ ↑[ ]]] { ∈ follows that I I Ri x = b A i x b , ∅ and I I Ri a = y −1 −1 X | bi ac ∩ y , ∅}, where di xe and bi ac respectively denote the ideal −1 generated by i x and the lter generated by i a. Via the monotonicity of 5 { ∈ | d e∩ } { ∈ | ∩ } ↓[ ] i , one can show that b A i x b , ∅ = b A i x b , ∅ = Ri x , and using the meet preservation of i , one can further show that {y ∈ X | b −1 c ∩ } { ∈ | −1 ∩ } ↑[ ] i a y , ∅ = y X i a y , ∅ = Ri a , as required. Notice that −1 the last equality holds for every a ∈ A under the assumption that i a , ∅. As remarked above, this is guaranteed by i being normal. 

The third and fourth items of the lemma above immediately imply that:

Lemma 5.4.16. ¿Ì is a compositional enriched formal context (cf. Denition 5.4.6).

Recall that S is the set of non-empty nite sequences of elements of Ag.

Lemma 5.4.17. If x is the ideal generated by some u ∈ Ì, then, for every ↓ s ∈ S, Rs [x] = {a | su ∈ a}.

∈ ↓[ ] Proof. By induction on the length of s. If s = i then aRi x i a Ri x i a ∩ i x , ∅. Because x is the ideal generated by u, we have that u is the greatest element of x. Hence, the monotonicity of i implies that i u is the greatest element of i x. Because a is a lter, and thus it is upward-closed, a ∩ i x , ∅ is equivalent to i u ∈ a, which yields proof of the base case of ↓ the induction. Let us now assume that Rs [x] = {b ∈ A | su ∈ b} and show ↓ [ ] { ∈ | ∈ } that Ri s x = b A i su b . By Lemma 5.4.15 (3, 4) and Lemma 5.4.7, Rs 172 Toward an Epistemic Logic of Categories

is I -compatible for every s ∈ S. Now let z be the ideal generated by su. h h ii ↓ [ ] ↓ ↑ ↓ [ ] Ri s x = Ri I Rs x Lemmas 5.4.4 and 5.4.9 (5.94) h i ↓ ({ ∈ | ∈ })↑ = Ri b A su b Induction hypothesis (5.95) ↓ [{ ∈ | ∈ }] = Ri y X su y (5.96) ↓ [ ] = Ri z Denition of z (5.97) = {a | i su ∈ a} Base case (5.98) = {a | i su ∈ a} . Denition of i s (5.99)

Identity 5.96 follows from the fact that the lter generated by su is the ↓ smallest element of Rs [x]. 

The canonical enriched formal context is dened by instantiating the 5 construction above to the Lindembaum-Tarski algebra of L. In this case, let V be the valuation such that [[p]] (resp. {[p]}) is the set of the lters (resp. ideals) to which p belongs, and let Í = (¿L,V ) be the canonical model. Lemma 5.4.18 (Truth Lemma). For every ϕ ∈ L, 1. Í, a ϕ i ϕ ∈ a; 2. Í, x ϕ i ϕ ∈ x.

Proof. By induction on ϕ. We only show the inductive step for ϕ B i σ. ∈ ↓ [{[ ]}] Í, a i σ i a Ri σ (5.100) ∈ ↓ [{ | ∈ }] i a Ri x σ x Induction hypothesis (5.101) i a ∈ {b ∈ A | i σ ∈ b} Denition of Ri (5.102) i i σ ∈ a. (5.103)

Í, x i σ i x ∈ {[i σ]} (5.104) ↑ i x ∈ [[i σ]] (5.105) ↑ i x ∈ {a ∈ A | i σ ∈ a} Proof above (5.106) i i σ ∈ x. (5.107)



The weak completeness of L follows from the lemma above with the usual argument. Soundness and Completeness 173

Proposition 5.4.19 (Completeness). If ϕ ` ψ is an L-sequent which is not derivable in L, then Í 6|= ϕ ` ψ.

The weak completeness for LC is proven along the lines of Fagin, Halpern, Moses, and Vardi [50, Theorem 3.3.1]. Namely, for any LC -sequent ϕ ` ψ that is not derivable in LC , we construct a nite model Íϕ,ψ such that Íϕ,ψ 6|= ϕ ` ψ. Let Φ0 be the set whose elements are >, ⊥, and all the subformulas of ϕ and of ψ. Let Ø Φ1 B Φ0 ∪ {i σ | σ ∈ Φ0} , (5.108) i ∈Ag nÛ o Φ B Ψ | Ψ ⊆ Φ1 . (5.109)

By construction, Φ is nite. Consider the canonical model Í dened above, and the following equivalence relations on A and X : a ≡Φ b i a ∩ Φ = b ∩ Φ, (5.110) 5 x ≡Φ y i x ∩ Φ = y ∩ Φ. (5.111) Because Φ is nite, these equivalence relations induce nitely many equiv- alence classes on A and X . In particular, considering ` as a pre-order on Φ, each element a of A/≡Φ is uniquely identied by some Φ-lter, i.e., a `-upward closed subset of Φ that is also closed under existing conjunctions. Analogously, each x ∈ X /≡Φ is uniquely identied by some Φ-ideal, i.e., a `-downward closed subset of Φ that is also closed under existing dis- junctions. In addition, because Φ is closed under conjunctions, the Φ-lter corresponding to each a is principal, i.e.for each a ∈ A/≡Φ, some τa ∈ Φ exists such that a can be identied with the set of the formulas σ ∈ Φ such that τa ` σ is an LC -derivable sequent. In what follows, we abuse notation and let a and x respectively denote the principal Φ-lter and the Φ-ideal with which a and x can be identied, as ∗ discussed above. With this convention we can write i x B {i σ | σ ∈ x }∩Φ −1 ∗ and (i ) a B {τ ∈ Φ | i τ ∈ a}. As a consequence of ⊥, > ∈ Φ0, and ∗ −1 ∗ i > = >, we have that i x and (i ) a are always non-empty. Let us dene:   /≡ /≡ ϕ,ψ Íϕ,ψ = A Φ, X Φ, Iϕ,ψ, Ri ,Vϕ,ψ , (5.112) where

aIϕ,ψ x i a ∩ x , ∅ (5.113)

i τa ∈ x, (5.114) ϕ,ψ ∗ ∩ aRi x i i x a , ∅ (5.115) i τa ` i τ is LC -derivable for some τ ∈ x, (5.116) 174 Toward an Epistemic Logic of Categories

and Vϕ,ψ is any valuation such that [[p]] = {a | p ∈ a} and {[p]} = {x | p ∈ x } for all p ∈ Prop ∩ Φ. Henceforth, we abbreviate Iϕ,ψ as I wherever possible. It readily follows from this denition that [[p]]↑↓ = [[p]] and {[p]}↓↑ = {[p]} for ∈ ∩ ( ϕ,ψ )↓[ ] { | ∩ ∗ } ( ϕ,ψ )↑[ ] any p Prop Φ. Moreover, Ri x = a a i x , ∅ and Ri a = −1 ∗ {x | x ∩ (i ) a , ∅}. From this, analogously to Lemma 5.4.15, it follows that: Lemma 5.4.20. For any a, x and i ∈ Ag,   ↓  ↑ ϕ,ψ [ ]  | ∗ ⊆ 1. Iϕ,ψ Ri x = y i x y ;

↓ [( ϕ,ψ )↑[ ]] { ∈ | ( −1)∗ ⊆ } 2. Iϕ,ψ Ri a = b A i a b ;    ↓  n o   ↓ ↓ ↑ ϕ,ψ [ ] | ∗ ∩ ϕ,ψ [ ] 3. Iϕ,ψ Iϕ,ψ Ri x = b i x b , ∅ = Ri x ;

   ↑  ↑ ↓ ϕ,ψ [ ]  | −1∗ ∩ ↑ [ ] 5 4. Iϕ,ψ Iϕ,ψ Ri a = y i a y , ∅ = Ri a .

The third and fourth items of the lemma immediately imply that: ϕ,ψ ∈ Lemma 5.4.21. Ri is Iϕ,ψ -compatible for any i Ag. The following is key to the proof of the Truth Lemma (5.4.18).

Lemma 5.4.22. If C (σ) ∈ Φ, then the following is an LC -derivable sequent for any i ∈ Ag: Ü Ü ` © ª τa i ­ τa ® . (5.117) a ∈[[C (σ)]] a ∈[[C (σ)]] « ¬ Proof. Fix i ∈ Ag and a ∈ [[C (σ)]]. Because i is monotone, it is enough to show that some τ ∈ Φ exists such that Ü τa ` i τ and τ ` τa . (5.118) a ∈[[C (σ)]] ϕ,ψ ϕ,ψ By the denition of Ri , this is equivalent to showing that aRi y , where Ô ↑ y is the Φ-ideal generated by a ∈[[C (σ)]] τa . Note that {[C (σ)]} = [[C (σ)]] is

the collection of all the Φ-ideals x such that τb ∈ x for every b ∈ [[C (σ)]]. Hence, y ∈ {[C (σ)]} (and it is in fact the smallest element in {[C (σ)]}). Thus, ϕ,ψ [[ ( )]] ⊆ ( ϕ,ψ )↓[{[ ( )]}] to prove that aRi y , it is enough to show that C σ Rs C σ . ϕ,ψ ↓ This immediately follows from the fact that (Rs ) [{[C (σ)]}] = [[i C (σ)]], that C (σ) ` i C (σ) is an LC -derivable sequent, that LC is sound with respect to compositional models (cf. Proposition 5.4.14), and that Íϕ,ψ is a compositional model (cf. Lemma 5.4.21).  Soundness and Completeness 175

Lemma 5.4.23 (Truth Lemma). For every τ ∈ Φ0,

1. Mϕ,ψ, a τ i τ ∈ a;

2. Mϕ,ψ, x τ i τ ∈ x.

Proof. As to the rst item of the lemma, we only show the inductive step for τ B C (σ) for some σ ∈ Φ0. If Mϕ,ψ, a C (σ), i.e., a ∈ [[C (σ)]] = Ñ ( ϕ,ψ )↓[{[ ]}] ∈ ( ϕ,ψ )↓[{[ ]}] [[ ]] ∈ s ∈S Rs σ , then a Ri σ = i σ for any i Ag. By deni- tion, σ ∈ Φ0 implies that i σ ∈ Φ. Moreover,   ↓ ∈ [[ ]] ∈ ϕ,ψ [{[ ]}] a i σ i a Ri σ (5.119)   ↓ ∈ ϕ,ψ [{ | ∈ }] i a Ri x σ x Induction hypothesis (5.120) n o ∈ | ∈ ϕ,ψ i a b i σ b Denition of Ri (5.121) 5

i i σ ∈ a, (5.122) Ó which implies that τa ` i ∈Ag i σ. By Lemma 5.4.22 and the fact that LC is closed under the following rule: Ó χ ` i ∈Ag i ϕ {χ ` i χ | i ∈ Ag} , (5.123) χ ` C (ϕ) we conclude that τa ` C (σ), i.e., C (σ) ∈ a. For the reverse direction, let b be the principal Φ-lter generated by ϕ,ψ ↓ C (σ). Let us show, by induction on the length of s, that b ∈ (Rs ) [{[σ]}] for all s ∈ S. Indeed, for the base case, i σ ∈ Φ and C (σ) ` i σ being an ∈ ∈ ( ϕ,ψ )↓[{[ ]}] LC -derivable sequent imply that i σ b, and thus b Ri σ . For ϕ,ψ ↓ the inductive step, assume that b ∈ (Rs ) [{[σ]}]. Then, every element of ↑ ϕ,ψ ↓ I [(Rs ) [{[σ]}]] contains C (σ). Moreover, we have that i C (σ) ∈ b because i C (σ) ∈ Φ and C (σ) ` i C (σ) is an LC -derivable sequent. Therefore, by Lemma 5.4.9, we have the following:   ↓ h h ii   ↓ ∈ ϕ,ψ ↑ ϕ,ψ  ↓ [{[ ]}] ϕ,ψ [{[ ]}] b Ri I Rs σ = Ri s σ , (5.124)

ϕ,ψ ↓ which concludes the proof that b ∈ (Rs ) [{[σ]}] for all s ∈ S. To nish the ϕ,ψ ↓ proof, for any a, if C (σ) ∈ a, then b ⊆ a. Because (Rs ) [{[σ]}] is Galois- ϕ,ψ ↓ stable for any s ∈ S, this implies that a ∈ (Rs ) [{[σ]}] for every s ∈ S. This goes to show that Mϕ,ψ, a C (σ). 176 Toward an Epistemic Logic of Categories

As to the second item of the lemma,

↑ Mϕ,ψ, x C (σ) i x ∈ [[C (σ)]] (5.125) i x ∩ a , ∅ for all a ∈ [[C (σ)]] (5.126) i C (σ) ∈ x. (5.127)



The weak completeness of LC follows from the lemma above with the usual argument.

Proposition 5.4.24 (Completeness). If ϕ ` ψ is an LC -sequent that is not derivable in LC , then Mϕ,ψ 6|= ϕ ` ψ.

5 5.5. Proposed Formalizations Having dened our epistemic-logical language and oered proofs of its soundness and completeness, we now turn to demonstrating its usefulness for the interdisciplinary research on categories by proposing formalizations of some of the most important notions about categorization in the social sciences. Restricting our focus to the extant literature in sociology [22] and management science [23], our choice falls on the notions of typicality [e.g., 14], contrast [e.g., 18], leniency [e.g., 21], and similarity [e.g., 16]. These repre- sent important theoretical constructs that exert demonstrable inuence on economic and strategic decision-making (cf. Chapters2 and3). By now, they have entered the common lexicon of organization theory. In what follows, we propose formalizations for these constructs using the languages L, LC , and LH previously dened in this chapter. These formalizations should not be interpreted as guidelines on how to measure these constructs empiri- cally, as they rely on information concerning what agents believe about (or how they perceive) objects and categories that may be dicult to acquire outside experimental settings. Their aim is rather to capture the qualitative content of these constructs: in this sense, they are instruments of theory construction, not empirical inquiry.

Typicality. The issue of whether an object a is a representative member of a category ϕ, that is, the extent to which a is typical of ϕ, is key to the similarity-based perspectives on categorization [27, 32]. By extension, it is also central to the literature on conceptual spaces [42, 43, 53], where prototypes play a critical role in determining category memberships [46]. In this literature, the prototype of a category (or concept) is dened as the Proposed Formalizations 177 geometric center of that category: the closer an object is to this point, the greater its typicality as a member of the category (or an instance of the concept). Although FCA is not as naturally suited as conceptual spaces to reason about metric distance,5 the notion of typicality can be captured through the common knowledge-type construction introduced above. The interpretation of C -formulas on models is such that, for every category ϕ, the members of C (ϕ) are those that belong to ϕ according to every agent; moreover, every agent believes that these objects belong to ϕ according to every other agent. This motivates our proposal to regard the members of C (ϕ) as the prototypical members of ϕ. In geometric terms, C (ϕ) is the center of ϕ [cf. 22, p. 42]. The main advantage of this approach is that it directly links typicality to the agents’ perception or beliefs, so that consistently with probabilistic perspectives [e.g., 56], the prototypical objects are those for which membership in the category is never in question. Compare this interpretation of C to the one we proposed 5 in Chapter4: in that case, the operator was used to characterize the outcome of social interaction, whereby the extension and intension of a particular category are agreed upon by the agents. Our current proposal enhances this epistemic interpretation by tying it directly to the psychological [57] and sociological [14] notion of typicality. This is consistent with a probabilistic approach [e.g., 22, p. 40–42] inasmuch as the prototypes of a category (or concept) are those for which the likelihood of being considered a member of the category (or an instance of the concept) is one. There is a number of reasons why an object may fail to be considered a prototypical member of ϕ, the most severe being that some agents do not recognize its membership in ϕ. Alternatively, it may occur that some agents do not believe that every other agent recognizes the object as a member of ϕ. This provides a purely qualitative route to encoding the gradedness of category memberships. That is, the typicality of two objects a and b, represented in the language LH as nominal variables, can be compared in terms of the minimum number of epistemic iterations between the agents that are needed for their typicality test to fail, so that b can be regarded as more typical than a if more iterations are needed for b than for a. This denition can also be used to compare objects that belong to dierent categories, so as to say that b is more typical of ψ than a is of ϕ.

Similarity. The extent to which two categories are similar to one another can be dened in multiple ways. One approach, very naturally applicable to

5It is still suitable to reason about distance [e.g., 54, 55], but not directly from a geometric standpoint [cf. 16]. 178 Toward an Epistemic Logic of Categories

conceptual spaces [e.g., 38], is to calculate the Hausdor distance between the subsets of space that correspond to the two categories, which is equal to the maximum of the two minimal point-to-set distances. Another approach, distinctly set-theoretic in nature, is to use a Jaccard coecient [e.g., 16, 41, 58], which is a ratio of the number of objects in the categories’ intersection over the number of objects in their union. In our algebraic framework, the rst method is not easy to implement but, compatibly with the second, the fact that two categories can have a greater vs. a lower number of members (or features) in common allows us to dene similarity based on the overlap between the categories’ extensions (or descriptions) [cf. 54, 55]. For four categories ϕ, ψ, χ, ξ, we say that ϕ is more similar to ψ than χ is to ξ by means of the sequent ϕ ∨ ψ ` ξ ∨ χ, the sequent ξ ∧ χ ` ϕ ∧ ψ, or by requiring the two sequents to hold simultaneously. The rst sequent intuitively means that ϕ and ψ have more features in common than ξ and χ 5 have. The second means that ϕ and ψ have more members in common than ξ and χ have. Because neither sequent implies nor is implied by the other, it can be useful to consider the information encoded in both, and it may be an interesting question to address in empirical research whether the implications of similarity are identical in these two cases. When instantiated to ϕ = ξ, these conditions can be used to express that ϕ is more similar to ψ than it is to χ.

Contrast. The contrast of a category indicates the extent to which the cat- egory stands out from other categories in the same domain [59]. Intuitively, it quanties the sharpness of category boundaries and it is empirically dened as a function of the typicality of category members [e.g., 19]. In a high-contrast category, objects are considered either very typical mem- bers or not members at all, wheraes in a low-contrast one small grades of memberships are frequent [14]. Contrast is an important property of categories to account for in sociological and management research because the sharpness of category boundaries aects the appeal of product and organizations for their target audiences. In fact, objects that neatly t into sharply bounded categories tend to be more visible and more positively valued [18], whereas those that straddle such boundaries (cf. Chapter3) tend to incur greater penalties [17]. We propose a denition of contrast that relies on the same common knowledge-type construction underpinning the denition of typicality. If ϕ ` C (ϕ) holds for a category ϕ then every member of ϕ is a prototypical member of ϕ in the sense discussed above, and therefore ϕ has maximal contrast. The contrast of ϕ gradually decreases as the category includes Proposed Formalizations 179 more and more members that do not belong to C (ϕ). Using the formal denitions of typicality and similarity, for any two categories ϕ and ψ we say that ϕ has higher contrast than ψ if ϕ is more similar to C (ϕ) than ψ is to C (ψ). That is, we require that ϕ ∨ C (ϕ) ` ψ ∨ C (ψ), that ψ ∨ C (ψ) ` ϕ ∨ C (ϕ), or that both sequents hold simultaneously.

Leniency. By the common sociological denition of contrast [14], the ob- jects in a high-contrast category are highly typical members of the category on average; conversely, the objects in a low-contrast category tend to be more atypical. This can occur either if there are many other categories where the objects also have some positive grade of membership, or if there are few such categories (possibly none). It is important to distinguish between these two cases, as this is indicative of the category’s capacity to accommodate deviations. The notion of leniency serves exactly this purpose [23]: given a category, its leniency is dened as the extent to which 5 the category members tend to belong to additional categories in the same domain. Like contrast, this is an important property to consider in orga- nizational research because a category’s leniency can have an eect on economic [20] and strategic decisions [21]. We say that a category ϕ is non-lenient if its members do not simulta- neously belong to any other category in the formal context. This property is captured by the following condition: for any ψ and χ, if ψ ` ϕ and ψ ` χ, then either ϕ ` χ or χ ` ϕ. To better understand this condition, let us instantiate ψ as the nominal category a, i.e., the category generated by the object a. The sequent a ` ϕ means that a is a member of ϕ, and the non-leniency of ϕ requires that a does not belong to any other category. However, the order-theoretic nature of our logical framework requires a to be a member of every category χ such that ϕ ` χ. As a result, a must at least belong to these logically unavoidable categories. All the categories χ such that a ` χ ` ϕ cannot be excluded either, because the possibility that intermediate categories exist does not solely depend on a and ϕ but also on other objects and features in the formal context. Non-leniency implies that no other categories have a as a member than those within this minimal set of categories that cannot be excluded. For example, a member of the category chamber music is necessarily a member of the superordi- nate category of classical music, but this does mean that chamber music is lenient. Similarly, the fact that some members of classical music also belong to chamber music does not aect the leniency of classical. This denition can also be used for the purpose of comparison. For any two categories ϕ and ψ, we say that ϕ is more lenient than ψ if, for every 180 Toward an Epistemic Logic of Categories

nominal a, if a ` ψ and a ` χ for some χ such that χ 0 ψ and ψ 0 χ, then a ` ϕ; moreover, a ` ξ for some category ξ such that ξ 0 ϕ and ϕ 0 ξ. These conditions can be alternatively given in terms of features (using co-nominal variables) and in terms of modal operators.

5.6. Discussion This chapter presented a basic epistemic logic of categories, expanded it with a common knowledge-type construction and hybrid-style variables, proved its soundness and completeness, and deployed the resulting for- mal machinery to capture some of the most fundamental notions about categories in sociology and management science. The characteristics of our logical formalism are well-suited to reason about the hierarchical nature of classication systems, and in addition to factual information about objects 5 and features, it can represent the subjective perspectives of individual agents as well as social interaction. The propositional base of our logic is the positive (i.e., negation-free and implication-free) fragment of classical propositional logic, which is devoid of distributivity laws. The Kripke-style semantics are given by structures known as formal contexts in FCA [47], which we enriched with binary relations to account for the epistemic inter- pretation of the modal operators. An important dierence between this semantics and the usual Kripke semantics for epistemic logics is that the relations directly encode the viewpoint of the individual agents. The starting point of this chapter was the observation that logic can decisively contribute to the growing research on categories in the social sciences, especially with regard to the analysis of various types of social in- teraction (e.g., epistemic, dynamic, strategic). The prospective contributions are both technical and conceptual in nature. From a technical perspective, this chapter rened the foundational work presented in Chapter4 linking epistemic logic and FCA. This preliminary work oered an intuitive explana- tion of the denition of the interpretation clauses of L-formulas on certain enriched formal contexts, whereas the treatment we presented here adapts these clauses to the more general and natural class of arbitrary (enriched) formal contexts. One of the novel aspects of this proposal is that the typi- cality of objects is captured via a common knowledge-type operator that is semantically equivalent to the usual greatest xed point construction. This paves the way to the use of logical languages expanded with xed point operators to model the more subtle aspects of agents’ knowledge, such as introspection. In addition, our framework makes it possible to blend syllogistic and epistemic reasoning. Specic proof calculi are needed to Discussion 181 further analyze the aspects connected to reasoning and deduction in L and LC , and to explore the computational properties of these logics. As these calculi can be used to draw conclusions from formal inferences, they can also be helpful to empirical research on categories by allowing the derivation of testable hypotheses. From a more conceptual standpoint, the formalism introduced in Chap- ter4 and further rened in this chapter can be instrumental to the growing research on categories in sociology and management science. An adequate account of the dynamic nature of categories is one of the major challenges faced by researchers in these elds [13, 23]: by enabling classication sys- tems to be represented as partial orders where changes due to the addition of new products or features can be formally computed, our framework can illuminate the mechanisms whereby new market categories emerge [60–62] or disappear [2, 63]. Another challenge is that categories both shape and are shaped by social interaction [64]. While this bidirectional causality is 5 empirically problematic to account for, it is theoretically interesting and arguably essential to understand just how categories contribute to the functioning of markets. A compelling direction for further study concerns precisely the way categories aect agents’ patterns of interaction [cf. 65], and how these patterns, in turn, modify the agents’ beliefs. An important step in this direction would be to expand our framework with dynamic modalities and to extend the construction of dynamic updates [66, 67] to models based on enriched formal contexts. Recent work [68] has shown that these dynamic formalisms can also be incorporated in settings where the agents’ beliefs are probabilistic [e.g., 22]. Another avenue for further study relates to the observation that cat- egorizing agents, such as investors, analysts, or consumers in a market, tend to be motivated by dierent goals [69]. The needs and desires agents have in mind when sorting products or organizations can inuence the features they nd worthy to consider [70, 71]. Moreover, goals can play a crucial role in category emergence by shaping consensus about which objects deserve membership in a new category [72]. As explained in Chapter 3, it can be dicult to properly account for goal-based categories given their lack of a prototypical structure [26]: extant formal approaches that build on similarity-based views of categorization, like conceptual spaces, are not necessarily the best option. The framework we proposed can be much more suitable because, while it takes prototypes into account by allowing formal concepts to be generated by specic objects, it does not assume prototypes to serve as the sole or even the most important drivers of category generation [cf. 46]. Indeed, formal concepts can also coalesce 182 Toward an Epistemic Logic of Categories

around specic features or arbitrary combinations thereof: for this reason, our logic holds promise to reconcile dierent theories of categorization.

5.7. References [1] B. A. Davey and H. A. Priestley, Introduction to Lattices and Order (Cam- bridge University Press, New York, NY, 2002). [2] M. T. Kennedy and P. C. Fiss, An ontological turn in categories research: From standards of legitimacy to evidence of actuality, Journal of Man- agement Studies 50, 1138 (2013). [3] W. Croft, Syntactic Categories and Grammatical Relations: The Cognitive Organization of Information (University of Chicago Press, Chicago, IL, 1991). [4] D. N. Osherson, E. E. Smith, O. Wilkie, A. López, and E. Shar, Category- 5 based induction, Psychological Review 97, 185 (1990). [5] J. Feldman, Minimization of Boolean complexity in human concept learning, Nature 407, 630 (2000). [6] G. L. Murphy, The Big Book of Concepts (MIT Press, Cambridge, MA, 2002). [7] C. M. Bishop, Pattern Recognition and Machine Learning (Springer, Berlin/Heidelberg, Germany, 2006). [8] T. Joachims, Text categorization with Support Vector Machines: Learning with many relevant features, in Machine Learning, Lecture Notes in Computer Science, Vol. 1398, edited by C. Nédellec and C. Rouveirol (Springer, Berlin/Heidelberg, Germany, 2005) pp. 137–142. [9] J. F. Sowa, Knowledge Representation: Logical, Philosophical, and Com- putational Foundations (Brooks/Cole, Pacic Grove, CA, 1999). [10] J. A. Rosa, J. F. Porac, J. Runser-Spanjol, and M. S. Saxon, Sociocognitive dynamics in a product market, Journal of Marketing 63, 64 (1999). [11] J. F. Porac and H. Thomas, Taxonomic mental models in competitor denition, Academy of Management Review 15, 224 (1990). [12] R. K. Reger and A. S. Hu, Strategic groups: A cognitive perspective, Strategic Management Journal 14, 103 (1993). [13] G. Cattani, J. F. Porac, and H. Thomas, Categories and competition, Strategic Management Journal 38, 64 (2017). [14] M. T. Hannan, Partiality of memberships in categories and audiences, Annual Review of Sociology 36, 159 (2010). [15] M. D. Leung, Dilettante or Renaissance person? How the order of job experiences aects hiring in an external labor market, American Socio- logical Review 79, 136 (2014). References 183

[16] B. Kovács and M. T. Hannan, Conceptual spaces and the consequences of category spanning, Sociological Science 2, 252 (2015). [17] B. Kovács and M. T. Hannan, The consequences of category spanning depend on contrast, in Categories in Markets: Origins and Evolution, Research in the Sociology of Organizations, Vol. 31, edited by G. Hsu, G. Negro, and Ö. Koçak (Emerald Group, Bingley, United Kingdom, 2010) pp. 175–201. [18] G. Negro, M. T. Hannan, and H. Rao, Categorical contrast and audience appeal: Niche width and critical success in winemaking, Industrial and Corporate Change 19, 1397 (2010). [19] J. G. Kuilman and F. C. Wezel, Taking o: Category contrast and or- ganizational mortality in the U.K. airline industry, 1919–64, Strategic Organization 11, 56 (2013). [20] E. G. Pontikes, Two sides of the same coin: How ambiguous classi- cation aects multiple audiences’ evaluations, Administrative Science 5 Quarterly 57, 81 (2012). [21] E. G. Pontikes and W. P. Barnett, The persistence of lenient market categories, Organization Science 26, 1415 (2015). [22] M. T. Hannan, G. Le Mens, G. Hsu, B. Kovács, G. Negro, L. Pólos, E. G. Pontikes, and A. J. Sharkey, Concepts and Categories: Foundations for Sociological Analysis, unpublished manuscript (2017). [23] J.-P. Vergne and T. Wry, Categorizing categorization research: Review, integration, and future directions, Journal of Management Studies 51, 56 (2014). [24] E. E. Smith and D. L. Medin, Categories and Concepts (Harvard University Press, Cambridge, MA, 1981). [25] D. H. Fisher, Knowledge acquisition via incremental conceptual cluster- ing, Machine Learning 2, 139 (1987). [26] L. W. Barsalou, Ideals, central tendency, and frequency of instantiation as determinants of graded structure in categories, Journal of Experi- mental Psychology: Learning, Memory, and Cognition 11, 629 (1985). [27] E. H. Rosch, Principles of categorization, in Cognition and Categoriza- tion, edited by E. H. Rosch and B. B. Lloyd (Lawrence Erlbaum Associates, Hillsdale, NJ, 1978) pp. 27–48. [28] E. H. Rosch, Cognitive reference points, Cognitive Psychology 7, 532 (1975). [29] E. H. Rosch and C. B. Mervis, Family resemblances: Studies in the internal structure of categories, Cognitive Psychology 7, 573 (1975). [30] E. E. Smith and D. L. Medin, The exemplar view, in Concepts: Core Readings, edited by E. Margolis and S. Laurence (MIT Press, Cambridge, 184 Toward an Epistemic Logic of Categories

MA, 1999) pp. 207–221. [31] D. L. Medin and E. E. Smith, Concepts and concept formation, Annual Review of Psychology 35, 113 (1984). [32] D. L. Medin, Concepts and conceptual structure, American Psychologist 44, 1469 (1989). [33] N. H. Feldman, T. L. Griths, and J. L. Morgan, The inuence of categories on perception: Explaining the perceptual magnet eect as optimal statistical inference, Psychological Review 116, 752 (2009). [34] D. N. Osherson and E. E. Smith, On the adequacy of prototype theory as a theory of concepts, Cognition 9, 35 (1981). [35] G. L. Murphy and D. L. Medin, The role of theories in conceptual coher- ence, Psychological Review 92, 289 (1985). [36] L. W. Barsalou, Ad hoc categories, Memory & Cognition 11, 211 (1983). [37] L. W. Barsalou, Deriving categories to achieve goals, in The Psychology 5 of Learning and Motivation: Advances in Research and Theory, Vol. 27, edited by G. H. Bower (Academic Press, Cambridge, MA, 1991) pp. 1–64. [38] E. G. Pontikes and M. T. Hannan, An ecology of social categories, Socio- logical Science 1, 311 (2014). [39] G. Le Mens, M. T. Hannan, and L. Pólos, Organizational obsolescence, drifting tastes, and age dependence in organizational life chances, Organization Science 26, 550 (2015). [40] G. Le Mens, M. T. Hannan, and L. Pólos, Age-related structural inertia: A distance-based approach, Organization Science 26, 756 (2015). [41] A. Goldberg, M. T. Hannan, and B. Kovács, What does it mean to span cultural boundaries? Variety and atypicality in cultural consumption, American Sociological Review 81, 215 (2016). [42] P. Gärdenfors, Conceptual Spaces: The Geometry of Thought (MIT Press, Cambridge, MA, 2000). [43] P. Gärdenfors, The Geometry of Meaning: Semantics Based on Concep- tual Spaces (MIT Press, Cambridge, MA, 2014). [44] J. Zwarts, Conceptual spaces, features, and word meanings: The case of Dutch shirts, in Applications of Conceptual Spaces: The Case for Geometric Knowledge Representation, edited by P. Gärdenfors and F. Zenker (Springer, Berlin/Heidelberg, Germany, 2015) pp. 57–78. [45] P. Gärdenfors, Semantics, conceptual spaces, and the dimensions of music, in Essays on the Philosophy of Music, edited by V. Rantala, L. E. Rowell, and E. Tarasti (Philosophical Society of Finland, Helsinki, Fin- land, 1988) pp. 9–27. [46] P. Gärdenfors and F. Zenker, Conceptual spaces at work, in Applications of Conceptual Spaces: The Case for Geometric Knowledge Representa- References 185

tion, edited by P. Gärdenfors and F. Zenker (Springer, Berlin/Heidelberg, Germany, 2015) pp. 3–13. [47] B. Ganter and R. Wille, Formal Concept Analysis: Mathematical Founda- tions (Springer, Berlin/Heidelberg, Germany, 1999). [48] G. Birkho, Lattice Theory (American Mathematical Society, New York, NY, 1940). [49] E. H. Rosch, C. B. Mervis, W. D. Gray, D. M. Johnson, and P. Boyes-Braem, Basic objects in natural categories, Cognitive Psychology 8, 382 (1976). [50] R. Fagin, J. Y. Halpern, Y. Moses, and M. Y. Vardi, Reasoning About Knowledge (MIT Press, Cambridge, MA, 2003). [51] L. Pólos, M. T. Hannan, and G. Hsu, Modalities in sociological arguments, Journal of Mathematical Sociology 34, 201 (2010). [52] S. M. Fleming, L. T. Maloney, and N. D. Daw, The irrationality of categor- ical perception, Journal of Neuroscience 33, 19060 (2013). [53] D. Widdows, Geometry and Meaning (CSLI Publications, Stanford, CA, 5 2004). [54] A. Formica, Ontology-based concept similarity in Formal Concept Anal- ysis, Information Sciences 176, 2624 (2006). [55] A. Formica, Concept similarity in Formal Concept Analysis: An informa- tion content approach, Knowledge-Based Systems 21, 80 (2008). [56] L. E. Bourne, Typicality eects in logically dened categories, Memory & Cognition 10, 3 (1982). [57] E. H. Rosch, C. Simpson, and R. S. Miller, Structural bases of typicality eects, Journal of Experimental Psychology: Human Perception and Performance 2, 491 (1976). [58] M. Keuschnigg and T. Wimmer, Is category spanning truly disadvan- tageous? New evidence from primary and secondary movie markets, Social Forces 96, 449 (2017). [59] M. T. Hannan, L. Pólos, and G. R. Carroll, Logics of Organization Theory: Audiences, Codes, and Ecologies (Princeton University Press, Princeton, NJ, 2007). [60] M. Khaire and R. D. Wadhwani, Changing landscapes: The construction of meaning and value in a new market category—Modern Indian art, Academy of Management Journal 53, 1281 (2010). [61] C. Navis and M. A. Glynn, How new market categories emerge: Temporal dynamics of legitimacy, identity, and entrepreneurship in satellite radio, 1990–2005, Administrative Science Quarterly 55, 439 (2010). [62] R. Durand and M. Khaire, Where do market categories come from and how? Distinguishing category creation from category emergence, Jour- nal of Management 43, 87 (2017). 186 Toward an Epistemic Logic of Categories

[63] M. T. Kennedy, Y.-C. Lo, and M. Lounsbury, Category currency: The changing value of conformity as a function of ongoing meaning con- struction, in Categories in Markets: Origins and Evolution, Research in the Sociology of Organizations, Vol. 31, edited by G. Hsu, G. Negro, and Ö. Koçak (Emerald Group, Bingley, United Kingdom, 2010) pp. 369–397. [64] Ö. Koçak, M. T. Hannan, and G. Hsu, Emergence of market orders: Audience interaction and vanguard inuence, Organization Studies 35, 765 (2014). [65] N. P. Mark, Birds of a feather sing together, Social Forces 77, 453 (1998). [66] A. A. Kurz and A. Palmigiano, Epistemic updates on algebras, Logical Methods in Computer Science 9, 1 (2013). [67] M. Ma, A. Palmigiano, and M. Sadrzadeh, Algebraic semantics and model completeness for intuitionistic public announcement logic, Annals of Pure and Applied Logic 165, 963 (2014). 5 [68] W. Conradie, S. Frittella, A. Palmigiano, and A. Tzimoulis, Probabilistic epistemic updates on algebras, in Logic, Rationality, and Interaction, Lecture Notes in Computer Science, Vol. 9394, edited by W. van der Hoek, W. H. Holliday, and W.-F. Wang (Springer, Berlin/Heidelberg, Germany, 2015) pp. 64–76. [69] S. Ratneshwar, L. W. Barsalou, C. Pechmann, and M. Moore, Goal-derived categories: The role of personal and situational goals in category representations, Journal of Consumer Psychology 10, 147 (2001). [70] R. Durand and L. Paolella, Category stretching: Reorienting research on categories in strategy, entrepreneurship, and organization theory, Journal of Management Studies 50, 1100 (2013). [71] L. Paolella and R. Durand, Category spanning, evaluation, and perfor- mance: Revised theory and test on the corporate law market, Academy of Management Journal 59, 330 (2016). [72] N. Granqvist and T. Ritvala, Beyond prototypes: Drivers of market cate- gorization in functional foods and nanotechnology, Journal of Manage- ment Studies 53, 210 (2016). 6 Conclusions

187 188 Conclusions

6.1. Summary of Findings This dissertation sought to contribute to the growing research on categories in organization theory by examining how the information encoded by cate- gory labels is decoded by agents for the purpose of decision-making. In Chapters2 and3, we engaged with this question empirically by analyzing how the categories’ internal structures, i.e., the rules underlying objects’ category memberships [1], and their external structures, i.e., their hierarchi- cal ordering into a classication system [2], aect the strategic behavior of organizations and the evaluation of products by customers. These studies equipped us a more thorough understanding of how agents with limited cog- nitive resources use category labels to make better decisions. In Chapters 4 and5, we built on these insights to develop a formal theory of categoriza- tion inspired by Formal Concept Analysis (FCA) [3, 4] and similarly based on the notions of lattices and order [5]. After enriching FCA’s basic framework with additional constructions that allow an epistemic interpretation, we introduced a sound and complete logical language capable of describing categories in terms of their hierarchical linkage and informational content. In the next few paragraphs, we recap the main ndings from these chapters, 6 discuss their implications for the themes outlined in Chapter1, and identify directions for further research. Chapter2 dealt with the consequences of product proliferation. This strategy has been analyzed extensively in industrial economics for its ca- pacity to deter competition in (sub)markets [6]; however, the recurrent suggestion from game-theoretic models that product proliferation is eec- tive at deterring rival product introductions [7–12] was not supported by previous empirical studies [13]. Suspecting this could be due to structural variation across submarkets or categories, we proposed that the complexity of the region of feature space to which a category maps, i.e., the level of heterogeneity in the features of category members [cf. 14], weakens the deterrent power of proliferation strategies and can ultimately cancel their eect on competition. To measure category complexity, we observed the hierarchical relations within the industry’s classication system [cf. 15]: from this perspective, a more complex category is one where the category members are scattered across a more diverse set of subcategories. Consistently with our predictions, quasi-Poisson estimates suggested that the more a category is complex, the less a rm’s strategic behavior aects the positioning choices of rivals. Additional tests revealed that this eect mainly concerns organizations who can safely shift their attention to other regions of the feature space, whereas those that are constrained to particular regions (e.g., by size or specialization) remain undeterred. Summary of Findings 189

There is thus a second reason why product proliferation can sometimes appear to be ineective at keeping rivals at bay, which relates to the ri- vals’ capacity to handle environmental change [16]. Beyond suggesting theoretically-grounded explanations for previous studies’ failure to sup- port the “deterrence hypothesis” [13, p. 149], our study demonstrates that the information encoded into category labels—in this case, the variance of features among category members—aects competitive interactions. In Chapter3, we turned our attention to the eects of category spanning. Organizational research on this topic has been feverish ever since Zucker- man’s compelling study [17]. Although many analyzed the consequences of category spanning before, particularly within the domain of organiza- tional ecology [18], most scholars assumed categories to possess a specic internal structure, namely a prototypical one [19], and did not consider the possibility that categorization in markets may also depend on other criteria [20–23]. Building on the psychological literatures on goal-based cat- egories [1] and cross-classication [24], we dissected the process whereby consumers combine information from multiple (and internally dierent) category labels to derive expectations about products. In particular, we considered the implications of multiple memberships in a system where 6 categories are based on prototypes and one where they are based on con- sumers’ goals. Tracing the negative eect of uncertainty on the perceived value of products [cf. 25, 26], we identied conditions in which multiple category labels enhance or impair the consumers’ ability to make accurate inferences about the products’ location in a feature space. Our mixed-eects regression models returned evidence that having a single category label in each classication system minimizes consumers’ uncertainty about the product’s location and thus maximizes their average evaluation, which is consistent with extant theory [27]. However, we also found evidence that, because of the geometric convexity of type-based categories, it is possible to pinpoint a product’s location if there is a suf- ciently large number of type-based category labels. Further, we found that the actual number of type-based labels required for this geometric derivation decreases with the number of goal-based categories to which the product is assigned. These results point to the possibility of reconciling divergent ndings about the eects of category spanning [e.g., 28, 29] under a relatively parsimonious set of theoretical assumptions. The principal implication of these two empirical chapters is that or- ganization theorists cannot aord to ignore the inuence of categories’ hierarchical arrangement into a classication system, nor the dierent cognitive mechanisms through which category labels encode factual in- 190 Conclusions

formation. Accordingly, in the following chapters we set out to develop a formal theory of categorization that properly captures these key aspects. With regard to formal models, a trend is currently building in sociological research to rely on the notion of conceptual spaces [30, 31] and thus em- brace a geometric approach to category representation in social domains like markets [32–35]. There is great merit to this framework as it allows for an intuitive depiction of categories as multidimensional concepts; however, it is heavily geared toward prototype theory [36] and consequently it is ill-equipped to accommodate other rationales for categorization, such as the goal-driven [37] or the theory-based [38], which can be just as crucial as prototypes to explain agents’ reasoning [39]. Perhaps unbeknown to sociologists, an alternative framework exists that is better suited to account for dierent mechanisms for categorization, i.e., Formal Concept Analysis [4]. This approach is the keystone of our logical formalism. Chapter4 was entirely devoted to connecting the mathematical machin- ery underlying FCA to the theoretical insights on categories in organizational contexts. As discussed in previous research [40], the basic FCA framework already allows for the characterization of categories as elements of a formal 6 context, or concept lattice, by virtue of Birkho’s representation theorem [41]. Our adaptation to organization theory involved the denition of re- lational semantics that enrich the FCA framework with relations used to interpret modal operators on normal lattice expansions. These modalities capture the categories’ informational content by encoding the incomplete, idiosyncratic, and possibly mistaken beliefs of agents vis-à-vis the objects that belong to a given category or, equivalently, the features that the cate- gory members share. We further reasoned that social interaction allows agents in a market to learn about the beliefs of other agents, eventually leading to the identication of a small set of categories whose meaning is consensual. We formalized the emergence of real categories from the much broader set of candidate (i.e., merely possible) categories by means of a xed-point construction that is reminiscent of the denition of common knowledge in classical epistemic logic [42]. Owing to these language enrichments, our formal theory represents market categories as naturally embedded in a hierarchical system, as per the basic FCA framework, and it enables one to describe their meaning in a way that is independent from (but compatible with) the notion of prototypes. In addition, and consistently with extant research [43], our theory describes category emergence as simultaneously contingent on factual information, subjective beliefs or perceptions, and social interaction. In future research, this formalism will be extended so as to allow for epistemic updates [44–46], Implications for the Themes 191 which enables us to capture category emergence in a truly dynamic fashion, compatibly with how this process is described by researchers in sociology [e.g., 47] and management [e.g., 48]. Chapter5 rounded o the formal part of this dissertation by general- izing our theory and developing the epistemic logic associated with our lattice-based framework. After clarifying the semantics, axiomatization, and language enrichments of this logic, we oered proof of its soundness and completeness.1 Our epistemic logic can be fruitfully applied to real- world contexts: to demonstrate the value of this application, we proposed formalizations for four theoretical constructs relevant to the research on categories in sociology [cf. 35] and management science [cf. 49]: typicality, contrast, leniency, and similarity. Given their order-theoretic avor, we be- lieve these formalizations can be useful to generate advanced theory about the role of these constructs in economic and strategic decision-making.

6.2. Implications for the Themes Some general conclusions can be drawn with regard to the four themes presented in Chapter1. The rst theme relates to the principle of cognitive 6 economy, which states that the primary objective of any categorization is to reduce the potentially overwhelming inux of stimuli to behaviorally and cognitively manageable proportions [50]. Consistently with this rule, our empirical results corroborate the argument that rm managers use categories to organize their view of the competitive landscape. To this end, they sort consumers into segments, or niches [51], and products into categories, so as to obtain a cognitive map that facilitates their strategic decisions. At the same time, consumers use category labels to predict the features of new products, hence lowering the cost of the information they require to make optimal purchasing decisions [52]. Even when considering products with multiple labels, their inferences about features obey the principle of cognitive economy. In this sense, our ndings are consistent with a rational model of categorization [53, 54]. We also found support for the notion that, when confronted with multiple labels, consumers reason about the necessity and the impossibility of certain feature combinations [55], looking for an “economical” explanation of their category labels. With respect to our second theme, which concerns the structure of classication systems, our research suggests that agents in a market can exploit their knowledge of the categories’ external [2] and internal struc-

1These two properties of the logic respectively imply that all which can be logically derived within our formal system is true, and that all which is true can be logically derived. 192 Conclusions

tures [1] to make better decisions. In particular, rm managers can leverage the categories’ hierarchical relations to predict whether their competitive actions will exert the eect they intend. Consumers, instead, can consider the rules that govern category memberships to more accurately infer the characteristics of products. When the categories are governed by family resemblance, reasoning about distance and its implications [cf. 33] can be especially useful to derive information about a product’s features. Never- theless, categories that do not possess a prototypical structure can still play a role because they are often deployed in conjunction with those based on prototypes [cf. 24] so as to complement their information. It is impor- tant for organization theorists to take these dierent classication systems into consideration. Admittedly, goal-based categories can be dicult to observe empirically as they are seldom institutionalized and thus, for the most part, they remain implicit; however, social scientists’ familiarization with machine learning [56] will undoubtedly open up unprecedented op- portunities to extract these categories from discourse. On a similar note, goal-based categories can be immensely relevant to innovation research as they have been suggested to drive the consolidation of new markets [22]. 6 Further, they are relevant to the eld of strategic management because their non-prototypical structure can explain why rms that oer relatively dierent products still tend to perceive one another as competitors [57]. Unfortunately, while the limitations of a prototype-centered approach have long been acknowledged in cognitive psychology [38, 39] and consumer research [58], the alternatives proposed over the years remain heavily under-represented in management-related disciplines [20]. The third theme addressed in this dissertation relates to category dy- namics. Sociologists concur that the meaning of market categories can change over time to keep up with shifts in the domain the category la- bels are intended to map [32, 47, 59–61]. This has a twofold implication for organizational research: First, considering the informational content of classication systems as time-variant is necessary to correctly estimate the eects of categorization on competitive and strategic outcomes. In accord with this necessity, the empirical part of our research leaned to- ward a dynamic treatment of category properties, but we also decoupled properties that are subject to periodic change, such as complexity, from those that are relatively enduring, like the presence or the absence of a geometric center. Second, changes in classication systems represent soci- ologically relevant events in and of themselves, and thus they are worthy of researchers’ attention. The formal part of this dissertation aimed to advance our understanding of category emergence precisely by formalizing Implications for the Themes 193 its sociocognitive determinants [43, 62]. While our formalism does not (yet) account for dynamic updates, it already highlights that new categories may arise from changes in agents’ beliefs about the features shared by the objects in the domain—regardless of whether these beliefs are correct—and by their patterns of interaction. Firm managers are well advised to monitor these aspects in the markets or submarkets where they operate, as changes in categories can generate variance to the eects of their strategies and in the perceived value of their oerings. The fourth and nal theme concerns the utility of logical methods in organization theory. Although logical formalizations are not new to the literature on categories [63], they are hardly considered essential to the purpose of theory construction. Contrariwise, we argued that these meth- ods are especially important at this stage of theory development, as the eld is slowly but surely consolidating into a self-standing research domain [49]. Precisely at this moment, the task of formalizing divergent theories of categorization, assessing their compatibility, and evaluating their potential for future unication appears urgent. In this thesis, we laid the founda- tions of an epistemic-logical theory of categorization that builds on formal methods commonly used in information science [64–66]. The approach we 6 proposed is advantageous in that it naturally accounts for categories that do not possess a prototypical structure, provides a exible way to represent conceptual combinations [cf. 67, 68], and favors an ontological perspective on classication systems [3]. For obvious reasons, we believe this proposal addresses organization theorists’ call for an “ontological turn in categories research” [69]. By providing an order-theoretic view that elegantly captures hierarchical relations, our formalism also complements the increasingly popular approach based on conceptual spaces, which is rather more geared toward prototypes and distance. It is worth remarking once again that the cross-fertilization between logic and organization theory is not only generative for organization schol- ars but also for logicians, because the diculties inherent to the formal- ization of social phenomena [70–72] can lead to nontrivial mathematical results. This dissertation attests to such mutual benets: in fact, our ap- plication of Formal Concept Analysis to the study of categories in markets and its subsequent enrichment with additional relations to interpret modal operators contributed to research in logic by yielding a much more intu- itive characterization of the RS-semantics of lattice-based modal logic [40]. Moreover, our eort to free this lattice-based framework from the restric- tions imposed by the RS-conditions helped clarifying the link between RS-frames and more general Kripke-style structures. 194 Conclusions

6.3. Limitations and Further Research Some empirical and theoretical limitations arose in the course of our re- search that could not be resolved within the scope of this dissertation but nonetheless require further attention. To begin with, category emergence was considered in our logical formalizations but it was not studied empir- ically. To some extent, this is because the appearance of new categories tends to be a sparse and subtle phenomenon—indeed, classication sys- tems would not be very useful to decision-making if they were subject to frequent or sudden revision. Still, new categories regularly emerge and consolidate over time, however slowly, and this can complicate their em- pirical treatment in studies using archival data. Controlling for category properties in a time-variant fashion, as we did in our empirical chapters, partly addresses this problem by allowing researchers to account for subtle changes in category meanings, but it may not be enough when studying markets where the classication system itself can be emergent or “in ux” [73]. In these contexts, extracting categories from agents’ discourse [e.g., 74] can be more eective than relying on archival sources. A second empirical limitation is that some of the arguments presented 6 in the rst half of this thesis, such as the geometric derivation of products’ location in a feature space based on third-party categorization, concern cognitive mechanisms that are dicult to test directly with our data. In point of fact, auxiliary assumptions were required in Chapter3 to derive testable hypotheses. This is not necessarily a problem as long as the assumptions are reasonable in the chosen empirical setting; however, stronger evidence could be garnered for our arguments via experimental research designs. Testing cognate hypotheses in controlled experiments thus represents a promising next step. On the formal side, there are two main shortcoming to be addressed in future research: First, as mentioned before, our proposal to model the emergence of categories through social interaction necessitates the ad- dition of dynamic updates [44, 46]. In this regard, further extensions of our formalism are already underway. Second, we did not yet establish a systematic connection between the order-theoretic representation of cate- gories in FCA, which builds on the theory of lattices, and their geometric representation in conceptual spaces, which pivots on the notion of dis- tance. We believe these two approaches are complementary insofar as they emphasize dierent but non-mutually exclusive aspects of categorization. More than this, we suspect they can be unied into a single, comprehensive framework [cf. 75] using techniques from correspondence theory [76]. In fact, previous studies were successful in developing distance measures Limitations and Further Research 195 for concepts in FCA [64, 66]: the establishment of a connection, therefore, appears both possible and mathematically interesting. Another purely theoretical direction for further research would be to examine the implications of our formal theory with the aid of computa- tional models. For example, agent-based simulations [77] could be used to study how the attributes of classication systems, such as the number of categories they include or the number of hierarchical levels they comprise, depend on the number of objects, features, and agents, but also on the exactness of the agents’ perceptions, their degree of access to reality, and their level of consensus. For practical purposes, such simulations would allow us to analyze the eects on a classication system’s stability of, inter alia, various kinds of innovation, represented by the addition of new ob- jects, new features, or new combinations of features, and market growth, which can be represented via the addition of new agents. Finally, it could be helpful to explore the eects wrought by increases or decreases in the relative importance of particular groups of agents, who can be taken to represent inuencers or selectors [78]. Given our express interest in formal theory, and especially logic, a nal reection is in order with regard to the use of logical methods in organi- 6 zational research. As noted by many previous studies in sociology [e.g., 72, 79] and management [e.g., 80], the role of pure mathematics in this eld can be invaluable. However, the application and appreciation of logical tools requires training that is not normally available to social scientists. On the contrary, students of these disciplines are encouraged early on in their education to dismiss technical knowledge [81]: this is regrettable, not only because the abstraction enabled by logic contributes to the dis- cipline’s scientic rigor [82], but also because depriving social science of the vital iteration between (formal) theory and empirical inquiry can cause researchers to miss important opportunities for improvement. The contri- bution of this feedback loop to the generation of scientic knowledge was perhaps most eectively described by statistician George Box [83, p. 792]:

For the theory-practice iteration to work, the scientist must be, as it were, mentally ambidextrous; fascinated equally on the one hand by possible meanings, theories, and tentative models to be induced from data and the practical reality of the real world, and on the other with the factual implications deducible from tentative theories, models, and hypotheses. [...] Mathematics artfully employed can then enable him to derive the logical consequences of his tentative hypotheses and his 196 Conclusions

strategically selected environment will allow him to compare these consequences with practical reality.

Such continuous iteration is crucial to the production of theory that is “conceptually interesting, empirically generative, or practically successful” [82, p. 118]. In fact, it is precisely the discrepancy between what a theory predicts to be true and what empirical inquiry shows to be the case that fuels the development of better theory and more accurate tests. Moreover, having formalized theoretical arguments guide the design of empirical anal- yses helps ensuring that the evidence uncovered is relevant to the theory in question. In this dissertation, we oered a demonstration of this dual approach by suggesting a novel, unconventional application for advanced mathematical formalisms to the organizational research on categories. We hope that the “marriage” hereby proposed between logic and organization theory piques the curiosity of logicians and organization scholars alike.

6.4. References [1] L. W. Barsalou, Ideals, central tendency, and frequency of instantiation 6 as determinants of graded structure in categories, Journal of Experi- mental Psychology: Learning, Memory, and Cognition 11, 629 (1985). [2] E. H. Rosch, C. B. Mervis, W. D. Gray, D. M. Johnson, and P. Boyes-Braem, Basic objects in natural categories, Cognitive Psychology 8, 382 (1976). [3] B. Ganter and R. Wille, Applied lattice theory: Formal Concept Analy- sis, in General Lattice Theory, edited by G. Grätzer (Birkhäuser, Basel, Switzerland, 1997) pp. 591–606. [4] B. Ganter and R. Wille, Formal Concept Analysis: Mathematical Founda- tions (Springer, Berlin/Heidelberg, Germany, 1999). [5] B. A. Davey and H. A. Priestley, Introduction to Lattices and Order (Cam- bridge University Press, New York, NY, 2002). [6] R. E. Caves and M. E. Porter, From entry barriers to mobility barriers: Conjectural decisions and contrived deterrence to new competition, Quarterly Journal of Economics 91, 241 (1977). [7] R. Schmalensee, Entry deterrence in the ready-to-eat breakfast cereal industry, Bell Journal of Economics 9, 305 (1978). [8] R. W. Shaw, Product proliferation in characteristics space: The U.K. fertiliser industry, Journal of Industrial Economics 31, 69 (1982). [9] G. Bonanno, Location choice, product proliferation, and entry deter- rence, Review of Economic Studies 54, 37 (1987). [10] R. S. Raubitschek, A model of product proliferation with multiproduct rms, Journal of Industrial Economics 35, 269 (1987). References 197

[11] S. Bhatt, Strategic product choice in dierentiated markets, Journal of Industrial Economics 36, 207 (1987). [12] R. J. Gilbert and C. Matutes, Product line rivalry with brand dierentia- tion, Journal of Industrial Economics 41, 223 (1993). [13] B. L. Bayus and W. P. Putsis, Product proliferation: An empirical analysis of product line determinants and market outcomes, Marketing Science 18, 137 (1999). [14] A. Barroso and M. S. Giarratana, Product proliferation strategies and rm performance: The moderating role of product space complexity, Strategic Management Journal 34, 1435 (2013). [15] T. Wry and M. Lounsbury, Contextualizing the categorical imperative: Category linkages, technology focus, and resource acquisition in nan- otechnology entrepreneurship, Journal of Business Venturing 28, 117 (2013). [16] M. T. Hannan and J. Freeman, The population ecology of organizations, American Journal of Sociology 82, 929 (1977). [17] E. W. Zuckerman, The categorical imperative: Securities analysts and the illegitimacy discount, American Journal of Sociology 104, 1398 (1999). [18] M. T. Hannan, Partiality of memberships in categories and audiences, 6 Annual Review of Sociology 36, 159 (2010). [19] E. H. Rosch and C. B. Mervis, Family resemblances: Studies in the internal structure of categories, Cognitive Psychology 7, 573 (1975). [20] R. Durand and L. Paolella, Category stretching: Reorienting research on categories in strategy, entrepreneurship, and organization theory, Journal of Management Studies 50, 1100 (2013). [21] T. Wry, M. Lounsbury, and P. D. Jennings, Hybrid vigor: Securing ven- ture capital by spanning categories in nanotechnology, Academy of Management Journal 57, 1309 (2014). [22] N. Granqvist and T. Ritvala, Beyond prototypes: Drivers of market cat- egorization in functional foods and nanotechnology, Journal of Man- agement Studies 53, 210 (2016). [23] L. Paolella and R. Durand, Category spanning, evaluation, and perfor- mance: Revised theory and test on the corporate law market, Academy of Management Journal 59, 330 (2016). [24] B. H. Ross and G. L. Murphy, Food for thought: Cross-classication and category organization in a complex real-world domain, Cognitive Psychology 38, 495 (1999). [25] G. A. Akerlof, The market for “lemons:” Quality uncertainty and the market mechanism, Quarterly Journal of Economics 84, 488 (1970). [26] P. Nelson, Information and consumer behavior, Journal of Political 198 Conclusions

Economy 78, 311 (1970). [27] G. Hsu, M. T. Hannan, and Ö. Koçak, Multiple category memberships in markets: An integrative theory and two empirical tests, American Sociological Review 74, 150 (2009). [28] M. D. Leung and A. J. Sharkey, Out of sight, out of mind? Evidence of perceptual factors in the multiple-category discount, Organization Science 25, 171 (2013). [29] J. Merluzzi and D. J. Phillips, The specialist discount: Negative returns for MBAs with focused proles in investment banking, Administrative Science Quarterly 61, 87 (2016). [30] P. Gärdenfors, Conceptual Spaces: The Geometry of Thought (MIT Press, Cambridge, MA, 2000). [31] P. Gärdenfors, The Geometry of Meaning: Semantics Based on Concep- tual Spaces (MIT Press, Cambridge, MA, 2014). [32] E. G. Pontikes and M. T. Hannan, An ecology of social categories, Socio- logical Science 1, 311 (2014). [33] B. Kovács and M. T. Hannan, Conceptual spaces and the consequences of category spanning, Sociological Science 2, 252 (2015). 6 [34] A. Goldberg, M. T. Hannan, and B. Kovács, What does it mean to span cultural boundaries? Variety and atypicality in cultural consumption, American Sociological Review 81, 215 (2016). [35] M. T. Hannan, G. Le Mens, G. Hsu, B. Kovács, G. Negro, L. Pólos, E. G. Pontikes, and A. J. Sharkey, Concepts and Categories: Foundations for Sociological Analysis, unpublished manuscript (2017). [36] P. Gärdenfors and F. Zenker, Conceptual spaces at work, in Applications of Conceptual Spaces: The Case for Geometric Knowledge Representa- tion, edited by P. Gärdenfors and F. Zenker (Springer, Berlin/Heidelberg, Germany, 2015) pp. 3–13. [37] L. W. Barsalou, Deriving categories to achieve goals, in The Psychology of Learning and Motivation: Advances in Research and Theory, Vol. 27, edited by G. H. Bower (Academic Press, Cambridge, MA, 1991) pp. 1–64. [38] G. L. Murphy and D. L. Medin, The role of theories in conceptual coher- ence, Psychological Review 92, 289 (1985). [39] D. N. Osherson and E. E. Smith, On the adequacy of prototype theory as a theory of concepts, Cognition 9, 35 (1981). [40] M. Gehrke, Generalized Kripke frames, Studia Logica 84, 241 (2006). [41] G. Birkho, Lattice Theory (American Mathematical Society, New York, NY, 1940). [42] R. Fagin, J. Y. Halpern, Y. Moses, and M. Y. Vardi, Reasoning About Knowledge (MIT Press, Cambridge, MA, 2003). References 199

[43] Ö. Koçak, M. T. Hannan, and G. Hsu, Emergence of market orders: Audience interaction and vanguard inuence, Organization Studies 35, 765 (2014). [44] A. A. Kurz and A. Palmigiano, Epistemic updates on algebras, Logical Methods in Computer Science 9, 1 (2013). [45] M. Ma, A. Palmigiano, and M. Sadrzadeh, Algebraic semantics and model completeness for intuitionistic public announcement logic, Annals of Pure and Applied Logic 165, 963 (2014). [46] W. Conradie, S. Frittella, A. Palmigiano, and A. Tzimoulis, Probabilistic epistemic updates on algebras, in Logic, Rationality, and Interaction, Lecture Notes in Computer Science, Vol. 9394, edited by W. van der Hoek, W. H. Holliday, and W.-F. Wang (Springer, Berlin/Heidelberg, Germany, 2015) pp. 64–76. [47] M. T. Kennedy, Y.-C. Lo, and M. Lounsbury, Category currency: The changing value of conformity as a function of ongoing meaning con- struction, in Categories in Markets: Origins and Evolution, Research in the Sociology of Organizations, Vol. 31, edited by G. Hsu, G. Negro, and Ö. Koçak (Emerald Group, Bingley, United Kingdom, 2010) pp. 369–397. [48] R. Durand and M. Khaire, Where do market categories come from and 6 how? Distinguishing category creation from category emergence, Jour- nal of Management 43, 87 (2017). [49] J.-P. Vergne and T. Wry, Categorizing categorization research: Review, integration, and future directions, Journal of Management Studies 51, 56 (2014). [50] E. H. Rosch, Principles of categorization, in Cognition and Categoriza- tion, edited by E. H. Rosch and B. B. Lloyd (Lawrence Erlbaum Associates, Hillsdale, NJ, 1978) pp. 27–48. [51] M. T. Hannan, G. R. Carroll, and L. Pólos, The organizational niche, Sociological Theory 21, 309 (2003). [52] K. J. Lancaster, The economics of product variety: A survey, Marketing Science 9, 189 (1990). [53] J. R. Anderson, The adaptive nature of human categorization, Psycho- logical Review 98, 409 (1991). [54] E. Konovalova and G. Le Mens, Feature inference with uncertain cat- egorization: Re-assessing Anderson’s rational model, Psychonomic Bulletin & Review, in press (2017). [55] J. A. Hampton, Inheritance of attributes in natural concept conjunctions, Memory & Cognition 15, 55 (1987). [56] J. A. Evans and P. Aceves, Machine translation: Mining text for social theory, Annual Review of Sociology 42, 21 (2016). 200 Conclusions

[57] G. Cattani, J. F. Porac, and H. Thomas, Categories and competition, Strategic Management Journal 38, 64 (2017). [58] B. Loken and J. Ward, Alternative approaches to understanding the determinants of typicality, Journal of Consumer Research 17, 111 (1990). [59] M. Lounsbury and H. Rao, Sources of durability and change in market classications: A study of the reconstitution of product categories in the American mutual fund industry, 1944–1985, Social Forces 82, 969 (2004). [60] H. Rao, P. Monin, and R. Durand, Border crossing: Bricolage and the erosion of categorical boundaries in French gastronomy, American Sociological Review 70, 968 (2005). [61] E. Y. Rhee, J. Y. Lo, M. T. Kennedy, and P. C. Fiss, Things that last? Category creation, imprinting, and durability, in From Categories to Categorization: Studies in Sociology, Organizations, and Strategy at the Crossroads, Research in the Sociology of Organizations, Vol. 51, edited by R. Durand, N. Granqvist, and A. Tyllström (Emerald Group, Bingley, United Kingdom, 2017) pp. 295–325. [62] J. A. Rosa, J. F. Porac, J. Runser-Spanjol, and M. S. Saxon, Sociocognitive 6 dynamics in a product market, Journal of Marketing 63, 64 (1999). [63] M. T. Hannan, L. Pólos, and G. R. Carroll, Logics of Organization Theory: Audiences, Codes, and Ecologies (Princeton University Press, Princeton, NJ, 2007). [64] A. Formica, Ontology-based concept similarity in Formal Concept Anal- ysis, Information Sciences 176, 2624 (2006). [65] U. Priss, Formal Concept Analysis in information science, Annual Review of Information Science and Technology 40, 521 (2006). [66] A. Formica, Concept similarity in Formal Concept Analysis: An informa- tion content approach, Knowledge-Based Systems 21, 80 (2008). [67] L. E. Bourne, Typicality eects in logically dened categories, Memory & Cognition 10, 3 (1982). [68] J. Feldman, Minimization of Boolean complexity in human concept learning, Nature 407, 630 (2000). [69] M. T. Kennedy and P. C. Fiss, An ontological turn in categories research: From standards of legitimacy to evidence of actuality, Journal of Man- agement Studies 50, 1138 (2013). [70] L. Pólos and M. T. Hannan, Reasoning with partial knowledge, Socio- logical Methodology 32, 133 (2002). [71] L. Pólos and M. T. Hannan, A logic for theories in ux: A model-theoretic approach, Logique et Analyse 47, 85 (2004). [72] L. Pólos, M. T. Hannan, and G. Hsu, Modalities in sociological arguments, References 201

Journal of Mathematical Sociology 34, 201 (2010). [73] M. Ruef and K. Patterson, Credit and classication: The impact of indus- try boundaries in nineteenth-century America, Administrative Science Quarterly 54, 486 (2009). [74] A. Barroso, M. S. Giarratana, S. Reis, and O. J. Sorenson, Crowding, satiation, and saturation: The days of television series’ lives, Strategic Management Journal 37, 565 (2016). [75] D. Pavlovic, Quantitative Concept Analysis, https://arxiv.org/abs/1204. 5802 (2012), accessed June 18, 2017. [76] W. Conradie, S. Ghilardi, and A. Palmigiano, Unied correspondence, in Johan van Benthem on Logic and Information Dynamics, Outstanding Contributions to Logic, Vol. 5, edited by A. Baltag and S. Smets (Springer, Berlin/Heidelberg, Germany, 2014) pp. 933–975. [77] M. W. Macy and R. Willer, From factors to actors: Computational soci- ology and agent-based modeling, Annual Review of Sociology 28, 143 (2002). [78] N. M. Wijnberg and G. Gemser, Adding value to innovation: Impres- sionism and the transformation of the selection system in visual arts, Organization Science 11, 323 (2000). 6 [79] M. T. Hannan, On logical formalization of theories from organizational ecology, Sociological Methodology 27, 145 (1997). [80] R. Adner, L. Pólos, M. Ryall, and O. J. Sorenson, The case for formal theory, Academy of Management Review 34, 201 (2009). [81] S. M. Shugan, Save research—Abandon the case method of teaching, Marketing Science 25, 109 (2006). [82] K. Healy, Fuck nuance, Sociological Theory 35, 118 (2017). [83] G. E. P. Box, Science and statistics, Journal of the American Statistical Association 71, 791 (1976).

List of Figures

1.1 Distribution of references by subject category ...... 6

2.1 A two-dimensional product space populated by two rms . . 34 2.2 Spatial consequences of product proliferation ...... 38 2.3 New product introductions in each genre, 2004–2014 . . . . . 46 2.4 Map of Discogs genres computed via Kruskal’s NMDS . . . . . 47 2.5 Yearly number of releases by the major record companies . . 48 2.6 Yearly level of complexity for the ve most complex genres . 50 2.7 Yearly level of demand for the ve most popular genres . . . 51 2.8 Probability function of the dependent variable ...... 53

3.1 A four-dimensional space of binary features ...... 81 3.2 Distribution of the mean rating by number of ratings . . . . . 93 3.3 Frequencies of products in each genre ...... 94 3.4 Distribution of the number of moods by number of styles . . 97 3.5 Eect of spanning in both classication systems ...... 105

4.1 Database, RS-polarity, and perfect lattice ...... 138

203

List of Tables

2.1 Descriptive statistics ...... 54 2.2 Pairwise correlations matrix ...... 55 2.3 Quasi-Poisson model results: Controls ...... 56 2.4 Quasi-Poisson model results: Main eects and interaction . . 57 2.5 Quasi-Poisson model results: Additional analysis ...... 59

3.1 Top style and mood category labels by frequency ...... 95 3.2 Style and mood pairs by Jaccard similarity ...... 96 3.3 Descriptive statistics ...... 100 3.4 Pairwise correlations matrix ...... 101 3.5 Mixed model results: Controls ...... 102 3.6 Mixed model results: Main eects and interaction ...... 103

4.1 Satisfaction and co-satisfaction relations on Í ...... 132 4.2 Standard translation on RS-frames ...... 134

205

Index of Names

Aiken, Leona S., 104 Bugatti Automobiles S.A.S.,3 Akerlof, George A., 79 All Media Network, LLC, 90 Capitol Records, 43 All Music Guide, The, 90 Carroll, Glenn R., xxii AllMusic, 76, 89–93, 100, 102, 107 Caves, Richard E., 32, 60 Amazon.com, Inc., 90 Celestial Emporium of Benevolent Knowl- Ambient 1/Music for Airports, 88 edge,7–11 Apple, Inc., 90 Center for Computer Science in Or- Applied Logic Group, xx,7 ganization and Management, Aristotle, 155 xxii At the Close of a Century, 97 Clarivate Analytics,5 Automobile Dacia S.A.,3 Conradie, Willem, 129 Automobili Lamborghini S.p.A.,3 Deep Forest, 76 Barsalou, Lawrence W., 84, 108 Delft University of Technology, xx,7 Bayus, Barry L., 37, 57, 61 Discogs, 45, 47, 48, 91 Beach Boys, The, 97 Eaton, Jonathan, 30 Beatles, The, 97 EMI Group Limited, 43, 46, 47, 92 Belsley, David A., 54 Eminem, 90 Bertelsmann Music Group, 45, 92 Eno, Brian P. G., 88 Billboard, 43, 45, 46, 50, 51 European Economic Community, 61 Birkho, Garrett, 15, 123, 127, 157, 158, 190 Fagin, Ronald, 173 Black Keys, The,44 Federal Trade Commission, 30 Boheme, 76 Fisons p.l.c., 30 Boole, George, 127, 129, 130, 133 Foucault, Michel,8, 10 Borges, Jorge Luis,7 Future of Music Coalition,44 Bourdieu, Pierre,3 Bowie at the Beeb: The Best of the Galois, Évariste, xii, xvi, 123, 126, 127, BBC Radio Sessions, 1968– 131–133, 142, 157, 158, 164–166, 1972, 97 169, 170, 175 Bowie, David, 97 Ganter, Bernhard, xx, xxiii, 15, 123, 157 Box, George E. P., 195 Gehrke, Mai, 123, 127, 128 Brander, James A., 30 General Foods Corporation, 30

207 208 Index of Names

General Mills, Inc., 30 Napster, 90 Greatest Hits, 1970–2002, The, 97 Netherlands Organization for Scien- Gärdenfors, Peter, xix, xx, xxii, 15, 157 tic Research, xxi Nielsen Holdings PLC, 45 Halpern, Joseph Y., 173 Hannan, Michael T., xxii, xxiii, 14, 122, One, 97 137 Hotelling, Harold, 33 Palmigiano, Alessandra, xxi, 129 Hsu, Greta, 137 Pandora Internet Radio, 90 Pavlovic, Dusko, xxiii Imperial Chemical Industries, 30 Poisson, Siméon D., 52, 54, 56–59, 188 iTunes, 90 PolyGram Group, 92 Porter, Michael E., 32, 60 Jaccard, Paul, 94, 96 Putsis, William P., Jr., 37, 57, 61 Jimi Hendrix Experience, The, 97 Péli, Gábor L., 122 John, Elton H., 97 Pólos, László, xxii, 122 Journal Citation Reports,5 Quaker Oats Company, The, 30 Kamps, Jaap, 122 Karshner, Roger, 43 R. G. Dun & Company, 77 Kellogg Company, 30 Rae, Casey,44 Kennedy, Robert E., 50 Rosch, Eleanor H.,8, 155 Kripke, Saul A., xii, xvii,5, 16, 122, 123, Rovi Corporation, 90 125, 129, 134, 135, 139, 154, 160, Royal Dutch Shell p.l.c., 30 180, 193 Rumelt, Richard P., 11 Kruskal, Joseph B., Jr., 46, 47 Kuh, Edwin, 54 Sahlqvist, Henrik, 122, 123 Schmalensee, Richard, 30, 32 Lancaster, Kelvin J., xi, xv, 33, 36, 40, Shannon, Claude, 49 78 Shaw, R. W., 30 Lind, Jo T., 103, 104 Shazam, 90 Lindembaum, Adolf, 172 Simon, Herbert A., 78 Slacker Radio, 90 Maserati S.p.A.,3 Slim Shady LP, The, 90 Medin, Douglas L., 156 Smith, J. David, 89 Mehlum, Halvor, 103, 104 Sony BMG Music Entertainment, 43 Melara, Robert J., 89 Sony Music Entertainment, 43, 45, 92 Microsoft Corporation, 90 Sounds of Summer: The Very Best of Morrissey, 76 the Beach Boys, 97 Moses, Yoram, 173 Spotify, 90 MusicBrainz, 45, 91 Suzuki Motor Corporation,3 Index of Names 209

Tarski, Alfred, 172 Thompson, James D., 122

United States Patent and Trademark Oce, 77 Universal Music Group, 43, 92 University of Amsterdam, xxii

Vardi, Moshe, 173

Warner Music Group Corp., 43, 92 Welsh, Roy E., 54 West, Stephen G., 104 White Stripes, The,44 Wijnberg, Nachoem M., xxi Wille, Rudolf, xx, xxiii, 15, 123, 157 Winterland, 97 Wonder, Stevie, 97

You Are the Quarry, 76

Zuckerman, Ezra W.,4, 189

Acknowledgements

inishing a Ph.D. can be a lonely and grueling process, but I was lucky to embark on this journey under the wing of two supportive advisors. F Over the past four years, Alessandra and Nachoem have been friends to me as well as mentors, and they have complemented one another to perfection in helping me become an independent researcher. I am grateful to Alessandra for being an endless source of inspiration to me, not only because she is supremely competent, but also because she is implacable in the face of adversity. Prone as I am to having lapses of faith, I may never have completed my dissertation without her sterling example. I am grateful to Nachoem, instead, for being a tirelessly patient teacher. He accorded me a lot of freedom in the pursuit of my research interests, sometimes against his own judgment, yet he always remained available to help me deal with setbacks. I am thankful for his pragmatism, intelligence, and dark sense of humor; most of all, however, I am indebted to Nachoem for his relentless criticism toward my writing and lack of tolerance for poor argumentation. Entering his oce to talk about my research often felt like taking a stroll into a mineeld, but each time I walked away a slightly better thinker. I know clarity is not an easy quality to pass on, and I am extremely proud to have been Nachoem’s apprentice. I extend my deepest gratitude to all the members of the Applied Logic Group at the Delft University of Technology. Apostolos, Fan, Fei, Giuseppe, Sabine, Umberto, and Zhiguang: thank you for introducing me to the beauty of logic. Your patience paid o—before joining you I was barely aware of what formal logic looked like and now I nd it puzzling that the whole realm of social science does not revolve around it. Thank you also for giving me the chance to present some of our work in Guangzhou. I was terried, but I appreciated the challenge. On this note, I would like to thank Minghui Ma for accepting to be on my committee and for being an impeccable host at the Institute of Logic and Cognition of Sun Yat-sen University. Speaking of exotic places, a sizable chunk of this dissertation was written on the terrace of a beach house in KwaZulu-Natal, South Africa, a venue we have grown accustomed to calling the Leisure Bay Institute for Advanced Study. I was fortunate to have this opportunity: not all Ph.D. students get to write up their research while listening to the Indian Ocean and watching

211 212 Acknowledgements monkeys scurry around the lawn. I am forever indebted to Dion and Willem for making this possible, and for oering me shelter in Johannesburg during the dreadful Dutch winters. They are among the kindest, most welcoming people that I have ever met, and I am sincerely honored by their friendship. I am grateful that Willem could make the trip in the reverse direction in order to join my doctoral committee. I would also like to express my gratitude to Andrew Craig, Drew Moshier, Peter Jipsen, and all the other colleagues who partook in our South African adventures, sharing knowledge, jokes, recipes, and trips to national parks. So long, and thanks for all the braai. During my studies, I have had the privilege of meeting many scholars in my eld whom I admire, and whose work I nd outstanding. Most of them were kind enough to help me with their comments, but some went out of their way to be encouraging and point out areas of improvement. In this regard, I am especially grateful to Giacomo Negro, whose research on categories shaped my way of reasoning about this topic and inspired many of the arguments presented in this dissertation. I would like him to know that, in addition to his papers, I admire his degree of self-control, of which he oered demonstration when I met him for the rst time and immediately proceeded to interrogate him. He traveled all the way to Amsterdam in the middle of his holidays only to be met with an endless barrage of questions, but he withstood this ordeal with remarkable poise. For his monumental contribution to the eld and approachable nature, I am very grateful to Mike Hannan, who allowed me to visit him at Stanford and agreed to serve on my committee. Having him on board was perhaps the most rewarding moment of my Ph.D., which, in all fairness, did not really take o until I started reading Mike’s publications. The past four years have taught me that, in academia, good researchers write good papers and tend to be acknowledged for it. The most inventive craft pioneering research programs and, over time, they build a following. Only the best and brightest, however, ever manage to create a school of thought, and this is usually the achievement of their lifetime. Mike is the peculiar kind of scholar who creates a school of thought, outgrows it, and moves on to create another. I suspect that few researchers of this caliber are alive at any given time and I am humbled to have my thesis reviewed by one of them. Many others deserve credit for being kind to me and oering insightful feedback at dierent stages of my research, including Alex van Venrooij, Andrej Srakar, Balázs Kovács, Cameron Verhaal, Christophe van den Bulte, Filippo Wezel, Gaël Le Mens, Greta Hsu, Jerker Denrell, László Pólos, Marco Giarratana, Marilena Vecco, Min Liu, Ming Leung, Martina Montauti, Olav Sorenson, Özgecan Koçak, Stoyan Sgourev, and a number of anonymous Acknowledgements 213 reviewers and unknown conference attendants. I am thankful for the op- portunity I had to learn from such an impressive bunch of people and I sincerely hope to have done justice to their eorts. While working at TU Delft, I had the pleasure of sharing my workplace with delightful colleagues at the Faculty of Technology, Policy, and Manage- ment. Behnam, Diana, Filippo, Georgy, Gert-Jan, Helle, Jelle, Jonas, Maarten, Mark, Nathalie, Peter, Taylor, and Udo: thank you for being such amiable coworkers. Special credit is due to Jan, Philip, Shannon, and Zoë for making me feel welcome since day one. Moreover, I would like to express my utmost gratitude to Jeroen van den Hoven and Neelke Doorn for accepting to be part of my committee, and to Ibo van de Poel for acting as promotor. I hope to have provided for an interesting if unusual read. Having spent a considerable amount of time informally at the University of Amsterdam, I am also grateful to many current or former members of the Entrepreneurship and Innovation section of the Amsterdam Business School. Alex, Balazs, Bram, Frederik, Ieva, Jan, Joris, Monika, Rens, and Yuval: thank you for all the hard questions, the engaging discussions, and the drinks in between. I look forward to working with you in the years to come. Very special thanks are due to Angelo, my “partner in crime” (as he was called once by an Assistant Professor of Accounting), with whom I shared many a moment of uncertainty about our future in academia. On a nal and more personal note, I would like to thank Marcello and Valerio, my lifelong friends and fellow academics, for being a most trusted inner circle. I could not ask for better paranymphs on the day of my defense. I am also deeply thankful for Chiara, who graces me with her support despite the innumerable aws of my character. She is a truly miraculous bundle of brilliance and charm. Further, I am grateful for the support of my parents Anna and Luciano, my brother Saverio, and all the members of my extended family. Finally, I would like to acknowledge the subtle but decisive contribution to this dissertation of my brainstorming companion Monte Carlo, a distinctively aectionate specimen of Pantherophis emoryi who is now terrorizing mice in heaven. His stochastic swirls served as a reminder of all the things in life and research I nd hard to predict.

Curriculum Vitæ

Michele Piazzai

25-12-1988 Born in Avezzano, Italy

Education 2002–2007 Scientic Lyceum

2007–2011 B.A. summa cum laude, Humanities Sierra Nevada College

2011–2012 M.A. cum laude, Cultural Economics and Entrepreneurship Erasmus University Rotterdam

2012 International Summer School Chinese University of Hong Kong

2014 Erasmus Winter Programme Netherlands Institute for Health Sciences

2015 Medici Summer School in Management Studies Bologna Business School, HEC Paris, MIT Sloan

2013–2018 Ph.D., Organization Theory Delft University of Technology

2017– Data Science Specialization Utrecht University

Current Position 2017– Assistant Professor Amsterdam Business School

215

List of Publications

7. M. Piazzai and N. M. Wijnberg, Diversication, proliferation, and rm performance in the U.S. music industry, Academy of Management Pro- ceedings: Best Papers, 16528 (2017). https://doi.org/10.5465/AMBPP. 2017.29 6. W. Conradie, S. Frittella, A. Palmigiano, M. Piazzai, A. Tzimoulis, and N. M. Wijnberg, Toward an epistemic-logical theory of categorization, in Theoretical Aspects of Rationality and Knowledge, Electronic Pro- ceedings in Theoretical Computer Science, Vol. 251, edited by J. Lang (Open Publishing Association, Waterloo, Australia, 2017) pp. 167–186. http://doi.org/10.4204/EPTCS.251.12 5. M. Vecco, A. Srakar, and M. Piazzai, Visitor attitudes toward deacces- sioning in Italian public museums: An econometric analysis, Poetics 63, 33 (2017). http://doi.org/10.1016/j.poetic.2017.05.001 4. M. Kackovic, M. Piazzai, and N. M. Wijnberg, Crossing the threshold and exiting nascency: Antecedents to gaining full-edged legitimacy, Academy of Management Proceedings, 13598 (2016). http://doi.org/10. 5465/AMBPP.2016.13598abstract 3. M. Piazzai and N. M. Wijnberg, Feature-based vs. goal-based categories: The eects of spanning in both systems simultaneously, Academy of Management Proceedings, 12482 (2016). http://doi.org/10.5465/ AMBPP.2016.12482abstract 2. W. Conradie, S. Frittella, A. Palmigiano, M. Piazzai, A. Tzimoulis, and N. M. Wijnberg, Categories: How I learned to stop worrying and love two sorts, in Logic, Language, Information, and Computation, Lecture Notes in Computer Science, Vol. 9803, edited by J. Väänänen, Å. Hirvo- nen, and R. J. G. B. de Queiroz (Springer, Berlin/Heidelberg, Germany, 2016) pp. 45–164. http://doi.org/10.1007/978-3-662-52921-8_10 1. M. Vecco and M. Piazzai, Deaccessioning of museum collections: What do we know and where do we stand in Europe? Journal of Cultural Heritage 15, 221 (2015). http://doi.org/10.1016/j.culher.2014.03.007

217