Identifying Fashion Trendsetters in Online Social Networks by Advanced Analytics
Rechts- und Wirtschaftswissenschaftliche Fakultät Fachbereich Wirtschafts- und Sozialwissenschaften
Friedrich-Alexander-Universität Erlangen-Nürnberg
for the award of the doctoral degree Dr. rer. pol.
submitted by Martina Wenzel, M.Sc., from Roth
Approved as a dissertation by the Rechts- und Wirtschaftswissenschaftliche Fakultät / the Fachbereich Wirtschafts- und Sozialwissenschaften of Friedrich-Alexander-Universität Erlangen-Nürnberg
Date of the oral examination: 22.06.2021
Chair of the doctoral committee: Prof. Dr. Klaus Henselmann
Reviewers: Prof. Dr. Freimut Bodendorf
Prof. Dr. Andreas Fürst
Abstract
The fashion industry operates in a highly competitive market in which consumers have increasing power over the creation and diffusion of fashion trends, driven by the widespread use of online social networking platforms. This forces fashion companies to adapt their methods of trend prediction to meet consumer needs and preferences and to stay competitive. Social networking platforms provide an instrument for sharing ideas and opinions, which allows users to influence others in their behavior and thereby influence the development of trends. The content published on these platforms is thus a rich data source for the fashion industry, containing information about changing consumer needs and upcoming trends. Fashion companies, however, struggle to benefit from this social media data for trend prediction purposes, as they lack knowledge about the trend-relevant users who publish content that includes information about future trends. This research addresses the challenge of profiting from this valuable data source for trend prediction, especially in the highly competitive fashion industry. It argues that trends are created and diffused by trendsetters and that the content shared by these trendsetters in online social networks includes information that enables early trend detection. The study therefore seeks to identify trendsetters based on the digital trace they leave on online social networking platforms and addresses the question of how fashion trendsetters in online social networks can be identified automatically based on social media data. To achieve this goal, a feature framework is created based on a literature review and expert interviews, which enables the measurement of characteristics of trend-relevant roles based on social media data.
Next, a two-step approach is developed which first extracts a topic-relevant sample of users (community) from a large online social network and then identifies the online trendsetters within this sample using a supervised machine learning approach. For its development, a prototypical data analysis is carried out based on publicly accessible data from the online social networking platform Instagram. The resulting methodology for the identification of online trendsetters related to a specific topic area consists of a topic-focused community detection approach and a classification model. The analysis of the features relevant to the model's class decision further reveals insights into online trendsetters' characteristics in online social networks. The evaluation of the developed methodology shows its transferability to other use cases and validates the trend prediction potential of the identified online trendsetters.
The results of this thesis contribute to research and practice. The insights gained about online trendsetters' behavioral patterns, characteristics, and the relevant features for their detection in online social networks expand the knowledge about online trendsetters related to the fashion industry and thus contribute to the area of trend research and the recently emerging field of fashion informatics. In addition, companies can use the insights to identify appropriate marketing partners to influence trends. Furthermore, the developed methodology supports fashion companies by providing a new data source to increase trend prediction quality and facilitates the identification of changing consumer needs and preferences. [1]
[1] For a German version of this abstract, please refer to Appendix A.1.
Table of contents
List of figures
List of tables
List of abbreviations
1 Introduction
1.1 Motivation and research gap
1.2 Objectives and research questions
1.3 Research design
1.4 Structure of the thesis
2 Theoretical background and conceptual foundations
2.1 Trend diffusion
2.1.1 Terms and definition
2.1.2 Trend diffusion theories
2.1.3 Trendsetters
2.1.4 Trend diffusion in the fashion industry
2.2 Online social networks
2.2.1 Online social networking platforms
2.2.2 Users and communities
2.3 Advanced social media analytics
2.3.1 Process
2.3.2 Text analytics
2.3.3 Social network analytics
2.3.4 Supervised machine learning
2.4 Interim conclusion
3 Related work
3.1 Overview
3.2 Trend detection on social media platforms
3.3 Community detection
3.3.1 Process initialization
3.3.2 Detection methods and considered data
3.3.3 Detection process
3.4 Identification of influential users on social media platforms
3.4.1 Characteristics and measurement
3.4.2 Approaches to detect influential spreaders
3.4.3 Classification approaches
3.5 Validation of insights by experts from the fashion industry
3.5.1 Method
3.5.2 Data collection and analysis
3.5.3 Findings
3.6 Interim conclusion
4 Conceptual framework to identify online trendsetters
4.1 Two-step approach
4.2 Topic-focused community detection
4.2.1 Detection metrics
4.2.2 Detection mechanisms
4.2.3 Data layers
4.2.4 Iterative process
4.3 Trendsetter classification model
4.3.1 Labeling concept
4.3.2 Feature framework
4.4 Interim conclusion
5 Identification of online trendsetters by advanced analytics
5.1 Use case description
5.1.1 Sneaker trends
5.1.2 Instagram
5.2 Topic-focused community detection
5.2.1 Methods
5.2.2 Process initialization and iterations
5.2.3 Results and evaluation
5.3 Trendsetter classification model
5.3.1 Methods
5.3.2 Exploratory data analysis and data pre-processing
5.3.3 Development and selection of models
5.3.4 Comparison of models
5.3.5 Relevant features for online trendsetter identification
5.4 Summary
6 Model application
6.1 Use case description
6.2 Application of trendsetter identification methodology
6.3 Trend prediction capability of online trendsetters
6.4 Application areas and transferability
7 Conclusion
7.1 Summary
7.2 Contribution to theory and practice
7.3 Limitations and implications for future research
References
Appendix
A.1 Abstract (German version)
A.2 Features for the measurement of opinion leaders’ characteristics
A.3 Interview guide
A.4 List of features for the identification of online trendsetters
A.5 List of hyperparameters’ search space and final model settings
A.6 SHAP dependence plots
List of figures
Figure 1-1 Design Science Research process model
Figure 1-2 Research design
Figure 2-1 Fashion lifecycle
Figure 2-2 Fashion lifecycle related to Rogers’ adopter categories
Figure 2-3 Fashion trendsetters’ characteristics
Figure 2-4 Fashion trend creation and diffusion process
Figure 2-5 Main components of online social networking platforms and related data
Figure 2-6 SMA process
Figure 2-7 Data from online social networking platforms and applied SMA methods
Figure 2-8 Application of text analytics within the thesis
Figure 2-9 Application of social network analytics within the thesis
Figure 2-10 Steps of building a predictive model
Figure 3-1 Summary of results – literature review on community detection
Figure 3-2 Summary of results – literature review on online opinion leader detection
Figure 3-3 Social media influencer identification process
Figure 4-1 Two-step OTS identification approach
Figure 4-2 Overview community detection process
Figure 4-3 Pseudocode text score
Figure 4-4 Pseudocode of consensus algorithm
Figure 4-5 Community detection process chart
Figure 4-6 Applied supervised machine learning process
Figure 4-7 Labeling concept based on Rogers’ diffusion model
Figure 4-8 Labeling process
Figure 4-9 Identification of trends – price evolution
Figure 4-10 Feature framework
Figure 4-11 TSIM concept overview
Figure 5-1 Community detection – required data and applied SMA methods
Figure 5-2 Matrix decomposition and notations of NMF algorithm
Figure 5-3 Score value calculation, example: CPP
Figure 5-4 Evolution of user composition
Figure 5-5 Community statistics – sneaker community
Figure 5-6 Comparison of centrality measures in the initial and final iteration
Figure 5-7 Topic models of all community members’ posting texts (sneaker)
Figure 5-8 Feature extraction – required data and applied SMA methods
Figure 5-9 Model development – applied methods and implementation
Figure 5-10 Data splitting – training, validation and test split
Figure 5-11 5-fold cross-validation
Figure 5-12 Applied feature selection methods
Figure 5-13 Confusion matrix
Figure 5-14 Community evolution – OTS and non-OTS (sneaker)
Figure 5-15 Visualization of value ranges of the features avg. time between posts, followed by, distinct emoji in bio
Figure 5-16 Visualization of PCA analysis
Figure 5-17 Process of model training, validation and evaluation
Figure 5-18 Overview experimental setup
Figure 5-19 Feature importance – SHAP summary plot
Figure 5-20 SHAP dependence plot – no. of tags
Figure 5-21 Value distribution – no. of (distinct) tags
Figure 5-22 SHAP dependence plot – ratio follows-followers
Figure 5-23 Value distribution – ratio follows-followers
Figure 5-24 Value distribution – no. of hashtags and posts
Figure 5-25 SHAP dependence plot – no. of distinct hashtags
Figure 5-26 Value distribution – no. of distinct hashtags and avg. no. of videos
Figure 5-27 SHAP dependence plot – avg. no. of videos
Figure 5-28 SHAP dependence plot – ratio hashtags bio
Figure 6-1 Topic models of all community members’ posting texts (sustainability)
Figure 6-2 Community evolution – OTS and non-OTS (sustainability)
Figure 6-3 Steps to assess the trend prediction potential of identified OTSs
Figure 6-4 Topic evolution in OTSs’ postings (2016-2019)
Figure 6-5 Example – evolution trend hashtag popularity
Figure 6-6 Example – popularity evolution of reuse
Figure 6-7 Example – popularity evolution of reuse after differencing
List of tables
Table 1-1 Research questions
Table 2-1 Social networking platform – components, functions and information
Table 3-1 Feature examples for the measurement of opinion leaders’ characteristics
Table 3-2 Overview of interviewees
Table 3-3 Most mentioned criteria of social media influencer selection
Table 4-1 Filter criteria
Table 4-2 Scoring criteria
Table 4-3 Overview different data layers
Table 4-4 Examples of user-related features
Table 4-5 Examples of content-related features
Table 4-6 Examples of context-related features
Table 4-7 Examples of network-related features
Table 5-1 Instagram functions, related data, and data access
Table 5-2 Relevant concepts of applied methods and use case related examples
Table 5-3 Use case-specific settings – variables and values (sneaker)
Table 5-4 Relevant machine learning terminology
Table 5-5 Performance measures
Table 5-6 Comparison OTS and non-OTS (sneaker community)
Table 5-7 Value ranges of the features avg. time between posts, followed by, distinct emoji in bio
Table 5-8 Performance results of the best model of each classifier
Table 5-9 Important features for OTS detection according to the SHAP value
Table 5-10 Summary of results – feature importance analysis
Table 6-1 Use case-specific settings – variables and values (sustainability)
Table 6-2 Comparison OTS and non-OTS (sustainability community)
Table 6-3 Consumer trends related to the sustainability megatrend in 2020
Table 6-4 Granger causality test results
Table 6-5 Comparison of OTSs and users with the highest reach (Granger causality)
Table 6-6 Transferability of approach to other social media platforms
List of abbreviations
API Application Programming Interfaces
Avg. Average
BOW Bag-Of-Words
BS Biography Score
CPP Comments per Post
FFR Followers-follows ratio
FN False negative
FP False positive
HTML Hypertext Markup Language
IE Information Extraction
IS Information Systems
LDA Latent Dirichlet Allocation
LPP Likes per Post
MDA Mean Decrease Accuracy
NLP Natural Language Processing
NMF Non-negative Matrix Factorization
No. Number
OSN Online Social Network
OTS Online Trendsetter
PCA Principal Component Analysis
POS Part-of-Speech
PPD Number of Posts per Day
RQ Research Question
RSS Rich Site Summary
SD Standard Deviation
SHAP SHapley Additive exPlanations
SMA Social Media Analytics
SMOTE Synthetic Minority Oversampling Technique
SNA Social Network Analytics
SVM Support Vector Machine
TF-IDF Term Frequency-Inverse Document Frequency
TN True negative
TP True positive
TS Text Score
TSIM Trendsetter Identification Methodology
UGC User-Generated Content
URL Uniform Resource Locator
VSM Vector Space Models
1 Introduction
1.1 Motivation and research gap
Digitalization and increasing interconnectedness, driven by the emergence of the internet and especially Web 2.0, have influenced and changed many areas of society in recent years. The fashion industry in particular is changing, as consumers now steadily ask for new styles and products (McNeill and Moore, 2015). To meet this demand, the fashion business is moving from traditional bi-seasonal fashion collections towards so-called fast fashion with constantly changing, new intermediate collections (Bhardwaj and Fairhurst, 2010; Kim et al., 2011). This transition leads to shorter product life cycles with a decreasing timespan between the release of a new design and its consumption (Kim et al., 2011), and forces companies to generate new ideas and innovative products faster than ever (Bhardwaj and Fairhurst, 2010). Besides this new pace, the growing usage of online social networking platforms increases the power of consumers regarding trend creation and diffusion (Dillon, 2012; Chang et al., 2015), as such platforms support the generation and exchange of content across borders (Kaplan and Haenlein, 2010). Internet users transform from mere consumers into active creators of this so-called user-generated content (Beheshti-Kashi et al., 2015; Schiele et al., 2008), and at the same time, these online social networks (OSNs) take a key role in information diffusion (Guille et al., 2013). Some of this content becomes popular and even contributes to new trends (Guille et al., 2013). Through OSNs, ordinary people have a platform to become visible to anyone online, share their ideas, opinions and preferences, reach a mass audience, and influence others in their decisions. This shows that “trendsetting” is no longer restricted to traditional actors such as designers, fashion companies, forecasting agencies and fashion magazines; trends are increasingly influenced by active users of social media platforms (Jackson, 2007; Manikonda et al., 2016).
Some of these social media users are highly interconnected and perceived by their community as experts in a specific field. They influence the attitudes and behavior of their audience through the content they spread via social media (Freberg et al., 2011). Therefore, this published content is a valuable data source for trend detection (Abdullah and Wu, 2011) and can serve as a source of inspiration for new product development and marketing campaigns (Tsur and Rappoport, 2012). As the ability to meet consumers’ preferences and the speed of responsiveness strongly influence the profitability and competitiveness of fashion retailers nowadays (Bhardwaj and Fairhurst, 2010), the content published by these new trend-relevant users in OSNs, in the following called online trendsetters (OTSs), can help fashion companies to stay competitive. In this thesis, OTSs are defined as users of online social networking platforms who adopt and diffuse new ideas before these ideas become popular (Rogers, 2003; Saez-Trumper et al., 2012). This definition is based on Rogers’ innovation diffusion theory and is adapted to the online space. OTSs can be traditional trend actors who are active in OSNs or the above-mentioned ordinary people who become important regarding trends due to their activity and position in OSNs.
Fashion companies have already recognized the relevance of OSNs in the context of trend creation and diffusion. This is underlined by the increasing spending on social media initiatives and related functions within the companies (Roberts and Piller, 2016). Thus, fashion companies nowadays use OSNs in different ways to support product creation and to influence trends. In this context, social media influencers and the related field of influencer marketing have gained more and more attention during the last decade (Audrezet et al., 2020). Social media influencers are often referred to as opinion leaders on social media platforms who engage in electronic word-of-mouth to spread information (Lou and Yuan, 2019). Especially in the OSN Instagram, they are successful message spreaders who have an impact on creating and diffusing new trends and who push sales (Jin et al., 2019). Therefore, marketing departments try to identify these influential users and collaborate with them to shape their brand image, push products in the market, and influence fashion trends in a specific domain or area (Jin et al., 2019). More recently, fashion companies have also started to co-create new products with social media influencers to benefit from their knowledge about consumer preferences and their closeness to the target group (Ahmad et al., 2015).
The challenge for practitioners who want to profit from the value of OSNs for their business is to identify the relevant users and data regarding new trends within this huge and noisy data source. There are 3.5 billion active social media users worldwide, which is 42% of the total global population (Hootsuite & We are social, 2019). Instagram, which is one of the most relevant OSNs related to fashion today (Casaló et al., 2020; Phua et al., 2017), has one billion monthly active users, 95 million daily postings and 4.2 billion likes per day (Hootsuite & We are social, 2019). Given this huge data volume, it is challenging to manually find the right fraction of users and the respective data with relevant information about upcoming trends. There is also no clear understanding of which users have the potential to create and influence the development of new trends. The selection of online opinion leaders for marketing cooperation, for instance, is mainly based on a limited number of quantitative measures such as the number of followers, and mostly relies on the assumption that the larger the audience of a user, the larger her or his impact on the network (De Veirman et al., 2017; Rakoczy et al., 2018). Although it is known from offline studies that trend-relevant opinion leaders are perceived as trustworthy, likable, and having expertise in a specific field (Rogers, 2003; Lazarsfeld et al., 1944; Chan and Misra, 1990), such qualitative criteria are neglected by existing automated selection approaches (Freberg et al., 2010; De Veirman et al., 2017). One reason for this is the missing knowledge about how to measure such characteristics based on the data provided by online social networking platforms. Therefore, companies mostly consider users with a huge network as trend-relevant users, although studies have shown that users with a small community often have more impact on their followers (Kay et al., 2020).
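To illustrate why a selection by follower count alone can be misleading, the following minimal Python sketch compares a ranking by reach with a ranking by a simple per-follower engagement proxy. The metric names LPP, CPP and FFR follow the list of abbreviations; their exact formulas, the additional `engagement_rate` proxy, and all account data are assumptions made for illustration only and are not taken from the analysis in this thesis.

```python
from dataclasses import dataclass


@dataclass
class Account:
    """Hypothetical account summary; all fields are illustrative."""
    name: str
    followers: int
    follows: int
    posts: int
    likes: int     # total likes over all posts
    comments: int  # total comments over all posts


def likes_per_post(a: Account) -> float:
    # Assumed definition of LPP: total likes divided by number of posts.
    return a.likes / a.posts if a.posts else 0.0


def comments_per_post(a: Account) -> float:
    # Assumed definition of CPP: total comments divided by number of posts.
    return a.comments / a.posts if a.posts else 0.0


def followers_follows_ratio(a: Account) -> float:
    # Assumed definition of FFR: followers divided by followed accounts.
    return a.followers / a.follows if a.follows else float("inf")


def engagement_rate(a: Account) -> float:
    # Illustrative proxy (not from the thesis): likes per post per follower.
    return likes_per_post(a) / a.followers if a.followers else 0.0


# Made-up example data.
accounts = [
    Account("large", 1_000_000, 500, 100, 2_000_000, 20_000),
    Account("small", 5_000, 400, 100, 50_000, 5_000),
]

by_reach = sorted(accounts, key=lambda a: a.followers, reverse=True)
by_engagement = sorted(accounts, key=engagement_rate, reverse=True)

print([a.name for a in by_reach])       # ranking by follower count
print([a.name for a in by_engagement])  # ranking by engagement proxy
```

In this made-up example the two rankings disagree: the account with the larger audience leads on reach, while the smaller account leads on the per-follower proxy. Whether such a proxy captures actual influence on followers is an open question that a single metric cannot answer.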
From an academic perspective, a large body of research already deals with trend detection and the identification of influential spreaders (Guille et al., 2013). Studies that focus on trend-relevant users with regard to early adoption in a fashion context, however, are rare (e.g., Bakshy et al., 2009; Saez-Trumper et al., 2012; Cervellini et al., 2016). Cervellini et al. (2016), for instance, test an algorithm that combines a network topology approach with a temporal analysis to identify trendsetters in the OSN Yelp. Like most of the research in this field, this algorithm identifies trendsetters related to a specific trend by considering the time of adoption (e.g., Cervellini et al., 2016; Saez-Trumper et al., 2012). These studies do not investigate specific behavioral patterns or characteristics of the trendsetter group based on an extensive analysis of their digital trace in OSNs. Detecting trendsetters by recognizing such “trendsetter-patterns” in social media data from OSNs, however, allows their identification without knowledge about an already existing trend and therefore enables the detection of early signals before they become a trend. Although there is a growing body of studies that analyze different types of opinion leaders and influential spreaders in OSNs, little research focuses on fashion-related actors. Only a small number of studies apply data-intensive computational approaches within the field of fashion (e.g., Chen and Luo, 2017; Park et al., 2016; Lin et al., 2015; Lin et al., 2014). In recent years, the notion of fashion informatics has emerged, which describes the usage of computational methods in the context of the fashion industry to profit from rich data sources such as social media data. At the same time, there is a call for more research in this field (Copeland et al., 2019; Zhao and Min, 2019).
Zhao and Min (2019) underline the importance of social media data in the field of fashion research, as this industry has become a social-media-driven one. They highlight the value of social media data for fashion research, especially in terms of actuality, accessibility, volume and low cost. Lee et al. (2017) furthermore emphasize the importance of identifying “key personnels that generate significant changes (e.g. trend-setting)” in a fashion-related network as a baseline to improve fashion trend prediction (Lee et al., 2017, p. 4). As recent studies that investigate the detection of emerging trends and fads do not consider the role of trendsetters in OSNs, they call for future research focusing on fashion trendsetters, especially their online actions and motivations (Lee et al., 2017).
To sum up, the identified gaps in research and practice are the following:
(1) Lack of knowledge about OTSs and their behavioral patterns and personality traits based on data from OSNs related to the fashion industry
(2) Missing approach to detect potential OTSs based on extensive features derived from social media data which allow their automated detection and which consider quantitative and qualitative criteria
The dissertation addresses these issues and aims to contribute to research and practice by analyzing fashion trend diffusion and trend-relevant actors in the online space in order to develop a methodology to identify OTSs in online social networks. When companies are provided with information about who will create and shape trends in the online world, they can improve trend prediction and the selection of appropriate partners for marketing cooperations and co-creation (Lee et al. 2017). In addition, this dissertation aims to narrow the research gap regarding trend-relevant user groups in a fashion context, especially regarding their traits and behavioral patterns based on social media data as well as their embeddedness in the fashion-based network. It contributes to the new research field of fashion informatics by providing an approach for trendsetter identification that uses data-intensive computational methods such as social network analysis, text mining, and machine learning algorithms.
1.2 Objectives and research questions
The overall goal of this thesis is to develop an approach for trendsetter identification in online social networks related to a specific fashion area by answering the following questions:
(1) Who are the trend-relevant users in OSNs,
(2) what characterizes them, and
(3) how can they be identified based on data from OSNs to support trend detection and trend influencing?
Therefore, the main research question (RQ) of this thesis is the following:
How can fashion trendsetters in OSNs be identified automatically based on social media data?
Based on this question, several sub-questions arise which are related to the construction of a conceptual framework (RQ 1.1 - 1.3), the development of the previously conceptualized solution (RQ 2.1 - 2.3) and its evaluation (RQ 3.1 - 3.2). Table 1-1 gives an overview of these sub-questions related to the three areas.
1 - Concept
RQ 1.1: Which trend-relevant roles and actors exist in relation to the fashion industry, and how can they be characterized?
RQ 1.2: Which features based on social media data from OSNs can be used to measure these characteristics and which data is required for the calculation of these features?
RQ 1.3: Which methods enable the identification of specific user roles in online social networks according to existing literature?
2 - Analysis
RQ 2.1: How can a community related to a specific interest field be identified in OSNs and based on which data?
RQ 2.2: Which classifiers are most suitable for fashion trendsetter identification with regard to state-of-the-art performance measures using social media data?
RQ 2.3: Which features are relevant for the identification of fashion trendsetters according to the analysis?
3 - Evaluation
RQ 3.1: How does the developed methodology perform on a specific use case compared to existing methods?
RQ 3.2: How can it be used in practice?
Table 1-1 Research questions
RQ 1 addresses the theoretical foundation of the research project and aims to build a conceptual framework for the identification of fashion trendsetters in online social networks. For this purpose, existing knowledge about trend diffusion, trend-relevant roles, and actors as well as about trend- and fashion-relevant user groups of social media platforms is gathered and analyzed with regard to important elements and insights for the construction of the framework. Based on the insights gained within RQ 1, RQ 2 deals with the realization of the concept and the development of the methodology. For this purpose, a process for data collection is derived which results in a topic-focused community. Subsequently, a classification model is developed which aims to classify the members of the identified community into OTS and non-OTS accounts. Additionally, the features relevant to the respective class decision are investigated. RQ 3 then addresses the evaluation of the developed solution and shows its application and usage in practice.
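The two steps addressed by RQ 2 (first narrowing a large user base down to a topic-focused community, then classifying its members into OTS and non-OTS) can be sketched in a few lines of Python. This is a deliberately simplified illustration under stated assumptions, not the methodology developed in this thesis: the keyword-based `text_score`, the threshold, the two numeric features, and the nearest-centroid classifier are hypothetical stand-ins for the community detection approach and the supervised machine learning model described in later chapters.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class User:
    """Hypothetical user record; field names are illustrative."""
    name: str
    posts: list[str]       # posting texts
    features: list[float]  # numeric features derived from profile data


def text_score(user: User, keywords: set[str]) -> float:
    """Share of a user's posts mentioning at least one topic keyword."""
    if not user.posts:
        return 0.0
    hits = sum(any(k in p.lower() for k in keywords) for p in user.posts)
    return hits / len(user.posts)


def detect_community(users, keywords, threshold=0.5):
    """Step 1 (stand-in): keep users whose posts are mostly on topic."""
    return [u for u in users if text_score(u, keywords) >= threshold]


def fit_nearest_centroid(labeled):
    """Step 2 (stand-in): learn one mean feature vector per class label."""
    by_label: dict[str, list[list[float]]] = {}
    for feats, label in labeled:
        by_label.setdefault(label, []).append(feats)
    return {lab: [mean(col) for col in zip(*rows)]
            for lab, rows in by_label.items()}


def predict(model, feats):
    """Assign the label whose centroid is closest (squared Euclidean)."""
    def dist2(centroid):
        return sum((a - b) ** 2 for a, b in zip(feats, centroid))
    return min(model, key=lambda lab: dist2(model[lab]))


# Made-up example data.
users = [
    User("a", ["New sneaker drop today", "my sneaker rotation"], [0.9, 0.8]),
    User("b", ["Pasta recipe", "holiday photos"], [0.1, 0.2]),
    User("c", ["sneaker release rumours", "sneaker cleaning tips"], [0.2, 0.1]),
]
community = detect_community(users, {"sneaker"})
model = fit_nearest_centroid([([1.0, 1.0], "OTS"), ([0.0, 0.0], "non-OTS")])
labels = {u.name: predict(model, u.features) for u in community}
print(labels)
```

The sketch separates the two concerns on purpose: the first step only decides who belongs to the topic-focused sample, while the second step assigns a class label from features, so either part could be replaced by a more capable method without touching the other.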
1.3 Research design
The research project follows the research paradigm of Design Science Research by Hevner et al. (2004), which is often applied in the field of Information Systems (IS) (Peffers et al. 2012). They developed a conceptual framework that combines behavioral science and design science to understand, execute and evaluate IS research. They state that the problem space is defined by the environment of the area of interest, whereas the solution for this problem and the creation of an artifact refer to a specific knowledge base. After a new artifact has been created, it is tested and evaluated under realistic conditions within this environment, and the insights gained are added to the initial knowledge base. In this research project, the newly emerging OTSs and their increasing power regarding trend creation and diffusion create the business need to identify them online and to gain insights into their behaviors and characteristics based on social media data. The basic knowledge to solve this problem consists of insights from trend research, research on social media platforms and their user groups, as well as techniques from social media analytics and machine learning. This thesis adds a methodology to detect OTSs in OSNs based on social media data and reveals knowledge about their online behavioral patterns and characteristics.
Hevner et al. (2004) define seven guidelines to achieve effective design-science research. The following section gives an overview of the dissertation project and aims to emphasize how these guidelines are implemented in the thesis:
1. Creation of a viable artifact
The core artifact of this thesis is a methodology to identify OTSs in a fashion context based on social media data, which consists of two components:
(1) An approach to detect a topic-focused community by means of social media analytics based on social media data
(2) An explainable classification model which classifies the members of a community into OTS and non-OTS
The processes and algorithms that allow extracting a community from an OSN and classifying its members into OTS and non-OTS, as well as the knowledge about the relevant features, are the artifacts of this research. For example, the classification model as an artifact consists of the classification algorithm, the selection of features that are integrated into the model, and the hyperparameter settings. In the following, this two-step methodology is referred to as the Trendsetter Identification Methodology (TSIM).
2. Development of a solution for a business problem
The importance of the problem is emphasized by the increasing power of users in OSNs to create and diffuse trends, combined with the high number of active users and the missing capabilities of the fashion industry to profit from this huge new data source, which potentially includes signals about new trends (Lee et al., 2017). As a result, fashion companies struggle to keep pace and lose their competitiveness. The relevance of the problem is underlined by the increasing number of studies dealing with trend detection on social media platforms and the related research stream on identifying and analyzing influential users of information cascades. The newly emerging research field of fashion informatics also shows the relevance of research projects which apply computational methods to solve problems related to the fashion industry (Copeland et al., 2019).
3. Evaluation of the solution by well-executed methods: The evaluation activities take place in several phases of the research project and apply evaluation methods such as technical experiments, logical argument, the illustrative scenario method, and expert evaluation, which Peffers et al. (2012) consider appropriate for evaluating the respective artifact.
4. Contribution of research: The dissertation contributes to both theory and practice. It expands the knowledge about fashion OTSs by providing insights into their online behavioral patterns and their detection based on social media data. Based on an empirical analysis, it reveals which features derived from social media data increase the accuracy of the classification model and can therefore be considered relevant criteria for the identification of OTSs. It also contributes knowledge about the potentially important features for such a classification problem by translating characteristics of trendsetters from traditional trend theories into measurable metrics. From a practitioner's perspective, this thesis helps to identify potential OTSs, as it provides a methodology for their identification in OSNs, and subsequently supports the prediction of new trends.
5. Application of rigorous methods in construction and evaluation of the artifact: Research on trends and trend-relevant groups often refers to Rogers’ innovation diffusion model and to social theories such as social network theory and social capital theory. The research design of this thesis is based on previous research related to those theories. The conceptual framework builds on Rogers’ diffusion model, which provides relevant elements for the identification of OTSs. The theory delivers the fundamental definitions of trend-relevant groups and related concepts. For the identification of potentially relevant features and the creation of the feature framework, studies based on social capital theory are considered, as well as insights from studies referring to social network theory. The latter also forms the basis for the community detection approach. Thus, this dissertation rests on a clearly defined and tested foundation of literature and knowledge from the relevant research areas. Additionally, Figure 1-2 shows that rigorous methods of Design Science Research are applied in all phases of research, for the design as well as for the development and evaluation of artifacts.
6. Utilization of available means for searching effective artifacts: The design science process aims to identify OTSs without relating to a specific trend, so design methods have to be identified which satisfy this purpose. To this end, characteristics of trendsetters are identified by conducting extensive literature reviews in the fields of trend diffusion, trend actors, and influential spreaders on social media platforms. Additionally, expert interviews are carried out to enrich the insights with the practitioner's perspective. Furthermore, research in the fields of social media analytics and machine learning is reviewed to find appropriate methods for the design of the solution. During the design phase, several evaluation and improvement loops are conducted and different variants are compared. For the development of the classification model, for instance, four different machine learning algorithms and three ensemble learning methods are compared according to their classification accuracy, and the best model is selected for the TSIM.
7. Communication of research: The results and insights are communicated through their documentation in this thesis. The targeted audience is the research community that is familiar with trend detection and influential user groups, as well as the community around the newly emerging research area of fashion informatics. The thesis also includes useful information for practitioners in the fields of product development, marketing, and strategy. The TSIM assists companies with the identification of potential OTSs, who can then be monitored to detect innovative ideas and emerging trends early on. This knowledge can support companies in their strategic decisions regarding product development and marketing. The method can also be used by marketing departments to find potential partners to influence future trends, or to provide designers with a data source for new product inspirations.
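The model comparison mentioned under guideline 6 (several candidate algorithms and ensemble methods are compared by classification accuracy, and the best one is selected) can be illustrated with a minimal Python sketch. It is not the implementation used in this thesis: the three base learners, the majority-vote ensemble, the 5-fold cross-validation, and the synthetic two-feature data set are simplified stand-ins chosen only to keep the example self-contained, and all function names are hypothetical.

```python
import random
from collections import Counter
from typing import Callable, Dict, List, Sequence, Tuple

Features = Sequence[float]
Predictor = Callable[[Features], int]
Trainer = Callable[[List[Features], List[int]], Predictor]


def train_majority(X: List[Features], y: List[int]) -> Predictor:
    """Baseline: always predict the most frequent training label."""
    label = Counter(y).most_common(1)[0][0]
    return lambda x: label


def train_stump(X: List[Features], y: List[int]) -> Predictor:
    """Decision stump: the single feature/threshold split with most training hits."""
    best = (-1, 0, 0.0, 1)  # (hits, feature index, threshold, label above threshold)
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for above in (0, 1):
                hits = sum(
                    (above if row[j] > t else 1 - above) == label
                    for row, label in zip(X, y)
                )
                if hits > best[0]:
                    best = (hits, j, t, above)
    _, j, t, above = best
    return lambda x: above if x[j] > t else 1 - above


def train_1nn(X: List[Features], y: List[int]) -> Predictor:
    """1-nearest-neighbour on squared Euclidean distance."""
    def predict(x: Features) -> int:
        dists = [sum((a - b) ** 2 for a, b in zip(x, row)) for row in X]
        return y[dists.index(min(dists))]
    return predict


def train_vote(X: List[Features], y: List[int]) -> Predictor:
    """Ensemble: majority vote of the three base learners (binary labels, no ties)."""
    members = [train(X, y) for train in (train_majority, train_stump, train_1nn)]
    return lambda x: Counter(m(x) for m in members).most_common(1)[0][0]


def cv_accuracy(train: Trainer, X: List[Features], y: List[int],
                k: int = 5, seed: int = 0) -> float:
    """Accuracy over k held-out folds; every sample is tested exactly once."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    correct = 0
    for fold in (idx[i::k] for i in range(k)):
        held_out = set(fold)
        X_tr = [X[i] for i in idx if i not in held_out]
        y_tr = [y[i] for i in idx if i not in held_out]
        model = train(X_tr, y_tr)
        correct += sum(model(X[i]) == y[i] for i in fold)
    return correct / len(X)


def select_best(candidates: Dict[str, Trainer], X: List[Features],
                y: List[int]) -> Tuple[str, Dict[str, float]]:
    """Score every candidate on the same folds and return the most accurate one."""
    scores = {name: cv_accuracy(train, X, y) for name, train in candidates.items()}
    return max(scores, key=scores.get), scores


def make_toy_data(n: int = 60, seed: int = 1) -> Tuple[List[Features], List[int]]:
    """Synthetic stand-in data: feature 0 is informative, feature 1 is noise."""
    rng = random.Random(seed)
    X: List[Features] = []
    y: List[int] = []
    for _ in range(n):
        label = rng.randint(0, 1)  # 1 = OTS, 0 = non-OTS (hypothetical labels)
        X.append((rng.gauss(label, 0.4), rng.random()))
        y.append(label)
    return X, y


CANDIDATES: Dict[str, Trainer] = {
    "majority": train_majority,
    "stump": train_stump,
    "1nn": train_1nn,
    "vote": train_vote,
}

if __name__ == "__main__":
    X, y = make_toy_data()
    best, scores = select_best(CANDIDATES, X, y)
    for name, acc in sorted(scores.items(), key=lambda kv: -kv[1]):
        print(f"{name:10s} accuracy = {acc:.3f}")
    print("selected:", best)
```

Scoring all candidates on identical folds keeps the comparison fair. In practice, the toy learners would be replaced by the actual algorithms and the feature framework of the respective study.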
To realize the research project described above, the dissertation follows the process model for design science in IS research provided by Peffers et al. (2008), which is illustrated in Figure 1-1.
[Figure 1-1, process steps: Problem identification and motivation -> Definition of objectives of a solution -> Design & Development -> Demonstration -> Evaluation -> Communication]