Finding high-quality grey literature for use as evidence in software engineering research

A thesis submitted in partial fulfilment of the requirements for the Degree of Doctor of Philosophy in the University of Canterbury

by Ashley Williams

University of Canterbury
2019

To my Grandad, Norman
For encouraging me to push myself; and ensuring that I never miss a family Christmas

Abstract

Background: Software engineering research often uses practitioners as a source of evidence in its studies. This evidence is usually gathered through empirical methods such as surveys, interviews and ethnographic research. The web has brought with it the emergence of the social programmer. Software practitioners are publishing their opinions online through blog articles, discussion boards and Q&A sites. Mining these online sources of information could provide a new source of evidence which complements traditional evidence sources.

There are benefits to the adoption of grey literature in software engineering research (such as bridging the gap between the state-of-art, where research typically operates, and the state-of-practice), but also significant challenges. The main challenge is finding grey literature which is of high quality to the researcher given the vast volume of grey literature available on the web. The thesis defines the quality of grey literature in terms of its relevance to the research being undertaken and its credibility. The thesis also focuses on a particular type of grey literature that has been written by software practitioners. A typical example of such grey literature is blog articles, which are specifically used as examples throughout the thesis.

Objectives: There are two main objectives to the thesis: to investigate the problems of finding high-quality grey literature, and to make progress in addressing those problems.
In working towards these objectives, we investigate our main research question: how can researchers more effectively and efficiently search for and then select the higher-quality blog-like content relevant to their research? We divide this question into twelve sub-questions, and more formally define what we mean by 'blog-like content'.

Method: To achieve the objectives, we first investigate how software engineering researchers define and assess quality when working with grey literature, and then work towards a methodology and a tool-suite which can semi-automate the identification and the quality assessment of relevant grey literature for use as evidence in the researcher's study.

To investigate how software engineering researchers define and assess quality, we first conduct a literature review of credibility assessment to gather a set of credibility criteria. We then validate those criteria through a survey of software engineering researchers. This gives us an overall model of credibility assessment within software engineering research.

We next investigate the empirical challenges of measuring quality and develop a methodology which has been adapted from the case survey methodology and aims to address the problems and challenges identified. Along with the methodology is a suggested tool-suite which is intended to help researchers in automating the application of a subset of the credibility model. The tool-suite developed supports the methodology by, for example, automating tasks in order to scale the analysis. The use of the methodology and tool-suite is then demonstrated through three examples. These examples include a partial evaluation of the methodology and tool-suite.

Results: Our literature review of credibility assessment identified a set of criteria that have been used in previous research. However, we also found a lack of definitions for both the criteria and, more generally, the term credibility.
Credibility assessment is a difficult and subjective task that is particular to each individual. Research has addressed this subjectivity by conducting studies that look at how particular user groups assess credibility (e.g. pensioners, university students, the visually impaired); however, none of the studies reviewed software engineering researchers.

Informed by the literature review, we conducted a survey which we believe is the first study on the credibility assessment of software engineering researchers. The results of the survey are a more refined set of criteria, but also a set that many (approximately 60%) of the survey participants believed generalises to other types of media (both practitioner-generated and researcher-generated).

We found that there are significant challenges in using blog-like content as evidence in research. For example, there are the challenges of identifying the high-quality content from the vast quantity available on the web, and then creating methods of analysis which are scalable to handle that vast quantity. In addressing these challenges, we produce: a set of heuristics which can help in finding higher-quality results when searching using traditional search engines, a validated list of reasoning markers that can aid in assessing the amount of reasoning within a document, a review of the current state of the experience mining domain, and a modifiable classification schema for classifying the source of URLs.

With credibility assessment being such a subjective task, there can be no one-size-fits-all method for automating quality assessment. Instead, our methodology is intended to be used as a framework in which the researcher using it can swap out and adapt the criteria that we assess for their own criteria, based on the context of the study being undertaken and the personal preference of the researcher.
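To make the idea of swappable criteria concrete, the following is a minimal sketch in Python and not the tool-suite described in the thesis. The marker list `EXAMPLE_MARKERS`, the functions `reasoning_marker_density`, `word_count` and `assess`, and the choice of "markers per 100 words" as a measure are all assumptions made for illustration; the validated list of reasoning markers is not reproduced here. Each criterion is simply a function from text to a score, so a researcher could replace any of them with their own.

```python
import re
from typing import Callable, Dict, Sequence

# Hypothetical marker list for illustration only; it is NOT the validated
# list of reasoning markers produced in the thesis.
EXAMPLE_MARKERS = ["because", "therefore", "however", "for example", "as a result"]


def _words(text: str) -> list:
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def reasoning_marker_density(text: str,
                             markers: Sequence[str] = EXAMPLE_MARKERS) -> float:
    """Assumed measure: marker occurrences per 100 words (0.0 for empty text)."""
    words = _words(text)
    if not words:
        return 0.0
    # Re-join the tokens so that multi-word markers can be matched too.
    normalised = " ".join(words)
    hits = sum(
        len(re.findall(r"\b" + re.escape(marker) + r"\b", normalised))
        for marker in markers
    )
    return 100.0 * hits / len(words)


def word_count(text: str) -> float:
    """A second, deliberately trivial criterion: document length in words."""
    return float(len(_words(text)))


# A criterion is any function from document text to a numeric score.
Criterion = Callable[[str], float]


def assess(text: str, criteria: Dict[str, Criterion]) -> Dict[str, float]:
    """Apply each selected criterion; scores are reported, not aggregated."""
    return {name: measure(text) for name, measure in criteria.items()}


if __name__ == "__main__":
    article = "We chose Go because it is fast. Therefore, builds are quick."
    print(assess(article, {"reasoning": reasoning_marker_density,
                           "length": word_count}))
```

The sketch deliberately returns one score per criterion rather than a single quality value, since how to aggregate the individual criteria is left open.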
We find from the survey that there is a variety of attitudes towards using grey literature in software engineering research, and not all respondents view the use of grey literature as evidence in the way that we do (i.e. as having the same benefits and threats as other traditional methods of evidence gathering).

Conclusion: The work presented in this thesis makes significant progress towards answering our research question, and the thesis provides a foundation for future research on automated quality assessment and credibility. Adoption of the tools and methodology presented in this thesis can help researchers to more effectively and efficiently search for and select higher-quality blog-like content, but there is a need for more substantial research on the credibility assessment of software engineering researchers, and for a more extensive credibility model to be produced. This can be achieved through replicating the literature review systematically, accepting more studies for analysis, and by conducting a more extensive survey with a greater number, and more representative selection, of survey respondents.

With a more robust credibility model, we can have more confidence in the criteria that we choose to include within the methodology and tools, as well as automating the assessment of more criteria. Throughout the research, there has been a challenge in aggregating the results after assessing each criterion. Future research should look towards the adoption of machine learning methods to aid with this aggregation. We believe that the criteria and measures used by our tools can serve as features to machine learning classifiers which will be able to more accurately assess quality. However, before such work can take place, there is a need for annotated data-sets to be developed.

Table of Contents

List of Tables
List of Figures

I Foundations

Chapter 1: Introduction
  1.1 Context
  1.2 Problem statement and motivation
  1.3 Definitions
  1.4 Aims of the thesis
  1.5 Research questions
  1.6 Contributions
  1.7 Structure of this thesis

Chapter 2: Background
  2.1 Introduction
  2.2 Definitions
  2.3 Evidence in software engineering research and software engineering practice
  2.4 Systematic reviews in software engineering
  2.5 The social programmer and grey literature
  2.6 Systematic reviews that incorporate grey literature
  2.7 'Blog-like' content as a type of grey literature
  2.8 The benefits of looking at blog-like content
  2.9 The challenges of looking at blog-like content
  2.10 Addressing the research questions

II Credibility assessment for finding higher-quality grey literature

Chapter 3: Creating a candidate list of conceptual credibility criteria for software engineering research
  3.1 Introduction
  3.2 Rationale for the structured review
  3.3 Methodology
  3.4 Results
  3.5 Discussion

Chapter 4: Refining and validating our candidate list of credibility criteria with a survey of researchers
  4.1 Introduction
  4.2 Methodology
  4.3 Quantitative results
  4.4 Qualitative results
  4.5 Discussion

Chapter 5: Developing a model of credibility assessment in software engineering research
  5.1 Introduction
  5.2 Developing the model
  5.3 Towards automatic measurement of credibility using the model
  5.4 Addressing the research questions

III A methodology for finding high-quality content

Chapter 6: Preliminary empirical investigations
  6.1 Introduction
  6.2 Pilot study 1: Identifying practitioners' arguments and evidence in blogs
  6.3 Pilot study 2: Toward the use of blog articles as a source of evidence
  6.4 Summary of lessons learned from both pilot studies
  6.5 Addressing the research questions

Chapter 7: Incorporating the credibility criteria into the case survey methodology
  7.1 Introduction
  7.2 The case survey methodology
  7.3 Structure and logic of the credibility criteria
  7.4 Our search heuristics
  7.5 Post-search measures for credibility assessment criteria
  7.6 Tooling
  7.7 Addressing the research questions

Chapter 8: Developing measures for the credibility criteria
  8.1 Introduction
  8.2 Generating a set of validated reasoning markers
  8.3 Review of methods for experience mining