The A.D.E. Taxonomy of Spreadsheet Application Development
Edith Cowan University
Research Online

Theses: Doctorates and Masters
Theses

1992

The A.D.E. taxonomy of spreadsheet application development

Maria Jean Hall
Edith Cowan University

Recommended Citation
Hall, M. J. (1992). The A.D.E. taxonomy of spreadsheet application development. https://ro.ecu.edu.au/theses/1696

This Thesis is posted at Research Online. https://ro.ecu.edu.au/theses/1696

USE OF THESIS
The Use of Thesis statement is not included in this version of the thesis.

THE A.D.E. TAXONOMY OF SPREADSHEET APPLICATION DEVELOPMENT
by
Maria Jean Johnstone Hall
B.Sc., Grad. Dip. Applied Science Computer Studies

A Dissertation submitted in Fulfilment of the Requirements for the Award of Master of Applied Science at the Faculty of Science and Technology, Edith Cowan University

Date of Submission: June 1992

ABSTRACT

Spreadsheets are a major application in end-user computing, one of the fastest growing areas of computing. Studies have shown that 30% of spreadsheet applications contain errors. As major decisions are often made with the assistance of spreadsheets, the control of spreadsheet applications is a matter of concern to end-user developers, managers, EDP auditors and computer professionals.

The application of appropriate controls to the spreadsheet development process requires prior categorisation of the spreadsheet application. The special-purpose A.D.E. (Application, Development, Environment) taxonomy of spreadsheet application development was evolved by mathematical taxonomic methods to categorise spreadsheet development projects to facilitate their management and control.

Data was collected on a sample of Australian developed spreadsheet applications. The sampled spreadsheets exhibited a very low level of managerial, I.T. department and auditor control. The data was analysed both by hierarchical cluster analysis using average linkage with the Euclidean distance measure, and by partitioned cluster analysis using the kmeans algorithm. The A.D.E. taxonomy of spreadsheet application development was developed in three sections from these analyses, categorising: A - the spreadsheet application, D - the developer and E - the development environment. A diagnostic key was developed for each of the three sections.

The A.D.E. taxonomy was validated by inter-rater comparison of the same spreadsheet and by two categorisations by the same rater three months apart. The validity of the clusters used to develop the taxonomy was established, and the taxonomy was also validated under a 'usefulness' criterion.
A follow-up study to develop a spreadsheet development 'control model' was foreshadowed.

DECLARATION

I certify that this thesis does not incorporate, without acknowledgment, any material previously submitted for a degree or diploma in any institution of higher education; and that to the best of my knowledge and belief, it does not contain any material previously published or written by another person except where due reference is made in the text.

ACKNOWLEDGMENTS

The writer wishes to express her sincere appreciation to those who have assisted, either as participants or as advisers, throughout the duration of this study.

Special thanks are extended to the study's supervisor, Associate Professor Ronald Hartley, for his professional advice and encouragement. With his assistance and continual gentle encouragement, this study was completed on schedule.

My thanks are extended to Associate Professor Anthony Watson and Dr Jim Millar for their guidance in the early days of this study. I also thank the University research consultancy, Associate Professor Phil Clift and Jane Armstrong, for their suggestions.

The interest shown by many colleagues, and the donation of their valuable time by survey participants, are also gratefully acknowledged.

The purchase of statistical software and survey costs were funded through an Edith Cowan University internal research grant, and thanks are extended to the University for this support.

Last but not least, I thank my husband William and son Andrew for their patience, consideration and support, and my son James for his invaluable contribution as a good listener, and sometime Devil's Advocate, when considering some of the ideas canvassed in this dissertation.

Table of Contents

Abstract
Declaration
Acknowledgments
Table of contents
List of figures
List of tables

Chapter 1: Introduction
  1.1. Chapter overview
  1.2. Spreadsheet applications
  1.3. Background to the research problem
    1.3.1. The growth in end-user computing
    1.3.2. Thirteen years of spreadsheet software
    1.3.3. The use of spreadsheets as an aid to decision making
    1.3.4. Errors in spreadsheet applications
    1.3.5. The computer professional's responsibility
    1.3.6. Do all spreadsheets require control?
  1.4. Study focus
    1.4.1. Primary research goals
    1.4.2. Secondary research goals
  1.5. Study significance
  1.6. Study scope and limitations
  1.7. Outline of subsequent chapters of this dissertation
  1.8. Summary of this chapter

Chapter 2: Review of the related literature
  2.1. Chapter overview
  2.2. Literature sources
  2.3. Classification as a human endeavour
  2.4. Clusters, models and reality
  2.5. Mathematical Taxonomy
  2.6. Problems and benefits of Cluster Analysis
  2.7. Software Engineering taxonomies
  2.8. Selection of spreadsheet attributes for use in cluster analysis
  2.9. Categorisations of relevance to the spreadsheet development process
    2.9.1. End-users
    2.9.2. Application areas
    2.9.3. Application function
    2.9.4. Application criticality
    2.9.5. Data
    2.9.6. Program implementation
    2.9.7. Complexity
    2.9.8. Software development environments
    2.9.9. Spreadsheet categorisations
  2.10. Summary of this chapter

Chapter 3: Study Methodology and Design
  3.1. Chapter outline
  3.2. Framing of the study
  3.3. Outline of the research methods
  3.4. Survey of spreadsheet application development
    3.4.1. Population
    3.4.2. Sample
    3.4.3. Bias in the sampling procedures
    3.4.4. Instrumentation
    3.4.5. Pretest / Pilot study
    3.4.6. Questionnaire validity and reliability
    3.4.7. Submission to participants
    3.4.8. Follow-up procedures
  3.5. Pre-analytical processing of data
    3.5.1. Initial data edit
    3.5.2. Data coding and verification
    3.5.3. SURVEY database
    3.5.4. CONTROLS database
    3.5.5. Variable transformations
    3.5.6. Super-variables
    3.5.7. Data structures for entry to statistical analysis
    3.5.8. Data screening
    3.5.9. Standardisation of data matrix
    3.5.10. Transposition of data matrix
  3.6. Cluster Analysis
    3.6.1. Overview of clustering procedures
    3.6.2. Agglomerative hierarchical tree clustering
    3.6.3. Kmeans clustering algorithm
  3.7. Exploratory data analysis
    3.7.1. Clustering runs
    3.7.2. Criteria for usefulness and acceptability of clustering runs
    3.7.3. Interpretation of the clustering results
  3.8. The A.D.E. taxonomy
    3.8.1. Development of the taxonomy
    3.8.2. Diagnostic key for the A.D.E. taxonomy
    3.8.3. Validation of the A.D.E. taxonomy
  3.9. Assumptions and limitations of this study
  3.10. Ethical considerations
  3.11. Summary of this chapter

Chapter 4: Results
  4.1. Chapter overview
  4.2. The sample
    4.2.1. Sample responses
    4.2.2. Data screening
    4.2.3. Missing value treatment
    4.2.4. Outlier identification and removal
    4.2.5. Sample descriptive statistics
  4.3. Clustering runs
    4.3.1. Experimental runs to select parameters for production runs
    4.3.2. Production runs for the Developer categories of the taxonomy
    4.3.3. Developer cluster profiles
    4.3.4. Production runs for the Application categories of the taxonomy
    4.3.5. Application cluster profiles