Using Treemaps As a Predictive Indicator of Project Cost Overruns

Trefor P. Williams
[email protected]
Department of Civil and Environmental Engineering, Rutgers University, 623 Bowser Road, Piscataway, New Jersey 08854, USA

ABSTRACT

Treemaps are a method of data visualization that allows complex data sets to be studied without resorting to complex statistical procedures. Treemaps were applied to bidding data to study the relationship between bidding ratios and cost increases on highway projects constructed in the states of Texas and California. The bidding ratios were used to identify the nature of submitted bids by measuring the spread of the bids and the existence of outlier bids. The treemaps indicated that projects with high ratio values typically experienced a larger weighted average percentage difference between the low bid and the completed project cost than projects with low ratio values, when the weighting factor was the magnitude of the project low bid. The treemap analysis also indicated that an increasing number of bidders affects the tendency for project costs to increase.

Keywords: bidding, data visualization, highway projects

1. INTRODUCTION

Data visualization is an emerging technology that can allow users to more easily discern relationships in complex data sets. One data visualization technique is the treemap. Treemaps can be extremely useful in the construction industry because developing and understanding them does not require extensive statistical knowledge. Busy construction contractors may not have the time or knowledge to construct regression models that define relationships between cost data. With treemaps they can more easily visualize relationships in the data. Treemap software is now available with an easily understandable user interface that allows users to visually analyse data and more readily perceive the relationships in it.

Treemaps are a method for compactly displaying information about entities with a hierarchical relationship in two dimensions (such as on a computer monitor). Treemaps display rows of data as groups of squares that can be arranged, sized and coloured to graphically reveal underlying data patterns (Wikipedia, 2006). Treemaps work by dividing the display area into rectangles whose size corresponds to an attribute of the data set. Treemaps combine characteristics of Venn diagrams and pie charts (Bederson et al., 2002). Shneiderman (1992) originally developed the concept of the treemap as a method of representing the multiple levels of directories and files contained on a computer hard drive. It was found that showing the location and size of files on a disk allowed users to find and manipulate files on a hard drive more easily than the textual listings of files given by the MS-DOS operating system at that time.

Treemaps can be used to find relationships in construction cost and bidding data that would not be immediately obvious. This paper will demonstrate how treemaps can be used to visualize the relationship between characteristics of the bids for highway construction projects and the completed project cost. The bidding characteristics are defined by a series of ratios that describe the spread and variation of the submitted bids.

Applications of Treemaps in Construction

Several applications of treemaps to construction problems have been reported in the literature. Songer et al. (2004) applied treemaps as a way of visualizing cost overruns on an $18.8 million construction project. Treemaps were used to visually represent project cost items that were over and under budget. The treemap visualizations were tested against more traditional methods of providing cost data, including a printout from a spreadsheet. Users were found to produce more accurate answers when viewing the treemap than when viewing the cost spreadsheet.

Cable et al. (2004) discussed how treemaps can be used to analyze performance for a portfolio of projects. Treemaps were constructed for 41 projects grouped by project life cycle phase. Each rectangle in the treemap represented a project, the size of the rectangle represented the project's size, and the colour of the rectangle indicated the value of a performance metric. The three performance metrics were a cost index, a schedule index, and a critical index that represented a combination of schedule and cost performance. When index values indicated a problem the project was displayed in shades of red, while projects exceeding performance expectations appeared in shades of green. They concluded that linking earned value management with treemaps to visualize the performance of an entire portfolio has the potential to improve project portfolio management.

Demian and Fruchter (2004) used treemaps to provide an interface to a knowledge management system. The system functions as a corporate memory repository that provides users with links to knowledge about a company's previous designs. Information about projects, disciplines, and building components is shown as nested rectangles in a treemap. The size of each rectangle denotes the amount of content contained in that project, discipline, or component (number of versions, annotations, linked documents, etc.). The colour of each rectangle denotes that item's relevance to the current design task based on text analysis.

Asahi, Turo and Shneiderman (1994) developed treemaps as a way of manipulating and visualizing the output of an Analytic Hierarchy Process analysis to determine if a particular site is suitable for building a dam. The treemaps were able to visually represent the hierarchy structure and enabled users to change various design parameters to visually assess their impact on the building decision.

2. BIDDING RATIOS

To quantify the nature of the bids submitted for a construction project, several ratios can be calculated.
Williams (2005) has described five ratios that describe the nature of the submitted bids. These ratios were developed as a way of representing the relationships between bids for a project that is dimensionless and does not depend on the project magnitude. The rationale for the use of the bid ratios is that ratios describing the "signature" of the bids for a project can give clues about the project's likelihood of experiencing cost increases during construction. Potentially, bids that are closely bunched together or contain extreme outliers may give clues about the completed project cost. The ratios are the second lowest bid ratio, the mean bid ratio, the median bid ratio, the maximum bid ratio and the coefficient of variation of the submitted bids. The formulas used for the calculation of the ratios are given below.

A ratio was calculated to compare the second lowest bid with the low bid amount. This ratio determines if the low bidder and next lowest bidder basically agree about the project cost. The ratio is given as:

Second lowest bid ratio = ((Second Lowest Bid) - (Low Bid)) / (Low Bid)

Another ratio measures the difference between the low bid and the mean bid. It is given as:

Mean bid ratio = ((Mean Bid) - (Low Bid)) / (Low Bid)

The mean bid ratio may indicate the degree of clustering of the bids. If the mean bid ratio is large, it probably indicates a mistaken bid or a project where there is little agreement about costs. A median bid ratio was also calculated. It is given by:

Median bid ratio = ((Median Bid) - (Low Bid)) / (Low Bid)

A ratio is also calculated relating the maximum bid to the low bid. The formula is:

Maximum bid ratio = ((Maximum Bid) - (Low Bid)) / (Low Bid)

This ratio indicates the spread of the submitted bids, and indicates if there is significant variation in the range of values of the submitted bids. It is also an indication of the existence of an extremely high bid.

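The four low-bid-relative ratios defined above can be sketched in a few lines of Python. This is only an illustration of the formulas, not the procedure used in the study; the function name `bid_ratios`, the dictionary keys and the example bid amounts are all hypothetical.

```python
from statistics import mean, median


def bid_ratios(bids):
    """Return the four low-bid-relative ratios for one project's bids.

    `bids` is a list of bid amounts for a single project. The function
    name and the dictionary keys are illustrative, not from the paper.
    """
    if len(bids) < 2:
        raise ValueError("at least two bids are needed to compute the ratios")
    ordered = sorted(bids)
    low = ordered[0]
    if low <= 0:
        raise ValueError("the low bid must be positive")
    return {
        "second_lowest": (ordered[1] - low) / low,
        "mean": (mean(ordered) - low) / low,
        "median": (median(ordered) - low) / low,
        "maximum": (ordered[-1] - low) / low,
    }


# Made-up bids for one project, for illustration only.
example = bid_ratios([1_000_000, 1_100_000, 1_200_000, 1_500_000])
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

Because every ratio is divided by the low bid, the results are dimensionless and can be compared across projects of very different sizes, which is the property the text relies on.
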
As a way of measuring the agreement between bidders, the coefficient of variation can be calculated. The coefficient of variation is given by:

Coefficient of Variation = s / x

where s is the standard deviation of the bids submitted for a project, and x is the mean of the submitted bids. Essentially, the coefficient of variation is a measure of the spread of the submitted bids.

Further research by Williams et al. (2005) has indicated that there is a statistically significant link between the level of the ratios and the completed project cost. Their study was conducted using highway bidding data from Texas. A statistically significant difference was found in the value of the bidding ratios between projects completed at a cost near the low bid amount and projects where the completed project cost differed significantly from the low bid amount. Higher values of the ratios are observed for projects completed with significant deviations from the mean. Projects completed near the original low bid amount tend to have lower values of the ratios. It was also noted that the elevated ratio values seem to occur both for projects that have large cost increases and for projects that are completed for significantly less than the original bid amount. However, it proved difficult to develop neural network or multiple linear regression models that could accurately predict the completed project cost. Because of the noise in the bidding data, such models could not exploit knowledge of the bidding ratios to predict the magnitude of a project's likely cost increase during construction.

3. DEVELOPMENT OF THE TREEMAPS

Williams et al. (2005) were unable to produce regression or neural network models that made accurate predictions of project cost overruns. In an effort to exploit the relationship they found between the bidding ratios and a tendency towards higher levels of project cost overruns, treemaps were studied to determine if they could provide a useful way of analysing the ratio values and providing an indicator of project cost overruns. Treemaps were constructed that separated the Texas projects into rectangles based on the level of the calculated bid ratios.

Figure 1. Treemap showing Texas highway project data
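
As a rough illustration of the quantities that feed such a treemap, the Python sketch below computes the coefficient of variation for one project and the low-bid-weighted average percentage difference between low bid and completed cost for groups of projects split by ratio level. Several things here are assumptions rather than details taken from the paper: the sample (n - 1) standard deviation, the two-level "low"/"high" split, the threshold value, the field names and the example figures.

```python
from collections import defaultdict
from statistics import mean, stdev


def coefficient_of_variation(bids):
    """s / x for one project's bids (sample standard deviation assumed)."""
    return stdev(bids) / mean(bids)


def weighted_pct_difference(rows):
    """Low-bid-weighted average of 100 * (cost - low) / low.

    `rows` is a list of (low_bid, completed_cost) pairs. Weighting each
    percentage by its low bid reduces to 100 * sum(cost - low) / sum(low).
    """
    total_low = sum(low for low, _ in rows)
    return 100.0 * sum(cost - low for low, cost in rows) / total_low


def summarize_by_level(projects, ratio_key, threshold):
    """Group projects into 'low'/'high' ratio levels and summarize each.

    The single threshold is a hypothetical binning rule for illustration.
    """
    groups = defaultdict(list)
    for project in projects:
        level = "high" if project[ratio_key] >= threshold else "low"
        groups[level].append((project["low_bid"], project["completed_cost"]))
    return {level: weighted_pct_difference(rows) for level, rows in groups.items()}


# Made-up projects, for illustration only.
projects = [
    {"low_bid": 1_000_000, "completed_cost": 1_050_000, "max_ratio": 0.2},
    {"low_bid": 3_000_000, "completed_cost": 3_090_000, "max_ratio": 0.3},
    {"low_bid": 2_000_000, "completed_cost": 2_300_000, "max_ratio": 0.9},
]
summary = summarize_by_level(projects, "max_ratio", threshold=0.5)
print(summary)
```

In a treemap built from such a summary, each group would become a rectangle whose area could reflect the total low bid of its projects and whose colour could reflect the weighted percentage difference.
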