Using Treemaps As a Predictive Indicator of Project Cost Overruns
Trefor P. Williams
Department of Civil and Environmental Engineering, Rutgers University, 623 Bowser Road, Piscataway, New Jersey 08854, USA

ABSTRACT
Treemaps are a method of data visualization that allows complex data sets to be studied without resorting to complex statistical procedures. Treemaps were applied to bidding data to study the relationship between bidding ratios and cost increases on highway projects constructed in the states of Texas and California. The bidding ratios were used to identify the nature of submitted bids by measuring the spread of the bids and the existence of outlier bids. The treemaps indicated that projects with high ratio values typically experienced a larger weighted average percentage difference between the low bid and completed project cost than projects with low ratio values when the weighting factor was the magnitude of the project low bid. The treemap analysis also indicated that an increasing number of bidders affects the tendency for project costs to increase.

Keywords: bidding, data visualization, highway projects

1. INTRODUCTION
Data visualization is an emerging technology that can allow users to more easily discern relationships in complex data sets. One data visualization technique is the treemap. Treemaps can be extremely useful in the construction industry because developing and understanding them does not require extensive statistical knowledge. Busy construction contractors may not have the time or knowledge to construct regression models to define relationships between cost data. With treemaps they can more easily visualize relationships between the data. Treemap software is now available with an easily understandable user interface that allows users to visually analyse data and to more readily perceive relationships in it.
Treemaps are a method for displaying information about entities with a hierarchical relationship compactly in two dimensions (such as on a computer monitor). Treemaps display rows of data as groups of squares that can be arranged, sized and coloured to graphically reveal underlying data patterns (Wikipedia, 2006). Treemaps work by dividing the display area into rectangles whose size corresponds to an attribute of the data set. Treemaps combine characteristics of Venn diagrams and pie charts (Bederson et al., 2002). Shneiderman (1992) originally developed the concept of the treemap as a method of representing the multiple levels of directories and files contained on a computer hard drive. It was found that information about the location of files on a disk and their size could allow users to find and manipulate files on a hard drive more easily than the textual listings of files given by the MS-DOS operating system at that time.

Treemaps can be used to find relationships in construction cost and bidding data that would not be immediately obvious. This paper will demonstrate how treemaps can be used to visualize the relationship between characteristics of the bids for highway construction projects and the completed project cost. The bidding characteristics are defined by a series of ratios that describe the spread and variation of the submitted bids.

Applications of Treemaps in Construction
Several applications of treemaps to construction problems have been reported in the literature. Songer et al. (2004) have applied treemaps as a way of visualizing cost overruns on an $18.8 million construction project. Treemaps were used as a way of visually representing project cost items that were over and under budget. The treemap visualizations were tested against more traditional methods of providing cost data, including a printout from a spreadsheet. Users were found to produce more accurate answers when viewing the treemap than when viewing the cost spreadsheet. Cable et al.
(2004) have discussed how treemaps can be used to analyze performance for a portfolio of projects. Treemaps were constructed for 41 projects grouped by project life cycle phase. Each rectangle in the treemap represented a project, the size of the rectangle represented the project's size, and the colour of the rectangle indicated the value of a performance metric. Three performance metrics were used: a cost index, a schedule index, and a critical index that represented a combination of schedule and cost performance. When index values indicated a problem the project was displayed in shades of red, while projects exceeding performance expectations appeared in shades of green. They concluded that linking earned value management with treemaps to visualize the performance of an entire portfolio has the potential to improve project portfolio management.

Demian and Fruchter (2004) have used treemaps to provide an interface to a knowledge management system. The system functions as a corporate memory repository that provides users with links to knowledge about a company's previous designs. Information about projects, disciplines, and building components is shown as nested rectangles in a treemap. The size of each rectangle denotes the amount of content contained in that project, discipline, or component (number of versions, annotations, linked documents, etc.). The colour of each rectangle denotes that item’s relevance to the current design task based on text analysis.

Asahi, Turo and Shneiderman (1994) have developed treemaps as a way of manipulating and visualizing the output of an Analytic Hierarchy Process analysis to determine if a particular site is suitable for building a dam. The treemaps were able to visually represent the hierarchy structure and enabled users to change various design parameters to visually assess their impact on the building decision.

2. BIDDING RATIOS
To quantify the nature of the bids submitted for a construction project, several ratios can be calculated.
Williams (2005) has described five ratios that describe the nature of the submitted bids. These ratios were developed as a way of representing the relationships between bids for a project that are dimensionless and are not dependent on the project magnitude. The rationale for the use of the bid ratios is that ratios describing the “signature” of the bids for a project can give clues about the project's likelihood of experiencing cost increases during construction. Potentially, bids that are closely bunched together or contain extreme outliers may give clues about the completed project cost. The ratios include the second lowest bid ratio, the mean bid ratio, the median bid ratio, the maximum bid ratio and the coefficient of variation of the submitted bids. The formulas used for the calculation of the ratios are given below.

A ratio was calculated to compare the second lowest bid with the low bid amount. This ratio determines if the low bidder and next lowest bidder basically agree about the project cost. The ratio is given as:

Second lowest bid ratio = ((Second Lowest Bid) - (Low Bid)) / (Low Bid)

Another ratio measures the difference between the low bid and the mean bid. It is given as:

Mean bid ratio = ((Mean Bid) - (Low Bid)) / (Low Bid)

The mean bid ratio may indicate the degree of clustering of the bids. If the mean bid ratio is large, it probably indicates a mistaken bid or a project where there is little agreement about costs. A median bid ratio was also calculated. It is given by:

Median bid ratio = ((Median Bid) - (Low Bid)) / (Low Bid)

A ratio is also calculated relating the maximum bid to the low bid. The formula is:

Maximum bid ratio = ((Maximum Bid) - (Low Bid)) / (Low Bid)

This ratio indicates the spread of the submitted bids, and indicates if there is significant variation in the range of values of the submitted bids. It is also an indication of the existence of an extremely high bid.
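The four low-bid-relative ratios above can be computed directly from the list of bids for a project. The following is a minimal sketch under stated assumptions, not the procedure used in the paper: the function name `bid_ratios`, the dictionary keys and the example bid values are all hypothetical and chosen only for illustration.

```python
from statistics import mean, median


def bid_ratios(bids):
    """Return the four low-bid-relative ratios for one project's bids.

    `bids` is a list of bid amounts for a single project (hypothetical
    input format). Each ratio is (reference bid - low bid) / low bid.
    """
    if len(bids) < 2:
        raise ValueError("at least two bids are needed to compute the ratios")
    ordered = sorted(bids)
    low = ordered[0]
    if low <= 0:
        raise ValueError("the low bid must be positive")

    def relative_to_low(value):
        # Dimensionless difference between a reference bid and the low bid.
        return (value - low) / low

    return {
        "second_lowest_bid_ratio": relative_to_low(ordered[1]),
        "mean_bid_ratio": relative_to_low(mean(ordered)),
        "median_bid_ratio": relative_to_low(median(ordered)),
        "maximum_bid_ratio": relative_to_low(ordered[-1]),
    }


# Hypothetical bids for one project, not data from the study.
example_bids = [1_000_000, 1_100_000, 1_200_000, 1_500_000]
ratios = bid_ratios(example_bids)
```

For these illustrative bids the second lowest, mean, median and maximum bid ratios come out at approximately 0.10, 0.20, 0.15 and 0.50. Because every ratio is divided by the low bid, the values do not depend on the project magnitude.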
As a way of measuring the agreement between bidders, the coefficient of variation can be calculated. The coefficient of variation is given by:

Coefficient of Variation = s / x

where s equals the standard deviation of the bids submitted for a project, and x is the mean of the submitted bids. Essentially, the coefficient of variation is a measure of the spread of the submitted bids.

Further research by Williams et al. (2005) has indicated that there is a statistically significant link between the level of the ratios and the completed project cost. Their study was conducted using highway bidding data from Texas. There was found to be a statistically significant difference in the value of the bidding ratios between projects completed at a cost near the low bid amount and projects where the completed project cost differed significantly from the low bid amount. Higher values of the ratios are observed for projects completed with significant deviations from the mean. Projects completed near the original low bid amount tend to have lower values of the ratios. It was also noted that the elevated ratio values seem to occur both for projects that have large cost increases and for projects that are completed for significantly less than the original bid amount. It was also found to be difficult to develop accurate neural network and multiple linear regression models that could predict actual values of completed project cost. Because of the noise in the bidding data, it was difficult to construct regression or neural network models that could exploit knowledge of the bidding ratios to predict the magnitude of a project’s likely cost increase during construction.

3. DEVELOPMENT OF THE TREEMAPS
Williams et al. (2005) were unable to produce regression or neural network models that made accurate predictions of project cost overruns. In an effort to exploit the relationship they found between the bidding ratios and a tendency towards higher levels of project cost overruns, treemaps were studied to determine if they could provide a useful way of analysing the ratio values and providing an indicator of project cost overruns. Treemaps were constructed that separated the Texas projects into rectangles based on the level of the calculated bid ratios.

Figure 1. Treemap showing Texas highway project data
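The grouping behind such a treemap can be sketched numerically. The example below is only an illustration under stated assumptions: it computes the coefficient of variation from Section 2 using the sample standard deviation (the text does not say whether the sample or population form was used), splits projects into "low" and "high" groups with a single hypothetical threshold, and computes for each group the average percentage difference between low bid and completed cost weighted by the low bid, as described in the abstract. The field names, the threshold and the project values are hypothetical and are not taken from the Texas data.

```python
from statistics import mean, stdev


def coefficient_of_variation(bids):
    """s / x for one project's bids (sample standard deviation assumed)."""
    return stdev(bids) / mean(bids)


def group_by_ratio_level(projects, ratio_key, threshold):
    """Split projects into 'low' and 'high' groups by one ratio value.

    A single threshold is an assumption made for this sketch; the paper's
    actual grouping levels are not reproduced here.
    """
    groups = {"low": [], "high": []}
    for project in projects:
        level = "high" if project[ratio_key] > threshold else "low"
        groups[level].append(project)
    return groups


def weighted_pct_difference(projects):
    """Low-bid-weighted average percentage cost difference for a group.

    Weighting each project's percentage difference by its low bid reduces
    to 100 * sum(completed - low bid) / sum(low bid).
    """
    total_low = sum(p["low_bid"] for p in projects)
    total_change = sum(p["completed_cost"] - p["low_bid"] for p in projects)
    return 100.0 * total_change / total_low


# Hypothetical projects, not data from the study.
projects = [
    {"low_bid": 1_000_000, "completed_cost": 1_050_000, "maximum_bid_ratio": 0.2},
    {"low_bid": 3_000_000, "completed_cost": 3_030_000, "maximum_bid_ratio": 0.3},
    {"low_bid": 2_000_000, "completed_cost": 2_400_000, "maximum_bid_ratio": 0.9},
]

groups = group_by_ratio_level(projects, "maximum_bid_ratio", threshold=0.5)
summary = {
    level: weighted_pct_difference(members)
    for level, members in groups.items()
    if members  # skip empty groups to avoid dividing by zero
}
cv = coefficient_of_variation([1_000_000, 1_100_000, 1_200_000, 1_500_000])
```

With these illustrative values the low-ratio group has a weighted average difference of 2.0 percent and the high-ratio group 20.0 percent. In a treemap, each group could then be drawn as a rectangle sized by its total low bid and coloured by this weighted difference, although that mapping is an assumption of the sketch.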