Available online at www.sciencedirect.com
International Journal of Project Management 30 (2012) 470–478
www.elsevier.com/locate/ijproman

Predicting construction cost and schedule success using artificial neural networks ensemble and support vector machines classification models

Yu-Ren Wang ⁎, Chung-Ying Yu, Hsun-Hsi Chan

Dept. of Civil Engineering, National Kaohsiung University of Applied Sciences, 415 Chien-Kung Road, Kaohsiung, 807, Taiwan

Received 11 May 2011; received in revised form 2 August 2011; accepted 15 September 2011

Abstract

It is commonly perceived that how well planning is performed during the early stage has a significant impact on the final project outcome. This paper outlines the development of artificial neural networks ensemble and support vector machines classification models to predict project cost and schedule success, using the status of early planning as the model inputs. Through an industry survey, early planning and project performance information was collected from a total of 92 building projects. The results show that early planning status can be effectively used to predict project success and that the proposed artificial intelligence models produce satisfactory prediction results.
© 2011 Elsevier Ltd. APM and IPMA. All rights reserved.

Keywords: Project success; Early planning; Classification model; ANNs ensemble; Support vector machines

⁎ Corresponding author.
E-mail addresses: [email protected] (Y.-R. Wang), [email protected] (C.-Y. Yu), [email protected] (H.-H. Chan).
0263-7863/$ - see front matter. doi:10.1016/j.ijproman.2011.09.002

1. Introduction

In the past few decades, researchers and industry practitioners have recognized the potential impact of early planning on final project outcomes and have started to put more emphasis on the early planning process (Dvir, 2005; Gibson et al., 2006; Hartman and Ashrafi, 2004). In particular, the Construction Industry Institute (CII, a consortium of more than 100 leading owner, engineering-contractor, and supplier firms from both the public and private arenas based at the University of Texas at Austin) has established several research projects focusing on early planning since the early 1990s. Their research results have indicated that early planning is a key process in the project life cycle and that how well early planning is performed will affect cost and schedule performance (CII, 1995, 1999, 2006). Results provided by other researchers also confirm that better early planning will improve efficiency and thus lead to profitability (Gigado, 2004; Menches and Hanna, 2006). In particular, research has indicated that project definition in the early planning process is an important factor leading to project success (Le et al., 2010; Thomas and Fernández, 2008; Younga and Samson, 2008). Based on these results, this research intends to further investigate this relationship and to examine whether the status of early planning can be used to predict final project outcomes.

To achieve this goal, a scope definition tool, the Project Definition Rating Index (PDRI), is adopted in this research to evaluate the completeness of project scope definition during the early planning stage. As an easy-to-use tool developed by the Construction Industry Institute (CII), the PDRI is a comprehensive, weighted checklist of crucial scope definition elements that have to be addressed in the early planning process (Gibson and Dumont, 1996). The PDRI for Building Projects consists of 64 elements, which are grouped into 11 categories and further grouped into three main sections. The 64 elements are arranged in a score sheet format and supported by 38 pages of detailed descriptions and checklists. For illustration purposes, Section I, Category A of the PDRI for Building Projects (both elements and their weights) is shown in Fig. 1 (CII, 1999). Designed in a score sheet format, the PDRI can be used to measure the status of early project planning, and a score is obtained after the evaluation. With a maximum score of 1000 points, the PDRI is designed so that a lower score indicates a better-defined project scope. Since its introduction, the PDRI has been widely used by the construction industry, especially within CII member companies, to assist with their early planning process (Gibson et al., 2006). Proven as a successful evaluation tool, the PDRI is incorporated in the survey questionnaire for this study to collect early planning related information from the building construction industry in Taiwan. Meanwhile, project outcomes are measured by comparing the original cost and schedule estimates of each project with its final cost and schedule.

Fig. 1. PDRI for building projects: category A.

As demonstrated in previous research, statistical analysis techniques, such as linear and logistic regression, and artificial intelligence (i.e. neural networks) have been successfully applied to project performance prediction (Berlin et al., 2009; Kim et al., 2009; Ko and Cheng, 2007; Ling et al., 2008). Among them, artificial intelligence (neural networks) models produce better prediction results than those obtained from regression models (Wang and Gibson, 2010). To further explore the prediction capability of artificial intelligence techniques, this research uses modified neural networks (neural networks ensemble) and support vector machines (SVMs) to develop project success classification models. Classification models are built and tested with data collected from a total of 92 building construction projects in Taiwan.

2. Research methodology

In order to investigate the relationship between the status of early planning (as measured by the PDRI evaluation) and project success, industry data is collected through a questionnaire survey. From late 2007 to early 2010, early planning and project success information was collected from a total of 92 building projects, representing a total construction cost of approximately 1.1 billion U.S. dollars. The sample covers a wide variety of building projects including schools, houses, apartment buildings, hospitals, offices, temples, recreational facilities, hotels, and department stores. The collected information is used to build and test neural networks ensemble (bootstrap aggregating and adaptive boosting) and SVMs prediction models. These artificial intelligence techniques are briefly introduced in the following sections.

2.1. Bootstrap aggregating neural networks

The Bootstrap Aggregating (also known as Bagging) neural networks model generates an aggregated ANNs predictor using multiple sets of artificial neural networks. Sets of training data are generated from bootstrap replicates of the learning set and are used to train ANNs models. If the aggregation prediction outcome is numerical, an average is taken of the prediction outcomes from the multiple ANNs models. On the other hand, if the prediction outcome is categorical, a plurality vote is conducted to generate the prediction for the aggregation. For this study, the model prediction result, project success, is obtained by plurality vote from the ANNs classifiers. Previous research has shown, through tests on real and simulated data sets, that bagging can give substantial gains in accuracy (Breiman, 1996). Through the years, bootstrap aggregating neural networks have been applied to areas such as pattern recognition (Rowley et al., 1996), non-linear modeling (Franke and Neumann, 2000; Zhang, 1999), and regression and classification (Kleinbaum et al., 2002; Zhou et al., 2002). It is found that models built from bootstrap aggregated neural networks are more accurate and robust than those built from single neural networks. As a result, this research applies bootstrap aggregating neural networks to model project success. The bootstrap aggregating neural networks development process is briefly described below.

A learning set L is taken from the data set P, and the remaining data form the testing set T (the difference between P and L). The replicate sub-datasets L(B), each consisting of m cases, are drawn randomly from the learning set L with replacement. A total of n sub-datasets are drawn from the bootstrap distribution approximating the distribution underlying L. These n sub-datasets are then fed into n individual ANNs models as training data. The aggregated prediction outcome for these n models is obtained by plurality voting, as illustrated in Fig. 2. Finally, the overall prediction accuracy of the bootstrap aggregating model is examined using the testing dataset T.

Fig. 2. Bootstrap aggregating neural networks.

2.2. Adaptive boosting neural networks

Introduced by Freund and Schapire (1997), the AdaBoost (Adaptive Boosting) algorithm incorporates the ensemble concept that the final classifier is aggregated from multiple classifiers by voting, which is similar to Bagging. Nevertheless, [...] weights are adjusted after each round of training. Weights of incorrectly classified examples are increased so that the classifier is forced to focus on the hard (or misclassified) examples in the training set in the following round of training. The final or combined classifier is a weighted majority vote of the multiple classifiers, where classifiers with higher accuracy are assigned higher weights in the vote (Schapire, 2002). It is shown by demonstration in real-world applications that AdaBoost can
