Construction Productivity Estimation Model Using Artificial Neural Network for Foundations Works in Gaza Strip Construction Sites

Estimating construction labor productivity while considering the effect of multiple factors is important for construction planning, scheduling, and estimating. In planning and scheduling, it is important to maximize labor productivity and forecast activity durations to achieve lower labor cost and shorter project duration. In estimating, it is important to predict labor costs. The aim of this study is to develop a new technique for estimating the labor productivity rate for foundation works (m3/day) for building projects in Gaza Strip, by developing a model that helps the parties involved in construction projects (owners, contractors, and others), especially contracting companies, to estimate this rate. The model is built on Artificial Neural Networks (ANN). To build it, quantitative and qualitative techniques were used to identify the significant parameters for estimating the labor productivity rate for foundation works. The data used in model development were collected with a questionnaire survey that gathered actual data from contractors for many projects in Gaza Strip. These questionnaires provided 111 examples. The ANN model considered 16 significant parameters as independent input variables affecting one dependent output variable, "labor productivity rate for foundation works (m3/day)". Neurosolution software was used to train the models. Many models were built, but the General Feed Forward (GFF) model was found to be the best; it is structured with one input layer of 16 input neurons and one hidden layer of 22 neurons. The accuracy performance of the adopted model was 98%, and no significant difference was discerned between the estimated output and the actual productivity value. Sensitivity analysis was performed using the Neurosolution tool to study the influence of the adopted factors on labor productivity. The results were in general logical: "Footings Volume" had the highest influence, while the unexpected result was that the "Payment delay" factor had no effect on the productivity of foundation works.


IV. FACTORS AFFECTING CONSTRUCTION PRODUCTIVITY ESTIMATION
One of the most significant steps in building the neural network model is identifying the factors that have a real impact on productivity estimation for foundation works. Given the importance of selecting these factors, several techniques were carefully adopted to identify them for Gaza Strip building projects, namely reviewing the literature and applying the Delphi technique through expert interviews.

V. DELPHI TECHNIQUE
A further technique was used to determine the factors that affect productivity estimation for foundation works. It relies on the Delphi technique, which aims to achieve a convergence of opinion on these factors. The technique provides feedback to experts in the form of distributions of their opinions and reasons. The experts are then asked to revise their opinions in light of the information contained in the feedback. This sequence of questionnaire and revision is repeated until no further significant changes of opinion are expected [2]. The Delphi process requires several rounds. The first round begins with an open-ended questionnaire, which serves as the cornerstone for soliciting specific information about a content area from the Delphi subjects. After receiving the responses, the researcher converts the collected information into a well-structured questionnaire to be used as the survey instrument for the second round of data collection. In the second round, each Delphi participant receives a second questionnaire and is asked to review the items summarized by the investigators from the information provided in the first round; areas of agreement and disagreement are identified in this round. In the third round, the Delphi panelists are asked to revise their judgments or to specify their reasons for remaining outside the consensus. In the fourth and often final round, the list of remaining items, their ratings, minority opinions, and items achieving consensus are distributed to the panelists. This round provides a final opportunity for participants to revise their judgments. Accordingly, the number of Delphi iterations depends largely on the degree of consensus sought by the investigators and can vary from three to five [3].
Five experts in the construction field were selected to reach a consensus on the key parameters. The results obtained with these five experts were close to the questionnaire results, and only three rounds were conducted because of the high degree of consensus. The experts proposed excluding retaining walls and curtain walls from the factors because of their rarity in Gaza's projects.

VI. STRUCTURE DESIGN
The choice of ANN architecture depends on a number of factors such as the nature of the problem, data characteristics and complexity, the number of sample data, etc. [4]. With the 16 inputs identified, the output describing the estimated productivity for foundation works (m3/day) can be modeled in different ways. The choice of artificial neural network in this study is based on prediction using feedforward neural network architectures and the backpropagation learning technique. The design of the neural network architecture is a complex and dynamic process that requires determining the internal structure and rules (i.e., the number of hidden layers and neurons, the weight-update method, and the type of activation function) [5]. A common recommendation is to start with a single hidden layer. In fact, unless the researcher is sure that the data are not linearly separable, they may want to start without any hidden layers, because networks train progressively more slowly as layers are added [6]. Based on the literature review, the neural network type deemed suitable for productivity estimation is the feedforward pattern recognition type (backpropagation), which suits the desired interpolative and predictive performance of the model. Two kinds of feedforward networks were chosen to build the models: multilayer perceptron and general feed forward. The final ANN architecture was chosen after several trials.

VII. MODEL IMPLEMENTATION
Once there is a clear idea about feasible structures and the information that needs to be elicited, the implementation phase starts with knowledge acquisition and data preparation [7]. The flow chart for the model structure is shown in Figure 1.

Fig.1: Model implementation steps flowchart

7.1. Data Encoding
Artificial neural networks only deal with numeric input data. Therefore, the raw data must often be converted from their external form to numeric form [8]. This may be challenging because there are many ways to do it and, unfortunately, some are better than others for neural network learning [6]. In this research, the data are textual and numeric, so they are encoded as numeric (integer) values only, according to Table 1.
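The encoding step can be illustrated with a short Python sketch that maps textual answers to integer codes. The category labels, code values, and field names below are hypothetical placeholders chosen for illustration; the codes actually used in the study are those of Table 1.

```python
# Hypothetical code tables; the real mapping is the one defined in Table 1.
FOOTING_TYPE_CODES = {"isolated": 1, "strip": 2, "mat": 3}
WEATHER_CODES = {"sunny": 1, "rainy": 2}


def encode_record(record):
    """Return a purely numeric copy of one questionnaire record."""
    encoded = dict(record)
    encoded["footings_type"] = FOOTING_TYPE_CODES[record["footings_type"]]
    encoded["weather"] = WEATHER_CODES[record["weather"]]
    return encoded


# Illustrative record, not taken from the survey data.
raw = {"footings_type": "strip", "weather": "sunny", "footings_volume_m3": 120.0}
encoded = encode_record(raw)
print(encoded)
```

A lookup table of this kind raises an error for an unknown label, which makes unexpected questionnaire answers visible before training rather than silently encoding them.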

Data Organization
The first step in implementing the neural network model in the NeuroSolution application is to organize the Neurosolution Excel spreadsheet as shown in Figure 2, which is a snapshot of the Excel program showing part of the data matrix. The next step is to specify the input factors that have already been encoded. These consist of 16 factors: area of the building (m2), footings type, footings volume, method of casting concrete, number of labor, material shortages, tool and equipment shortages and efficiency, labor experience, duration of formwork and casting footings, working hours per day at site, weather, complexity due to steel bars, drawings and specifications alteration during execution, ease of access to the project location, lack of labor surveillance, and payment delay. The desired parameter (output) is labor productivity (m3/day).

International Journal of Advanced Engineering Research and Science (IJAERS)

Data Set
The available data were divided into three sets, namely a training set, a cross-validation set, and a test set [9]. The training and cross-validation sets are used in learning: the training set is used to modify the network weights so as to minimize the network error, and the cross-validation set is used to monitor this error during the training process. The test set does not enter the training process and has no effect on it; it is used for measuring the generalization ability of the network and evaluating network performance [10].
In the present study, the 111 available exemplars were divided randomly into three sets in the following ratio:
-Training set (includes 83 exemplars ≈ 75%).
-Cross-validation set (includes 15 exemplars ≈ 14%).
-Test set (includes 13 exemplars ≈ 11%).
See Figures 3 and 4, which show how the data were distributed into sets and how each exemplar was assigned to its corresponding set.
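A random split with the set sizes reported in this study (83 training, 15 cross-validation, and 13 test exemplars out of 111) can be sketched in Python as follows. The function name and the fixed seed are assumptions made for illustration; the study performed the split inside the Neurosolution spreadsheet.

```python
import random


def split_exemplars(n_total=111, n_train=83, n_cv=15, seed=0):
    """Randomly assign exemplar indices to training, cross-validation and test sets."""
    indices = list(range(n_total))
    random.Random(seed).shuffle(indices)  # fixed seed only for repeatability
    train = indices[:n_train]
    cv = indices[n_train:n_train + n_cv]
    test = indices[n_train + n_cv:]  # whatever remains goes to the test set
    return train, cv, test


train, cv, test = split_exemplars()
for name, subset in (("training", train), ("cross-validation", cv), ("test", test)):
    print(f"{name}: {len(subset)} exemplars")
```

Shuffling once and slicing guarantees that the three sets are disjoint and together cover every exemplar.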

Building Network
Once all data were prepared, the next step was to create the initial network by selecting the network type, the number of hidden layers and nodes, the transfer function, the learning rule, and the number of epochs and runs. Before the model was ready, the supervised learning control was checked to specify the maximum number of epochs and the termination limits. Figure 5 presents the initial Multilayer Perceptron (MLP) network, which consists of one input layer, one hidden layer, and one output layer.

Fig.5: Multilayer Perceptron (MLP) network
Before the training phase started, the training data were normalized by the Neurosolution program to improve the performance of the trained networks. As shown in Figure 6, the normalized values range from 0 to +0.9.
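The exact formula applied by Neurosolution is not given here. As a minimal sketch, assuming ordinary linear min-max scaling, each input column could be mapped onto the 0 to 0.9 range as follows; the function name and the sample values are illustrative only.

```python
def scale_column(values, lower=0.0, upper=0.9):
    """Linearly rescale numbers so the minimum maps to lower and the maximum to upper."""
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        # A constant column carries no information; map it to the lower bound.
        return [lower for _ in values]
    amp = (upper - lower) / (v_max - v_min)
    return [lower + amp * (v - v_min) for v in values]


# Illustrative footing volumes in m3, not taken from the survey data.
volumes = [20.0, 35.0, 80.0, 140.0]
scaled = scale_column(volumes)
print(scaled)
```

The minimum and maximum should be taken from the training data so that the same scaling can later be applied to the cross-validation and test exemplars.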

Model Training
The objective of training a neural network is to obtain a network that performs best on unseen data, by training many networks on a training set and comparing their errors on the validation set [11]. Therefore, networks with different settings (number of hidden layers, number of hidden nodes, transfer functions, and learning rules) were trained multiple times to produce the best weights for the model. As a preliminary step to filter the preferable neural network type, a test process was applied to most of the networks available in the application. Two types, Multilayer Perceptron (MLP) and General Feed Forward (GFF), were chosen as the focus of the subsequent training process because of their good initial results. It is worth mentioning that earlier models applied to estimating the productivity of foundation works with neural networks used these two types of networks because they gave the best outcomes. The following chart illustrates the procedure of the training process used to obtain the best model, with the best weights and the minimum error percentage.

Fig.7: Procedures of the training process
The chart shows the procedure of model training, which starts with selecting the neural network type, either MLP or GFF. For each type, five learning rules were used, and with every learning rule six transfer functions were applied; one hidden layer was then used, with the number of hidden nodes increased from 1 up to 40. In other words, a thousand trials, each covering the 40 hidden-node settings, were executed to obtain the best neural network model. Figure 8 shows the training variables for one trial, which comprise the number of epochs, runs, and hidden nodes, and other training options. Ten runs of 3000 epochs each were applied, where a run is a complete presentation of 3000 epochs and an epoch is one complete presentation of all of the data [6].
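The search space described above can be enumerated with a short Python sketch. It assumes that every network type is crossed with every learning rule, transfer function, and hidden-node count, which the text does not state explicitly. The text gives only the number of learning rules and transfer functions, so the labels below are placeholders. How the authors counted trials is not stated, so the total printed here is only the size of the fully crossed grid under that assumption.

```python
from itertools import product

NETWORK_TYPES = ["MLP", "GFF"]
# Only the counts are given in the text; these labels are placeholders.
LEARNING_RULES = [f"rule_{i}" for i in range(1, 6)]          # five learning rules
TRANSFER_FUNCTIONS = [f"transfer_{i}" for i in range(1, 7)]  # six transfer functions
HIDDEN_NODES = range(1, 41)                                  # 1 to 40 nodes, one hidden layer
RUNS, EPOCHS = 10, 3000                                      # ten runs of 3000 epochs each

candidates = list(product(NETWORK_TYPES, LEARNING_RULES, TRANSFER_FUNCTIONS, HIDDEN_NODES))
print(len(candidates), "candidate networks in the fully crossed grid")
print("first candidate:", candidates[0])
print("epochs per candidate:", RUNS * EPOCHS)
```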

Fig.8: Training options in Neurosolution application
In each run, new weights were applied in the first epoch and were then adjusted in the following epochs to minimize the percentage of error. To avoid overtraining the network, the cross-validation option was selected; it computes the error on the cross-validation set while the network is being trained on the training set. The model was started with one hidden layer and

Model Results
As mentioned above, the purpose of the testing phase of the ANN model is to ensure, through a process of trial and error, that the developed model was successfully trained and that generalization is adequately achieved. The best model, which provided the most accurate productivity estimate without being overly complex, was a Multilayer Perceptron (MLP) structured with one input layer of 16 input neurons, one hidden layer of 22 hidden neurons, and an output layer with one output neuron (labor productivity, m3/day). However, the main downside of the Multilayer Perceptron structure is that it required more nodes and more training epochs to achieve the desired results. Figure 9 summarizes the architecture of the model in terms of the number of hidden layers and nodes, the type of network, and the transfer function.
Fig.9: Architecture of the model
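To make the 16-22-1 topology concrete, the following Python sketch builds a network of that shape and runs one forward pass. It is not the trained Neurosolution model. The weights are random, and the tanh hidden transfer function and linear output neuron are assumptions, because the transfer function is not stated in the text.

```python
import math
import random

N_INPUTS, N_HIDDEN, N_OUTPUTS = 16, 22, 1


def init_layer(n_in, n_out, rng):
    """Return (weights, biases) filled with small random values."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [rng.uniform(-0.5, 0.5) for _ in range(n_out)]
    return weights, biases


def forward(x, hidden, output):
    """One forward pass: tanh hidden layer (assumed), linear output neuron (assumed)."""
    w_h, b_h = hidden
    w_o, b_o = output
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b) for row, b in zip(w_h, b_h)]
    return [sum(w * hi for w, hi in zip(row, h)) + b for row, b in zip(w_o, b_o)]


rng = random.Random(0)
hidden = init_layer(N_INPUTS, N_HIDDEN, rng)
output = init_layer(N_HIDDEN, N_OUTPUTS, rng)

# Weights plus biases of the hidden and output layers.
n_params = N_INPUTS * N_HIDDEN + N_HIDDEN + N_HIDDEN * N_OUTPUTS + N_OUTPUTS
x = [0.45] * N_INPUTS  # one illustrative, already-normalized input vector
y = forward(x, hidden, output)
print("trainable parameters:", n_params)
print("untrained output:", y)
```

With 111 exemplars in total, a network of this size has more adjustable parameters than training cases, which is one reason the cross-validation set matters for detecting overtraining.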

Results Analysis
The testing dataset was used to assess generalization, that is, the ability of the model to produce accurate output for unseen examples. Data from 15 cases were used for testing purposes. The Neurosolution test tool was used to test the adopted model with the adopted weights. Table 2 presents the results of these 15 cases, comparing the real productivity (m3/day) of the tested cases with the productivity estimated by the neural network model, together with the absolute error and the absolute percentage error.
-Mean Absolute Error: The mean absolute error shown in Table 3 equals 0.743 m3/day, which is largely acceptable for the Gaza Strip construction industry. However, it is not a significant indicator of model performance because it does not account for project size: the same error may be negligible if the project is large and may be a large margin of error if the project is small.
-Mean Absolute Percentage Error: The mean absolute percentage error (MAPE) of the model is calculated from the test cases shown in Table 2 and equals 2%. This result can be expressed in another form as the accuracy performance (AP), which according to Wilmot and Mei (2005) is defined as (100 − MAPE)%. Thus AP = 100% − 2% = 98%, which is the accuracy of the adopted model for estimating productivity. This is a good result, especially as the construction industry of Gaza Strip faces many obstacles [12].
-Correlation Coefficient (R): Regression analysis was used to ascertain the relationship between the estimated productivity and the actual productivity. The results of the linear regression are given in Table 3. The correlation coefficient (R) is 0.997, indicating a good linear correlation between the actual productivity and the productivity estimated by the neural network.
Figure 11 compares the actual productivity with the estimated productivity for the cross-validation (C.V.) dataset. There is only a slight difference between the two lines.
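The performance measures discussed above can be computed as in the following Python sketch. The four data points are illustrative values, not the cases of Table 2, and the function names are chosen for this example only. The accuracy performance follows the definition AP = (100 − MAPE)% quoted above.

```python
import math


def mean_absolute_error(actual, estimated):
    """Average absolute difference, in the units of the data (m3/day)."""
    return sum(abs(a - e) for a, e in zip(actual, estimated)) / len(actual)


def mean_absolute_percentage_error(actual, estimated):
    """Average absolute difference relative to the actual value, in percent."""
    return 100.0 * sum(abs(a - e) / abs(a) for a, e in zip(actual, estimated)) / len(actual)


def pearson_r(xs, ys):
    """Pearson linear correlation coefficient between two equal-length lists."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Illustrative values in m3/day, not the cases of Table 2.
actual = [20.0, 40.0, 50.0, 80.0]
estimated = [21.0, 38.0, 51.0, 80.0]

mae = mean_absolute_error(actual, estimated)
mape = mean_absolute_percentage_error(actual, estimated)
ap = 100.0 - mape  # accuracy performance, (100 - MAPE)%
r = pearson_r(actual, estimated)
print(f"MAE = {mae:.3f} m3/day, MAPE = {mape:.2f}%, AP = {ap:.2f}%, R = {r:.3f}")
```

Because MAPE divides each error by the actual value, it is undefined for a case whose actual productivity is zero; such cases would have to be excluded or handled separately.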

Sensitivity Analysis
Sensitivity analysis is a method that reveals the cause-and-effect relationship between the input and output variables of the network. Network learning is disabled during this operation so that the network weights are not affected. The basic idea is that the inputs to the network are shifted slightly and the corresponding change in the output is reported either as a percentage or as a raw difference [6]. Tables 4 and 5 show the sensitivity analysis of the GFF model, which includes 16 graphs. Sensitivity analysis was carried out with the Neurosolution tool to evaluate the influence of each input parameter on the output variable and thus to understand the significance of the input parameters for the model output. The sensitivity analysis for the best GFF model was performed, and the result is summarized in Figure 12.
Fig.12: Sensitivity about the mean
Figure 12 shows that the "Footings Volume" parameter has the greatest effect on the productivity of foundation works; its influence exceeds the impact of the other factors combined. By contrast, Mady (2013) found that the number-of-labor factor had the greatest effect on labor productivity for casting concrete slabs; that study considered 11 factors affecting labor productivity for casting concrete slabs [13]. The value 8.61 for the footings volume input parameter is the standard deviation of 111 output values. These output values are recorded after training, with the best weights fixed, from a data matrix in which every input is fixed at its mean value in each row except the footings volume, which is varied from (mean − standard deviation) to (mean + standard deviation). The second most influential parameter is "Duration of formwork and casting Footings", which has a great effect on productivity. The results also show that the "Payment delay" parameter has no effect on the productivity of foundation works, which is unexpected.
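The "sensitivity about the mean" procedure described above can be sketched in Python as follows. This is an illustration of the idea rather than the Neurosolution implementation. The toy model stands in for the trained network, and the number of sweep steps and the use of the population standard deviation are assumptions.

```python
import statistics


def sensitivity_about_mean(model, data, column, steps=50):
    """Standard deviation of the model output while one input sweeps mean +/- one std.

    All other inputs are held at their column means.
    """
    n_cols = len(data[0])
    means = [statistics.mean(row[j] for row in data) for j in range(n_cols)]
    std = statistics.pstdev(row[column] for row in data)
    outputs = []
    for k in range(steps):
        x = list(means)
        # Move linearly from (mean - std) to (mean + std).
        x[column] = means[column] - std + 2.0 * std * k / (steps - 1)
        outputs.append(model(x))
    return statistics.pstdev(outputs)


def toy_model(x):
    """Hypothetical stand-in for the trained network; input 1 has no influence."""
    return 3.0 * x[0] + 0.0 * x[1] + 0.5 * x[2]


# Illustrative data matrix (rows = exemplars, columns = inputs).
data = [[1.0, 5.0, 2.0], [2.0, 3.0, 4.0], [3.0, 8.0, 6.0], [4.0, 1.0, 8.0]]
scores = [sensitivity_about_mean(toy_model, data, j) for j in range(3)]
print(scores)
```

An input that the model ignores, such as input 1 of the toy model, yields a sensitivity of zero, which is how a result like the one reported for "Payment delay" would appear in this kind of analysis.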

VIII. CONCLUSION
-Historical data on building projects were collected with the questionnaire. The projects were executed between 2012 and 2016 in Gaza Strip. The 111 case studies were divided randomly into three sets: a training set (83 projects, 75%), a cross-validation set (15 projects, 14%), and a testing set (13 projects, 11%).
-Developing the ANN model passed through several steps, starting with selecting the application to be used in building the model. The Neurosolution 5.07 program was selected for its efficiency in several previous studies, in addition to its ease of use and of extracting results. The data sets were encoded and entered into an MS Excel spreadsheet to start the training process for different models.
-Many models were built, but the GFF model was found to be the best; it is structured with one input layer of 16 input neurons and one hidden layer of 22 neurons.
-The accuracy performance of the adopted model was 98%, and no significant difference was discerned between the estimated output and the actual productivity value.
-To ensure the validity of the model in estimating the productivity of new projects, several statistical performance measures were computed, namely Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), Total Mean Absolute Percentage Error (Total MAPE), and Correlation Coefficient (R). The results of these performance measures were acceptable and reliable.
-Sensitivity analysis was performed using the Neurosolution tool to study the influence of the adopted factors on labor productivity. The results were in general logical: "Footings Volume" had the highest influence, while the unexpected result was that the "Payment delay" factor had no effect on the productivity of foundation works.