Forecasting of Suspended Sediment in Rivers Using Artificial Neural Networks Approach

Suspended sediment estimation is important for water resources management and water quality problems. In this article, artificial neural network (ANN) and M5 tree (M5T) approaches and statistical approaches such as multiple linear regression (MLR) and sediment rating curves (SRC) are used to estimate daily suspended sediment concentration from daily water temperature and streamflow in a river. The daily data were measured at the Iowa station in the US. The prediction approaches are compared with each other according to three statistical criteria, namely mean square error (MSE), mean absolute error (MAE) and correlation coefficient (R). When the results are compared, the ANN approach forecasts suspended sediment better than the other estimation methods.


I. INTRODUCTION
Daily sediment estimation is important for the protection of water resources. Measuring the sediment load of rivers is expensive and time consuming. River flows are measured at field stations, but there are not enough measurements of suspended sediment. In recent years, sediment estimation studies have developed sediment rating curves (SRC), regression methods and artificial intelligence techniques to simulate the process with limited knowledge of the physics. In most rivers, sediments are mainly transported as suspended sediment load [1]. Many models have been proposed to simulate this phenomenon. However, traditional sediment rating curves are not able to provide sufficiently accurate results. A sediment rating curve shows the relation between the sediment and the river discharge. Such a relationship is usually established by regression analysis, and the curves are generally expressed in the form of a power equation. McBean and Nassri [2] examined suspended sediment rating curves and showed that the practice of using sediment load versus discharge is misleading, since the goodness of fit implied by this relation is spurious.
In recent years, artificial intelligence approaches based on learning algorithms, such as artificial neural networks (ANN), adaptive neuro-fuzzy (NF) methods and support vector machines (SVM), have been widely used in water resources management and hydrological projects [3,4,5,6,7,8,9,10,11,12]. Mustafa et al. [13] used a multilayer perceptron feed-forward neural network with different algorithms to predict the suspended sediment discharge of a river in Peninsular Malaysia. Demirci and Baltaci [14] investigated the performance of sediment rating curves (SRC), multiple linear regression (MLR) and fuzzy logic (FL) for suspended sediment prediction. Afan et al. [15] used feed-forward neural network and radial basis function methods for sediment estimation. Buyukyildiz and Kumcu [16] studied the viability of artificial intelligence techniques for predicting the sediment load gauged at a station in Turkey. They analyzed artificial intelligence methods such as support vector machine (SVM), artificial neural network (ANN) and adaptive neural fuzzy inference system (ANFIS). According to their model results, SVM, ANN and ANFIS performed well in the test phase. Nivesh and Kumar [17] investigated the performance evaluation and validation of artificial neural network (ANN) and regression models for predicting sediment load from the Vamsadhara river basin in south India.

II. APPROACHES
In this paper, the SRC, MLR, ANN and M5 tree modeling approaches are used to forecast the sediment load and their modeling performances are compared. To forecast sediment concentration, the daily streamflow, water temperature and suspended sediment time series data from one station in the USA are used.

Sediment Rating Curve (SRC)
A sediment rating curve (SRC) relates the suspended sediment concentration in a river to the stream discharge. It generally represents a functional relationship of the form

$$S = a\,Q^{b} \tag{1}$$

in which Q is the stream discharge (m^3/s) and S is the suspended sediment concentration (mg/l). The constants a and b are determined by a linear regression between log S and log Q.
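The log-log fit described above can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: the function names `fit_src` and `predict_src` are hypothetical, base-10 logarithms are an assumption (any base gives the same a and b), and the demonstration values are synthetic.

```python
import math

def fit_src(discharge, sediment):
    """Fit S = a * Q**b by ordinary least squares on (log Q, log S)."""
    xs = [math.log10(q) for q in discharge]
    ys = [math.log10(s) for s in sediment]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    b = sxy / sxx                      # slope of the log-log line
    a = 10 ** (y_mean - b * x_mean)    # intercept, transformed back from log space
    return a, b

def predict_src(a, b, discharge):
    """Evaluate the fitted rating curve for a list of discharges."""
    return [a * q ** b for q in discharge]

# Synthetic values generated from S = 2 * Q**1.5, for illustration only
q_demo = [1.0, 4.0, 9.0, 16.0]
s_demo = [2.0 * q ** 1.5 for q in q_demo]
a_demo, b_demo = fit_src(q_demo, s_demo)
```

Because the regression is carried out in log space, zero or negative discharges and concentrations would have to be excluded before fitting.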

Multiple Linear Regression (MLR)
Multiple linear regression (MLR) determines the relationship between two or more explanatory variables and a response variable by fitting a linear equation to the measured data. If the dependent variable y is assumed to be affected by n independent variables x1, x2, ..., xn, the MLR equation is

$$y = b_0 + b_1 x_1 + b_2 x_2 + b_3 x_3 + \dots + b_n x_n$$

In the multiple linear regression method, the regression coefficients b0, b1, b2, b3, ..., bn are determined statistically. In the equations for the regression coefficients, $\bar{x}$ denotes the mean of the corresponding variable.
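One common way to determine the regression coefficients is ordinary least squares, solved here through the normal equations. The sketch below assumes that estimator (the text only says the coefficients are determined statistically); `fit_mlr` and `predict_mlr` are hypothetical names and the data are synthetic.

```python
def fit_mlr(rows, targets):
    """Return [b0, b1, ..., bn] for y = b0 + b1*x1 + ... + bn*xn (least squares)."""
    design = [[1.0] + list(r) for r in rows]   # leading 1.0 carries the intercept b0
    k = len(design[0])
    # Normal equations (X^T X) b = X^T y, stored as an augmented matrix
    aug = [
        [sum(row[i] * row[j] for row in design) for j in range(k)]
        + [sum(row[i] * y for row, y in zip(design, targets))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]

def predict_mlr(coeffs, row):
    """Evaluate the fitted linear equation for one set of inputs."""
    return coeffs[0] + sum(b * x for b, x in zip(coeffs[1:], row))

# Synthetic data generated from y = 1 + 2*x1 + 3*x2, for illustration only
x_demo = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 7)]
y_demo = [1 + 2 * x1 + 3 * x2 for x1, x2 in x_demo]
coeffs_demo = fit_mlr(x_demo, y_demo)
```

If two inputs are exactly collinear the normal equations become singular and the division by the pivot fails, so such inputs would need to be dropped first.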

M5 Tree (M5T)
The M5 approach was introduced by Quinlan [18]. M5 is a system that builds tree-based, piecewise linear models. The model involves a classification step that generates a decision tree. Model tree construction proceeds in stages. The first stage uses a splitting criterion to build a decision tree. The splitting criterion of the M5 algorithm treats the standard deviation of the class values that reach a node as a measure of the error at that node, and computes the expected reduction in this error for a test on each attribute. The standard deviation reduction (Δ) is given by

$$\Delta = sd(T) - \sum_{i} \frac{|T_i|}{|T|}\, sd(T_i)$$

where sd denotes the standard deviation, T is the set of instances that reach the node, and Ti is the subset of instances that have the ith outcome of the potential test [19]. After all possible tests have been examined, M5 selects the test that maximizes this expected error reduction. Readers who want to learn more about the M5 model tree can consult Quinlan [18].
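The splitting criterion can be illustrated for a single numeric attribute. This is a simplified sketch, not a full M5 implementation: it assumes binary threshold tests and the population standard deviation, and the names `sd_reduction` and `best_split` are hypothetical.

```python
import statistics

def sd_reduction(parent, subsets):
    """Standard deviation of the parent minus the size-weighted sd of the subsets."""
    total = len(parent)
    weighted = sum(len(s) / total * statistics.pstdev(s) for s in subsets)
    return statistics.pstdev(parent) - weighted

def best_split(xs, ys):
    """Try every midpoint threshold on one attribute; return (threshold, reduction)."""
    pairs = sorted(zip(xs, ys))
    best = (None, 0.0)
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # identical attribute values cannot be separated
        threshold = (pairs[i][0] + pairs[i - 1][0]) / 2
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        gain = sd_reduction(ys, [left, right])
        if gain > best[1]:
            best = (threshold, gain)
    return best

# Synthetic attribute and target values with an obvious break, for illustration only
x_demo = [1, 2, 3, 10, 11, 12]
y_demo = [5, 6, 5, 20, 21, 20]
threshold_demo, gain_demo = best_split(x_demo, y_demo)
```

In a complete model tree this search would be repeated over every attribute and applied recursively to the resulting subsets, with linear models fitted at the leaves.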

Artificial Neural Networks (ANN)
Artificial neural networks (ANN) are computing systems, inspired by the properties of the human brain, that can derive, create and discover new information through learning and can operate without outside help. They are the result of mathematically modeling the learning process. The most widely used ANN method is the feed-forward back-propagation approach, which operates on the principle of propagating errors backwards. In this model, an artificial neural network consists of the input layer, the variable weight factors, the summation function, the activation function and the output layer. An artificial neural network structure with three layers (input, hidden and output) is given in Fig. 1.

Fig. 1. ANN structures with three layers (input, hidden and output layers) used in suspended sediment estimation
In Fig. 1, Wij are the connection weights between the input and hidden layers, and Wjk are the connection weights between the hidden and output layers. These weights are coefficients that express the effect of the incoming data on the processing element. They initially receive random values and are updated continually during training by comparing the actual output values with the estimated outputs. The errors are propagated backwards until the connection weights reach the values that minimize the error. Each cell in the hidden and output layers in Fig. 1 passes the data from the previous layer to the summation function (net). This function calculates the net input to the cell as

$$net_j = \sum_{i=1}^{N} W_{ij} X_i + b_j \tag{6}$$

In equation (6), N is the size of the input vector, bj is the bias term, Wij is the set of weights between layers i and j, and Xi is the input from the i-th layer for the p-th instance. The activation function generates the output f(net) by passing the net value through a nonlinear transfer function in each cell of the j and k layers. One of the most commonly used transfer functions is the sigmoid function, which is used in this study and is expressed as

$$f(net) = \frac{1}{1 + e^{-net}} \tag{7}$$
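The net input and sigmoid activation described above can be traced through one forward pass of a three-layer network. The sketch below is illustrative only: the layer sizes and weights are arbitrary, the function names are hypothetical, and applying the sigmoid at the output layer as well is an assumption. Training by back-propagation is not shown.

```python
import math

def sigmoid(net):
    """Sigmoid activation f(net) = 1 / (1 + exp(-net))."""
    return 1.0 / (1.0 + math.exp(-net))

def layer_forward(inputs, weights, biases):
    """weights[i][j] links input i to cell j; returns f(net_j) for every cell j."""
    outputs = []
    for j, bias in enumerate(biases):
        net = sum(weights[i][j] * x for i, x in enumerate(inputs)) + bias
        outputs.append(sigmoid(net))
    return outputs

def network_forward(inputs, w_ij, b_j, w_jk, b_k):
    """Input layer -> hidden layer (W_ij, b_j) -> output layer (W_jk, b_k)."""
    hidden = layer_forward(inputs, w_ij, b_j)
    return layer_forward(hidden, w_jk, b_k)

# Two inputs, three hidden cells, one output; all-zero weights for illustration only
w_ij_demo = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
b_j_demo = [0.0, 0.0, 0.0]
w_jk_demo = [[0.0], [0.0], [0.0]]
b_k_demo = [0.0]
out_demo = network_forward([0.2, 0.7], w_ij_demo, b_j_demo, w_jk_demo, b_k_demo)
```

With all weights and biases at zero every net input is zero, so each cell outputs sigmoid(0) = 0.5. In practice the weights start from random values and are adjusted during training.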

International Journal of Advanced Engineering Research and Science (IJAERS)

III. APPROACH RESULTS
In this study, the viability of all approaches for sediment prediction in a river was investigated. Measurement data from the United States Geological Survey (USGS) were used. A total of 700 daily field measurements were used for estimation. The data were divided into two parts, training and test data: 70% of the data were used for training and the remaining 30% for testing the models.
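A 70/30 split of this kind can be sketched as follows. The text does not say whether the records were split chronologically or at random; the sketch assumes a chronological split, which is a common choice for daily time series, and the name `split_train_test` is hypothetical.

```python
def split_train_test(records, train_fraction=0.7):
    """Keep the first part of the series for training and the rest for testing."""
    cut = int(round(len(records) * train_fraction))
    return records[:cut], records[cut:]

# 700 placeholder records standing in for the daily measurements
records_demo = list(range(700))
train_demo, test_demo = split_train_test(records_demo)
```

With 700 daily records this gives 490 training and 210 test records.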

Error Analysis
For each model, statistical parameters, namely the mean square error (MSE), mean absolute error (MAE) and correlation coefficient (R) between the model predictions and the observations, are calculated. These parameters are used to compare the performance of the model estimates against the observed values. The MSE and MAE are defined as

$$MSE = \frac{1}{N}\sum_{i=1}^{N}\left(Y_i^{\text{observed}} - Y_i^{\text{estimated}}\right)^{2}$$

$$MAE = \frac{1}{N}\sum_{i=1}^{N}\left|Y_i^{\text{observed}} - Y_i^{\text{estimated}}\right|$$

where N is the number of data used and Yi is the sediment concentration, with the superscripts marking the observed and estimated values.
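The three criteria can be computed directly from paired lists of observed and estimated concentrations. The sketch below uses the standard definitions of MSE and MAE and assumes that R is the Pearson correlation coefficient; the function names and demonstration values are illustrative.

```python
import math

def mse(observed, estimated):
    """Mean of the squared differences between observed and estimated values."""
    return sum((o - e) ** 2 for o, e in zip(observed, estimated)) / len(observed)

def mae(observed, estimated):
    """Mean of the absolute differences between observed and estimated values."""
    return sum(abs(o - e) for o, e in zip(observed, estimated)) / len(observed)

def correlation(observed, estimated):
    """Pearson correlation coefficient between observed and estimated values."""
    n = len(observed)
    o_mean = sum(observed) / n
    e_mean = sum(estimated) / n
    s_oe = sum((o - o_mean) * (e - e_mean) for o, e in zip(observed, estimated))
    s_oo = sum((o - o_mean) ** 2 for o in observed)
    s_ee = sum((e - e_mean) ** 2 for e in estimated)
    return s_oe / math.sqrt(s_oo * s_ee)

# Synthetic values for illustration only: every estimate is twice the observation
obs_demo = [1.0, 2.0, 3.0, 4.0]
est_demo = [2.0, 4.0, 6.0, 8.0]
mse_demo = mse(obs_demo, est_demo)
mae_demo = mae(obs_demo, est_demo)
r_demo = correlation(obs_demo, est_demo)
```

The example also shows why R alone is not sufficient: the estimates are perfectly correlated with the observations even though every value is off by a factor of two, which only MSE and MAE reveal.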

Fig. 2. Sediment Rating Curve graph
For the SRC model, the streamflow (Q) was used as the input. The conventional SRC fitted between the streamflow and sediment concentration data is shown in Fig. 2. The distribution and scatter graphs of the SRC results for the test data are shown in Fig. 3 and Fig. 4.

Fig. 3. Measurement and SRC distribution graph for test data
When the distribution graph for the test data in Fig. 3 is analyzed, the sediment concentrations estimated by the SRC differ from the actual values. The correlation coefficient was obtained as R = 0.5848. The values of the sediment rating curve are seen to be far from the actual values.

Multiple Linear Regression (MLR) Results
For multiple linear regression (MLR), the average water temperature (Tmean), the streamflow (Q), the lagged streamflow (Qt-1, at time t-1) and the lagged sediment concentration (St-1, at time t-1) were used as input values.

[Vol-4, Issue-12, Dec-2017] https://dx.doi.org/10.22161/ijaers.4.12.14 ISSN: 2349-6495(P) | 2456-1908(O)

Fig. 10. Measurement and ANN scatter graph for test data
The correlation coefficient R = 0.8908 was obtained for the test data with the ANN approach. The ANN predictions in the test phase show good results, and in this study they are slightly better than those of the MLR and M5T models for the observed daily real-time sediment concentrations. In a general evaluation, the ANN models have low error rates and a high correlation.
Fig. 6, Fig. 8 and Fig. 10 provide the scatter plots of the observed and predicted sediment amounts for the MLR, M5T and ANN test periods. As seen from Table 1, the MLR, M5T and ANN approaches have the smallest MSE and MAE and the highest R for the four-input combination during the test period. However, the ANN approach is slightly better than the MLR and M5T models for forecasting daily real-time sediment concentrations.

IV. CONCLUSIONS
In this study, the abilities of artificial neural network (ANN) and M5 tree (M5T) models and of statistical approaches such as multiple linear regression (MLR) and sediment rating curves (SRC) to estimate sediment concentration were investigated. Average water temperature, daily real-time flow rate and sediment concentration data from the US were used. When the results are evaluated, the MLR, M5T and ANN approaches have the smallest MSE and MAE and the highest R, with the ANN approach slightly better than the MLR and M5T models for forecasting daily real-time sediment concentrations. The worst results in all criteria were obtained with the classical sediment rating curve (SRC) method. Although all of the present modeling approaches are helpful and important in water resources management studies, this paper shows that the ANN can be a viable alternative for river sediment prediction in future research. ANN approach applications developed for a specific region can be used as a very useful method for predicting