
Geosynthetics International

Machine-learning modelling of tensile force in anchored
geomembrane liners

K. V. N. S. Raviteja1,2, K. V. B. S. Kavya3, R. Senapati4 and K. R. Reddy5

1SIRE Research Fellow, Department of Civil, Materials, and Environmental Engineering, University of
Illinois, Chicago, IL, USA
2Assistant Professor, Department of Civil Engineering, SRM University AP, Amaravati, Guntur, India,
E-mail:
3Research Scholar, Department of Civil Engineering, SRM University AP, Amaravati, Guntur, India,
E-mail:
4Assistant Professor, Department of Computer Science and Engineering, SRM University AP, Amaravati,
Guntur, India, E-mail:
5Professor, Department of Civil, Materials, and Environmental Engineering, University of Illinois,
Chicago, IL, USA, E-mail: (corresponding author)

Received 28 October 2022, accepted 26 February 2023

ABSTRACT: Geomembrane (GM) liners anchored in the trenches of municipal solid waste (MSW)
landfills undergo pull-out failure when the applied tensile stresses exceed the ultimate strength of the
liner. The present study estimates the tensile strength of GM liner against pull-out failure from
anchorage with the help of machine-learning (ML) techniques. Five ML models, namely multilayer
perceptron (MLP), extreme gradient boosting (XGB), support vector regression (SVR), random forest
(RF) and locally weighted regression (LWR) were employed in this work. The effect of anchorage
geometry, soil density and interface friction were studied with regards to the tensile strength of the GM.
In this study, 1520 samples of soil–GM interface friction were used. The ML models were trained and
tested with 90% and 10% of data, respectively. The performance of ML models was statistically
examined using the coefficient of determination (R2), the adjusted coefficient of determination (R2adj), the mean square error (MSE) and the root mean square error (RMSE). In addition, an external validation model and K-fold cross-validation techniques were used to check the models' performance and accuracy. Among the chosen ML models, MLP was found to be superior in
accurately predicting the tensile strength of GM liner. The developed methodology is useful for tensile
strength estimation and can be beneficially employed in landfill design.

KEYWORDS: Geosynthetics, Anchorage capacity, Machine learning, Geoenvironment, Landfill

REFERENCE: Raviteja, K. V. N. S., Kavya, K. V. B. S., Senapati, R. and Reddy, K. R. (2023).
Machine-learning modelling of tensile force in anchored geomembrane liners. Geosynthetics International.
1. INTRODUCTION

Composite liner consisting of compacted clay liner (CCL) (or geosynthetic clay liner, GCL) and geomembrane (GM) is used to prevent leachate from escaping from municipal solid waste (MSW) landfills. GM is placed over CCL or GCL and overlain by a leachate drainage layer. An anchor system secures GM to avoid pull-out failure. Figure 1 shows the schematic representation of the liner and GM anchorage system in MSW landfills. Ensuring the stability and integrity of the composite liner system is crucial in landfill design. The anchor system secures GM liners in order to avoid pull-out failure caused by stresses induced by the drainage layer (Koerner et al. 1986; Sharma and Reddy 2004). It is reported that the geosynthetic interface components are highly influenced by the properties of overlying waste (Reddy et al. 2017). The conventional limit equilibrium analysis lacks the ability to determine displacement along the critical shear plane and report strain levels within the composite liner system (Reddy et al. 1996).

Anchor systems could be of different geometries (simple runout, rectangular, L-shape and V-shape) with soil backfilled in the trenches (Koerner et al. 1986). GM liners are often prone to pull-out failure along the side slopes of the landfill during installation. Figure 2 presents the pull-out force and corresponding resistance forces developed along the liner embedded in a V-shaped trench.

Figure 1. Schematic representation of GM liner anchored in a V-shaped trench

Figure 2. Anchorage showing the mobilised tension and interface frictional resistance acting along the length of GM liner

The anchorage capacity should be designed in an optimal way so that it acts rigid when the mobilised tension is low, and flexible when the mobilised tension reaches the ultimate tensile strength, to avoid tear in the GM liner. It is important to determine the tensile force (T) based on all the system variabilities (soil properties, anchorage geometry and CCL–GM interface shear characteristics) to ensure anchorage stability.

A large number of physical tests and evaluations needs to be conducted on pull-out apparatus and shear box equipment for the experimental assessment of tensile forces in the GM liner. It is recommended to conduct one test on the tensile properties of the liner for every 100 000 ft2 (TCEQ 2017) – that is, a 600-acre landfill site requires more than 3000 conformance tests to determine the tensile properties of the liner. Further, the variability associated with various design parameters of anchorage could demand repetitive testing for proper judgement (Raviteja and Basha 2021). The friction angles at the CCL–GM and sand–GM interfaces are the most critical parameters in anchor trench design. Most pull-out failures on the side slope are initiated at the soil–GM interface. Recent and past studies have shown that low frictional resistance at the interface, tensile stiffness of the liner and failure of the soil mass along preferential slip lines in granular soils are some of the major causes. Inadequate analysis of soil–geosynthetic interface characteristics would result in pull-out failure.

Koerner et al. (1986) analysed anchorage resistance by determining the pressure exerted by cover and backfill soil on the GM liner. Further, four design models were developed to address the stability and tension factors for cover soils on GM-lined slopes (Koerner and Hwu 1991). Qian et al. (2002) derived an expression for tensile force in the liners for simple, rectangular and V-shaped anchors by considering normal stress from cover soil. The anchor trench pull-out resistance is analysed and compared for four design models (Raviteja and Basha 2018). A significant variability is associated with soil–GM liner interface friction that needs to be incorporated in the design of anchor trench (Raviteja and Basha 2015). The target reliability-based design optimisation is proposed for a V-shaped anchor trench against pull-out failure (Basha and Raviteja 2016). Huang and Bathurst (2009) developed statistical bilinear and nonlinear models for predicting the pull-out capacity of geosynthetics. The cyclic interface shear properties between sandy gravel and high-density polyethylene (HDPE) GM were experimentally evaluated and further modelled through a constitutive relationship (Cen et al. 2019). Miyata et al. (2019) proposed ML regression models to predict the pull-out capacity in steel strip reinforcement. The pull-out coefficient is determined using analytical techniques that rely on soil engineering properties, namely stress-related skin friction between soil and geosynthetics (Samanta et al. 2022).

In general, artificial intelligence (AI) techniques frequently outperform traditional and deterministic solutions. AI approaches such as artificial neural networks (ANN), genetic programming (GP) and support vector


machines (SVM) are more sophisticated, resulting in wide usage for geotechnical engineering designs. Several authors have identified the importance of extensive database analysis in better predicting experimental results. Machine learning (ML)-based applications are gaining prominence in geotechnical engineering (Sharma et al. 2019; Hu and Solanki 2021; Mittal et al. 2021; Rauter and Tscchnigg 2021; Zhang et al. 2021). Chou et al. (2015) used an evolutionary metaheuristic intelligence model to estimate the tensile loads in geosynthetic-reinforced soil structures. The applicability is verified for five different ML models in determining the peak shear strength of soil–geocomposite drainage layer interfaces (Chao et al. 2021). ANN models successfully estimate anticipated settlement in geosynthetic-reinforced soil foundations (Raja and Shukla 2021). It is reported that the pull-out coefficient in geogrids could be accurately predicted using random forest regression (RFR) (Pant and Ramana 2022). Ghani et al. (2021) studied the response of strip footing resting on prestressed geotextile-reinforced industrial waste using ANN and extreme ML. The complex heterogeneous nature of the soil properties and the peculiar interaction with various geosynthetic materials can be simulated and well analysed using ML models. Chao et al. (2023) experimentally validated the peak shear strength of clay–GM interfaces predicted using AI algorithms.

This paper used five different ML models to build an anchorage model for assessing tensile force against pull-out failure. A dataset has been compiled from published test results that include soil parameters, soil–liner interface friction angle (δ), side slope angle (α) and allowable tensile force (Ta). ML models were studied using K-fold cross-validation (CV) and grid search to find hyperparameters for a better prediction of results. A comparative analysis is carried out to determine the superior ML model.

2. METHODOLOGY

2.1. Anchored GM liner tensile force against pull-out force

The tension mobilised in the anchored GM liner is affected by the friction at the soil–liner interface, overburden pressure from soil cover, liner alignment, trench geometry, construction activities and equipment loads at the crest portion. A high mobilised tension may pull out the liner from the anchorage. Nevertheless, a rigid anchorage can lead to tearing of the GM liner. Figure 2 shows the GM liner anchorage indicating the resisting (f) and pull-out (T) forces. The anchor-holding capacity should preferably be between the allowable tensile force and the ultimate tensile force of the GM liner to avoid pull-out failure and tear in the GM liner. However, as suggested by Koerner (1998) and Qian et al. (2002), pull-out failure is preferable to tensile failure of the GM liner. Basha and Raviteja (2016) reported the following Equation 1 to calculate the allowable GM tensile force against pull-out failure (Ta) based on the Qian et al. (2002) theory, considering the GM liner as a continuous member throughout the length.

T_a = [γ d_cs L_ro tan δ + γ (2d_cs + 0.5d_at) L_at tan δ cos ψ] / (cos α − sin α tan δ)    (1)

where γ is unit weight of soil, dcs is depth of cover soil, Lro is runout length, δ is the interface friction angle, ψ is trench angle, α is the angle of side slope, and Lat and dat are the length and depth of anchor trench, respectively.
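As a worked illustration of Equation 1, the short Python sketch below evaluates Ta for one assumed set of trench parameters; the input values are hypothetical and chosen only to fall within the practical ranges reported later in Table 1.

import math

def allowable_tensile_force(gamma, d_cs, L_ro, delta, psi, alpha, L_at, d_at):
    # Allowable GM tensile force Ta (kN/m) from Equation 1; angles are given in degrees.
    delta, psi, alpha = map(math.radians, (delta, psi, alpha))
    numerator = (gamma * d_cs * L_ro * math.tan(delta)
                 + gamma * (2 * d_cs + 0.5 * d_at) * L_at * math.tan(delta) * math.cos(psi))
    return numerator / (math.cos(alpha) - math.sin(alpha) * math.tan(delta))

# Hypothetical example: gamma = 17 kN/m3, dcs = 0.3 m, Lro = 1.5 m, delta = 24 deg,
# psi = 45 deg, alpha = 34 deg, Lat = 0.8 m, dat = 0.3 m
print(round(allowable_tensile_force(17.0, 0.3, 1.5, 24.0, 45.0, 34.0, 0.8, 0.3), 2))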
2.2. Multilayer perceptron (MLP)

The MLP is a multilayered network with input, output layers and hidden units that can represent a variety of nonlinear functions. MLP is a type of artificial neural network. Interactions between the inputs and outputs can be represented using multilayer neural networks. Each layer consists of neurons linked across different layers with connection weights. The connection weights are adjusted based on the output error, which is the difference between the ideal and predicted output, when propagating backwards. Backpropagation is the method for updating weights in such multilayered neural networks. The graphical representation of the MLP architecture is shown in Figure 3.

Figure 3. Architecture of MLP

For example, a two-layer network with one hidden layer and one output layer is provided with a requisite number of hidden units. Using appropriate activation functions, it can represent any Boolean and continuous function to within a tolerance. The algorithm is trained as given in the following steps. Step 1: Initialise the structure of the network as well as the weights with small random values and different biases in the network. Step 2: Forward computing: apply training examples comprising ((x1, y1), (x2, y2), …, (xm, ym)) to the network one by one, where x (input vector) = {γ, dcs, Lro, δ, ψ, α, Lat, dat} and y (output vector) = {Ta}. Step 3: Update weights: the predicted output is yˆ = Ta for a particular configuration of the network.


If there is a difference between y1 and yˆ, the weight vectors will be adjusted accordingly based on the computation of error signals to neurons. Step 4: Repeat the process with updated weights until the model converges, to obtain less error between actual output and predicted output.

Consider a training dataset of m samples (x1, x2, …, xm). The forward propagation calculation is given in Equations 2 and 3. Each neuron consists of linear and activation functions, as shown in Figure 3. Further, the loss is calculated using the function in Equation 4.

z_h = a(w_h^T · x_i)    (2)

yˆ = v_h^T z_h    (3)

E = (1/2)(y − yˆ)^2    (4)

where zh indicates the activation of the hidden layer, a is the activation function, wh is the weight vector, vhT is the transpose of the weight vector and E is the loss function. The calculated errors are back propagated, updating the weights wh and vh by propagating the estimated error using gradient descent as given in Equations 5 and 6. The number of hidden layers and neurons in each layer will affect the model performance.

Δv_h = −η ∂E/∂v_h    (5)

Δw_h = −η ∂E/∂w_h    (6)

where η is the learning factor.
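The following minimal scikit-learn sketch shows an MLP regressor of the kind described above; the layer size, activation and iteration settings are illustrative assumptions rather than the tuned values reported later in this study.

import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

# X stands in for the eight inputs {gamma, dcs, Lro, delta, psi, alpha, Lat, dat}; y stands in for Ta.
rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 8))                     # placeholder data
y = X @ rng.uniform(size=8)                        # placeholder target

# Backpropagation-trained network with one hidden layer of 100 units (assumed size).
model = make_pipeline(StandardScaler(),
                      MLPRegressor(hidden_layer_sizes=(100,), activation='relu',
                                   max_iter=2000, random_state=0))
model.fit(X, y)
print(model.predict(X[:3]))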
2.3. Extreme gradient boosting (XGB)

XGB is a supervised ML algorithm proposed by Chen and Guestrin (2016). Gradient tree boosting is one of the techniques that works efficiently for classification and regression applications. The regular boosting algorithm works on the principle of ensembling multiple weak learners sequentially to form a strong learner. Generally, a weak learner is a small decision tree with few splits, wherein each tree learns from errors made by the previous model until there is no improvement. XGB also works on the same principle with additional regularisation parameters, which improves the model's accuracy by preventing overfitting. Figure 4 illustrates the architecture of XGB. The current output of the mth tree is the sum of the previous tree output and the hypothesis function of the current tree multiplied by the regularisation parameter (Equations 7 and 8).

T_m(X) = T_{m−1}(X) + (α_r)_m h_m(X, r_{m−1})    (7)

arg min_{α_r} = Σ_{i=1}^{m} L[Y_i, T_{i−1}(X_i) + α_r h_i(X_i, r_{i−1})]    (8)

where Tm(X) is the mth tree output, (αr)i is a regularisation parameter, ri is the computed residuals with the ith tree, hi is a function trained to predict residuals and L(Y, T(X)) is the differential loss function.

Figure 4. Process of XGB
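A minimal sketch of this boosting idea using the XGBoost Python package (assuming it is installed); the hyperparameter values shown are placeholders, not the grid-searched values used in the study.

import numpy as np
from xgboost import XGBRegressor

rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 8))                     # placeholder inputs
y = X.sum(axis=1)                                  # placeholder target

# Sequential ensemble of shallow trees; learning_rate and reg_lambda act as the
# shrinkage/regularisation parameters discussed above (values are illustrative).
model = XGBRegressor(n_estimators=400, max_depth=3, learning_rate=0.1,
                     reg_lambda=1.0, objective='reg:squarederror')
model.fit(X, y)
print(model.predict(X[:3]))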

2.4. Support vector regression (SVR)

SVM are supervised machine-learning models designed for classification and further extended to regression problems (Vapnik 1995). SVM predicts discrete categorical labels as a classifier, whereas SVR envisages continuous variables as a regressor (Vapnik 1997). Although SVR is based on the same processing principle, it is used to solve regression problems, unlike SVM. Problem-solving involves constructing a hyperplane that separates the positive and negative parts of the data, along with two decision boundaries that are parallel to the hyperplane. This is known as the insensitive region (ε-tube), which makes the data linearly separable. In SVR, the algorithm forms the best tube by formulating an optimisation. Nevertheless, SVR balances the complexity of prediction errors by providing a good approximation.


In this method, the search converges at a hyperplane that holds the maximum training data within the boundaries (ε-tube). To estimate a linear function, SVR can be formed as given in Equation 9.

Consider a two-dimensional (2D) training set (x1, x2, …, xn) where x is an input variable, y is a target variable and n is the number of variables. The core goal of SVR is to obtain y. The divergence from the actual output (ε) can be achieved by minimising the Euclidean form of the weight vector (w) (Equation 10) subjected to the constraints (Equation 11). The algorithm finds a weight vector where most samples are within the margin. Prediction error that lies outside the margin can be decreased by inserting the slack variables ξi and ξi*, which help in converting the hard margin to a soft margin. The optimisation functions are provided in Equations 12 and 14 and the corresponding constraints are given in Equations 13 and 15. If the hyperparameter (C) is too high, the model will not allow large slacks. If C = 0, then the slack variables are not penalised, so they can be as large as possible, resulting in poor performance of the model.

y = (w · x) + b    (9)

Minimise: (1/2)‖w‖^2    (10)

Constraints: y_i − (w · x_i) − b ≤ ε; (w · x_i) + b − y_i ≤ ε    (11)

where b is a dimensionless constant variable.

Minimise: (1/2)‖w‖^2 + C Σ_{i=1}^{n} (ξ_i + ξ_i*)    (12)

Constraints: y_i − (w · x_i) − b ≤ ε + ξ_i; (w · x_i) + b − y_i ≤ ε + ξ_i*; ξ_i, ξ_i* ≥ 0    (13)

Minimise(w, b): (1/2)‖w‖^2 + C Σ_{i=1}^{j} (ξ_i + ξ_i*)    (14)

Constraints: Σ_{k=1}^{K} w_k Φ(x_{i,k}) + b − y_i ≤ ε + ξ_i; y_i − Σ_{k=1}^{K} w_k Φ(x_{i,k}) − b ≤ ε + ξ_i*; ξ_i, ξ_i* ≥ 0    (15)

The model described above is for linear regression problems. SVR is flexible and performs nonlinear regression by projecting the data into a high-dimensional space using kernel methods to avoid complexity. In the nonlinear process, SVR adopts the kernel function (Φ) that represents the nonlinear relationship between w and x. Figure 5 presents the architecture of SVR. Among the various kernel functions, radial basis and polynomial functions are successfully employed for geotechnical engineering problems (Debnath and Dey 2018).

Figure 5. Architecture of SVR
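A short scikit-learn sketch of ε-tube SVR with a radial basis function kernel, as described above; the values of C and epsilon are illustrative assumptions.

import numpy as np
from sklearn.svm import SVR
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 8))                     # placeholder inputs
y = np.sin(X[:, 0]) + X[:, 1]                      # placeholder target

# epsilon sets the width of the insensitive tube; C penalises points that fall outside it.
model = make_pipeline(StandardScaler(), SVR(kernel='rbf', C=10.0, epsilon=0.1))
model.fit(X, y)
print(model.predict(X[:3]))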
2.5. Random forest (RF)

RF is a bootstrap aggregation-based ensemble machine-learning algorithm defined on decision trees, developed by Breiman (2001). This method incorporates randomness during the attribute selection phase, resampling the training data. Bagging techniques force the ensemble model to generate a variety of decision trees, where each tree acts on different data subsets. Since the trees are made up of a random selection of samples and features, they create numerous random trees, forming a random forest. The RF modelling procedure is depicted in Figure 6. The RF method is superior to the decision tree technique. RF has low variance and low bias, as this method averages a number of decision trees trained on various parts of the same training data, making it better for prediction.

A dataset with 'm' samples creates a decision tree from several bootstrap samples by considering only a random subset of the total of 'F' features. Hence, D features are evaluated for each tree at each split (D = √F). Their correlation is reduced when the trees are trained with a random subset of features. As with bagging, training is typically performed for a large number of trees. As given in Equation 16, the average of all random tree outputs O1, O2, O3, …, On is used as the regression model output (Cn).

C_n = (1/n) Σ_{i=1}^{n} O_i    (16)
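The sketch below shows a scikit-learn random forest regressor of the kind described; the number of trees is illustrative, and max_features='sqrt' mirrors the D = √F feature-subset rule above.

import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 8))                     # placeholder inputs
y = X.sum(axis=1)                                  # placeholder target

model = RandomForestRegressor(n_estimators=500, max_features='sqrt', random_state=0)
model.fit(X, y)
print(model.predict(X[:3]))   # each prediction is the average of the tree outputs (Equation 16)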

2.6. Locally weighted regression (LWR)

LWR is a non-parametric, supervised learning algorithm. As the name indicates, LWR predictions are based on data close to the new instance, and the contribution of each training example is weighted based on its distance from the new instance. LWR excludes the training phase, so the entire work is performed during the testing/prediction phase. Further, LWR considers the full dataset to make predictions, unlike simple regression, since the regression line is constructed locally for each new data point. Thus, LWR overcomes the limitations of linear regression by assigning weights to the training data (Cleveland and Devlin 1998).


Figure 6. Process of RFR

The weights are higher for data points close to the new data point being predicted by the algorithm, as shown in Figure 7. The method optimises Θ to a minimum and modifies the cost function (Equation 17). The computation of the weighting function (wi) is given in Equation 18. The learning algorithm (Θ) chooses parameters for better predictions. This approximation calculates the estimated target value for the query instance.

Σ_{i=1}^{m} w_i (y_i − Θ^T x_i)^2    (17)

w_i = exp[−(x_i − x)^2 / (2τ^2)]    (18)

Figure 7. Architecture of LWR
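A minimal NumPy sketch of locally weighted linear regression built directly from Equations 17 and 18; the bandwidth tau and the one-dimensional toy data are assumptions for illustration.

import numpy as np

def lwr_predict(x_query, X, y, tau=0.5):
    # Locally weighted linear regression at a single query point.
    Xb = np.column_stack([np.ones(len(X)), X])                 # add an intercept column
    xq = np.concatenate(([1.0], np.atleast_1d(x_query)))
    # Gaussian weights (Equation 18): larger for training points near the query.
    w = np.exp(-np.sum((X - x_query) ** 2, axis=1) / (2.0 * tau ** 2))
    W = np.diag(w)
    theta = np.linalg.pinv(Xb.T @ W @ Xb) @ Xb.T @ W @ y       # minimises the weighted cost (Equation 17)
    return xq @ theta

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=200)
print(lwr_predict(np.array([0.5]), X, y))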

2.7. Grid search hyperparameter optimisation

In ML models, hyperparameters must be specified for adapting a model to the dataset. The general effects of hyperparameters on a model are frequently recognised, but determining the appropriate hyperparameter and the combinations of interacting hyperparameters for a given dataset can be complex. Systematically searching different preferences for model hyperparameters and selecting the subset that produces the best model on a given dataset is one of the best ways. This is referred to as hyperparameter optimisation or tuning. The scikit-learn library in Python has this functionality, with different optimisation techniques that can provide a unique set of high-performing hyperparameters. Random and grid searches are the two primary and widely used optimisation techniques for tuning the optimisation parameters. In this study, grid search is used to generate an optimised model. It considers a search space to be a grid of hyperparameter values and evaluates each position in the grid. Grid search is ideal for double-checking combinations that have previously performed well.
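A hedged sketch of a grid search with scikit-learn's GridSearchCV; the estimator, parameter grid and scoring choice are illustrative assumptions, not the exact grids used in this study.

import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X, y = rng.uniform(size=(200, 8)), rng.uniform(size=200)       # placeholder data

# Every combination in param_grid is scored with cross-validation; the best one is refitted.
param_grid = {'n_estimators': [300, 500, 700], 'max_depth': [2, 4, 6]}
search = GridSearchCV(RandomForestRegressor(random_state=0), param_grid,
                      cv=5, scoring='neg_root_mean_squared_error')
search.fit(X, y)
print(search.best_params_)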
2.8. K-fold CV

K-fold CV is a specific type of predictive analytic model with a broad framework that can apply various types of models within it. It consists of the following steps. (1) The original data is randomly divided into K subsamples, which serve as the training data. (2) For each fold, models are estimated using K − 1 subsamples, with the Kth subsample serving as a validation dataset. The process is repeated until all subsamples have served as validation, and the model result can be averaged across the folds. The K-fold CV can also be extended by dividing the original data into a subset that goes through the K-fold CV process. The rest of the data is split into another subgroup that can be used to evaluate the final model performance. This final data subset is often termed test data, and the testing set is utilised to assess the generalisation error of the finalised model (Zhang et al. 2021). K-fold CV considers all three components: training, validation and testing. Although there is no definite rule for determining the value of K, the number of K-folds was set to five in this study, as suggested by Kohavi (1995) and Wang et al. (2015).
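A minimal sketch of five-fold cross-validation with scikit-learn; the linear model used here is only a stand-in for the five ML models compared in this study.

import numpy as np
from sklearn.model_selection import KFold
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X, y = rng.uniform(size=(200, 8)), rng.uniform(size=200)       # placeholder data

kf = KFold(n_splits=5, shuffle=True, random_state=0)
fold_rmse = []
for train_idx, val_idx in kf.split(X):
    model = LinearRegression().fit(X[train_idx], y[train_idx])
    pred = model.predict(X[val_idx])
    fold_rmse.append(mean_squared_error(y[val_idx], pred) ** 0.5)
print(np.mean(fold_rmse))    # validation error averaged across the five folds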


Table 1. Statistical descriptors for input and output variables

Parameters    δ        γ        Lro      dat      dcs      ψ        Lat      α        Ta
μ             24.01    17.00    1.72     0.30     0.29     40.83    0.78     33.45    14.98
σ             7.05     0.97     0.72     0.11     0.12     18.67    0.40     6.61     10.10
Max           42       18.7     0.5      0.5      0.5      84.28    1.5      44.98    45.95
Min           6        15.3     0.1      0.1      0.1      7.96     0.1      22       0.51
n             1441     1441     1441     1441     1441     1441     1441     1441     1441

Figure 8. Histograms indicating the wide range of variability among input parameters

3. DATABASE

In ML, the size of samples is a crucial factor in developing an effective prediction model. In this study, 1520 samples were generated from collated laboratory results of soil–GM interface friction to formulate the models (Raviteja and Basha 2015). The anchorage geometric parameters were considered in a wide range covering all practical possibilities. A dataset of 1441 samples (excluding outliers) is compiled with nine variable parameters: soil–liner interface friction angle (δ), unit weight of soil (γ), runout length (Lro), depth of anchor trench (dat), depth of cover soil (dcs), slope angle of trench (ψ), length of anchor trench (Lat), side slope angle (α) and allowable GM tensile force (Ta). The statistics of the given parameters are listed in Table 1 with the mean value (μ), standard deviation (σ) and size of the dataset (n). The extent of variability among the input parameters is shown in Figure 8.

Raviteja and Basha (2018) developed a mathematical model to accurately predict anchorage capacity, which is a function of friction angle, unit weight of cover soil and trench geometry. There are standard ranges for the variables, such as geometry of trench, depth of soil cover and thickness of GM liner, to be used in MSW containment facilities. The anchorage capacity is computed by varying these variables within the practical ranges, as shown in Table 1, as well as using 65 measured friction angle values. These results are then used as data points for ML modelling. Thus, with the modelled variables and the 65 measured friction angle values, there was a total of 1441 data points for ML modelling.

In general, deep neural network (DNN) models like convolutional neural networks (CNN), recurrent neural networks (RNN) and generative adversarial networks (GAN) require a large amount (from hundreds to thousands) of data to build a successful ML model. However, in the present study such DNN models were not used. The present study employs MLP (shallow neural network), XGB (regression/classification), SVR (regression/classification), RF (classification/regression) and LWR (regression). For the chosen ML models, as a general thumb-rule, 50–100 data points per predictor are required to build an efficient ML model (Gopaluni 2010; Bujang et al. 2018; Hecht and Zitzmann 2020). It is crucial to evaluate all other influencing variables when determining the optimal amount of data for analysis. The present study employed 1441 input data samples for each of the nine variables used for the analysis.

3.1. Correlation analysis

The correlation coefficient is determined to understand the relationship between dataset parameters. Pearson's correlation coefficient is used in this study to quantify the relationship between each pairwise parameter in terms of strength and direction. Correlation coefficient (ρ) values range from −1 to +1 (−1 ≤ ρ ≤ 1), where +1 indicates strong positive correlation and −1 indicates strong inverse correlation. The heat map presenting the correlation coefficients of the chosen parameters is given in Figure 9. The analysis was useful in identifying the significant governing parameters and in the further process of the model algorithm. As is evident from Table 2, soil–liner interface friction (δ) has a strong influence in governing tensile force against pull-out failure.
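A short pandas sketch of the Pearson correlation screening described above; the DataFrame contents are placeholders, and the column names simply mirror the nine variables listed in Table 1.

import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame(rng.uniform(size=(100, 9)),
                  columns=['delta', 'gamma', 'Lro', 'dat', 'dcs', 'psi', 'Lat', 'alpha', 'Ta'])

corr = df.corr(method='pearson')            # pairwise Pearson coefficients, -1 <= rho <= +1
print(corr['Ta'].sort_values(ascending=False))
# A heat map such as Figure 9 can be drawn from 'corr' with matplotlib or seaborn.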
3.2. Examination of outliers

Outliers are points that differ from the remaining observations with extreme values. These outliers can interfere during training and cause algorithms to underperform, resulting in less accurate ML models. Specific to this study, the box plot was used to clean the data. It is a standardised method of displaying data distribution, measured using a five-point scale (minimum, 1st quartile (Q1), median, 3rd quartile (Q3), maximum). It is used to detect outliers and identify the data distribution.

Figure 9. Heat map representation of correlation coefficient matrix

Equations 19 and 20 present the computation of the maximum and minimum limits. Figure 10 depicts the box plot, with maximum and minimum limits of 45.95 and 0.51, respectively. A total of 79 outliers were identified in the dataset and eliminated before training the model.

(L)max = Q3 + 1.5(IQR)    (19)

(L)min = Q1 − 1.5(IQR)    (20)

where IQR is the inter-quartile range = (Q3 − Q1).

Table 2. Correlation discretisation among the variables

ρ            Correlation strength    Variables
<0.1         Very weak               γ, dat
0.1–0.2      Weak                    α
0.2–0.5      Moderate                Lat, dcs, Lro, ψ
0.5–0.7      Strong                  δ

Figure 10. Box plot showing the data pattern for allowable tensile force (Ta)
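A minimal pandas sketch of the IQR-based screening in Equations 19 and 20; the Ta series below is a randomly generated placeholder for the actual 1520-sample dataset.

import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
ta = pd.Series(rng.gamma(shape=2.0, scale=7.0, size=1520), name='Ta')   # placeholder Ta values

q1, q3 = ta.quantile(0.25), ta.quantile(0.75)
iqr = q3 - q1
upper, lower = q3 + 1.5 * iqr, q1 - 1.5 * iqr       # whisker limits (Equations 19 and 20)
cleaned = ta[(ta >= lower) & (ta <= upper)]
print(len(ta) - len(cleaned), 'outliers removed')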
4. IMPLEMENTATION OF DIFFERENT MODELS

In the present study, five different ML models are employed: (1) MLP: a neural network approach to predict output features; (2) XGBoost: a boosting technique that uses various decision trees trained sequentially; (3) RF: a bagging technique that uses different decision trees trained in parallel on sub-sampled data; (4) SVR: a technique which uses the kernel trick to fit nonlinear data; and (5) LWR: a technique which fits the model using locally moving weighted regression based on the original data. The chosen ML models follow different strategies in prediction and are suitable for solving regression problems. All the analyses are performed using the Python programming language (v3.8.11). The Pandas and NumPy libraries are used to process and analyse the data, Sklearn is used to code the different algorithms and the matplotlib library is used to visualise the results. The dataset consisting of 1441 data points is randomly split at a ratio of 90 : 10 for training and testing, respectively. For optimisation and better model performance, five-fold CV was performed on the 90% training set, wherein four folds were utilised for training and the remaining one for validation. Prediction on the testing set is completed by averaging the result obtained from the five folds for each model. Figure 11 illustrates this framework development.

The MLP algorithm was trained on the data using the Scikit-learn package in Python. Optimisation of the algorithm was performed by fine-tuning the hyperparameters using grid search. Critical parameters in MLP are the number of layers (chosen between 1 and 4), the number of hidden units (50 to 300) and the activation function (rectified linear unit, ReLU). Training an MLP with more layers and hidden units may degrade the performance by overfitting, as the available data is limited in the present work. The XGB algorithm was trained using the XGBoost package in Python. As XGBoost is based on sequentially ensembling multiple weak learners (estimators), increasing the number of estimators may overfit the data. Parameters such as the number of trees (ranging from 300 to 600), learning rate, regularisation parameter, maximum depth and minimum child weight are fine-tuned. The regularisation parameter is critical in determining how much weightage each estimator should pay attention to for the final prediction.

Figure 11. Framework for development and operation of ML models

In SVR, a hyperplane with the maximum number of points is the best line. It fits nonlinear equations using the kernel trick. The critical hyperparameters are the kernel ('rbf', 'poly', etc.), C (regularisation parameter) and G (specifies the epsilon-tube). Parameter C is critical in the model performance, where a large C value leads to a small margin and vice versa. The RF algorithm was constructed using the scikit-learn module. Optimisation of the algorithm was performed by fine-tuning hyperparameters such as the number of trees, the number of samples required to split a node, the depth of a tree and the minimum samples required for the leaf node. RF was trained by varying the number of trees from 300 to 700 and the depth from 1 to 6. The minimum number of samples at the leaf node ranged from 5 to 10. It is observed that the model tends to overfit and stabilise at a certain point with an increase in the number of trees in RF. The LWR algorithm was trained using the scikit-lego module. It has been used for smoothing and can be used in ML applications for interpolating data (Debnath and Dey 2018). The critical hyperparameters are sigma and span, used to smoothen the curve.
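A hedged sketch of the overall training framework shown in Figure 11 (90 : 10 split, then five-fold grid-searched CV on the training portion); the pipeline, grid values and scoring are illustrative assumptions.

import numpy as np
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline

rng = np.random.default_rng(0)
X, y = rng.uniform(size=(1441, 8)), rng.uniform(size=1441)     # placeholder for the real dataset

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.10, random_state=0)
pipe = Pipeline([('scale', StandardScaler()),
                 ('mlp', MLPRegressor(max_iter=2000, random_state=0))])
grid = {'mlp__hidden_layer_sizes': [(50,), (100,), (100, 100)]}
search = GridSearchCV(pipe, grid, cv=5, scoring='neg_root_mean_squared_error')
search.fit(X_tr, y_tr)
print(search.best_params_, search.score(X_te, y_te))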

4.1. Metrics of performance

The following performance indicators were used to assess the prediction outputs of the ML models. Equation 21 is used to compute the model's RMSE (root mean square error). An RMSE near zero indicates a lower prediction error. The coefficient of determination (R2) is calculated as given in Equation 22. The closer R2 is to 1, the better the model fits the data. The computation of the mean absolute percentage error (MAPE) is shown in Equation 23. A MAPE near zero indicates a high degree of predictability.

RMSE = sqrt[(1/m) Σ_{i=1}^{m} (y_i − yˆ_i)^2]    (21)

R^2 = 1 − Σ(y_i − yˆ_i)^2 / Σ(y_i − yˉ)^2    (22)

MAPE = (100%/m) Σ_{i=1}^{m} |(y_i − yˆ_i)/y_i|    (23)
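A small NumPy sketch of Equations 21–23 applied to a hypothetical pair of observed and predicted vectors.

import numpy as np

def rmse(y, y_hat):
    return np.sqrt(np.mean((y - y_hat) ** 2))                              # Equation 21

def r2(y, y_hat):
    return 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)    # Equation 22

def mape(y, y_hat):
    return 100.0 * np.mean(np.abs((y - y_hat) / y))                        # Equation 23

y_obs = np.array([10.0, 20.0, 30.0])        # hypothetical observed Ta values
y_pred = np.array([11.0, 19.0, 31.0])       # hypothetical predictions
print(rmse(y_obs, y_pred), r2(y_obs, y_pred), mape(y_obs, y_pred))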
4.2. Predictive analysis by external validation

An external validation method that analyses model predictability using test dataset performance is adopted (Golbraikh and Tropsha 2002). The evaluation of prediction accuracy involves a comparison of observed values and predicted values of an external test set that is not used in model development (Kubinyi et al. 1998; Zefirov and Palyulin 2001). The model must satisfy the four criteria given below in Equations 24–27 to be considered acceptable. The four conditions considered are the external validation criteria (C1, C2, C3 and C4).

R^2 ≥ 0.6    (24)

0.85 ≤ k ≤ 1.15 (or) 0.85 ≤ k′ ≤ 1.15    (25)

(R^2 − Ro^2)/R^2 < 0.1 (or) (R^2 − Ro′^2)/R^2 < 0.1    (26)

Rs^2 ≥ 0.5    (27)

The values of k and k′ can be calculated by using Equations 28 and 29. The correlation coefficients that pass through the origin can be obtained from Equations 30–32.

k = Σ_{i=1}^{n} (y_p,i · y_o,i) / Σ_{i=1}^{n} y_p,i^2    (28)

k′ = Σ_{i=1}^{n} (y_p,i · y_o,i) / Σ_{i=1}^{n} y_o,i^2    (29)

Ro^2 = 1 − Σ_{i=1}^{n} y_p,i^2 (1 − k)^2 / Σ_{i=1}^{n} (y_p,i − yˉ_p)^2    (30)

Ro′^2 = 1 − Σ_{i=1}^{n} y_o,i^2 (1 − k′)^2 / Σ_{i=1}^{n} (y_o,i − yˉ_o)^2    (31)

Rs^2 = R^2 [1 − sqrt(|R^2 − Ro^2|)]    (32)

Unlike the coefficient of determination (R2), which increases with the addition of variables, the adjusted coefficient (R2adj) increases only when a significant variable that contributes to the performance of the model is added. Thereby, it indicates the most influential parameters that govern the design.

R2adj = 1 − [(1 − R^2)(N − 1)] / (N − p − 1)    (33)

where N is the total sample size and p is the number of independent variables.
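A NumPy sketch of the external validation checks in Equations 24–32; it is a simplified reading of the criteria, and the two short arrays are hypothetical.

import numpy as np

def external_validation(y_obs, y_pred):
    # Golbraikh-Tropsha style criteria C1-C4 (Equations 24-32), simplified sketch.
    r2 = np.corrcoef(y_obs, y_pred)[0, 1] ** 2
    k = np.sum(y_pred * y_obs) / np.sum(y_pred ** 2)                        # Equation 28
    k_dash = np.sum(y_pred * y_obs) / np.sum(y_obs ** 2)                    # Equation 29
    ro2 = 1 - np.sum(y_pred ** 2 * (1 - k) ** 2) / np.sum((y_pred - y_pred.mean()) ** 2)         # Eq. 30
    ro2_dash = 1 - np.sum(y_obs ** 2 * (1 - k_dash) ** 2) / np.sum((y_obs - y_obs.mean()) ** 2)  # Eq. 31
    rs2 = r2 * (1 - np.sqrt(abs(r2 - ro2)))                                 # Equation 32
    c1 = r2 >= 0.6                                                          # Equation 24
    c2 = (0.85 <= k <= 1.15) or (0.85 <= k_dash <= 1.15)                    # Equation 25
    c3 = (abs(r2 - ro2) / r2 < 0.1) or (abs(r2 - ro2_dash) / r2 < 0.1)      # Equation 26
    c4 = rs2 >= 0.5                                                         # Equation 27
    return c1, c2, c3, c4

y_obs = np.array([5.0, 12.0, 18.0, 25.0, 33.0])      # hypothetical observed values
y_pred = np.array([5.4, 11.5, 18.6, 24.2, 34.1])     # hypothetical predicted values
print(external_validation(y_obs, y_pred))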
5. RESULTS

The performance of the various models was assessed based on three popular indicators: RMSE, MAPE and the coefficient of determination (R2). The mathematical expressions for the performance indicators are given in Equations 21–23. The calculations were performed using actual and predicted outputs of the different models based on the training, validation and test datasets. A model can be considered the best fit if the performance is consistent across all three evaluation indicators. The model can be considered as overfit if it performs well on the training set but fails to predict the test set, while the model is considered as underfit if it does not perform well on the training set but performs well on the test set. The training, validation and testing errors for all models across all the performance indicators are summarised in Table 3. Insufficient data points result in overfitting of the ML model. In such cases, the model becomes too specialised to the training data and does not generalise well to new data.


It can be noted that none of the models used in the present study reported overfitting or underfitting of data. The three performance metrics RMSE, MAPE and R2 indicate that the quantity of data points is sufficient for the chosen ML models.

It is observed that MLP outperformed the other models with minimum error across the training, validation and testing datasets. The RMSE values were 0.29, 0.41 and 0.42, respectively, for the training, validation and testing sets. The R2 obtained as 0.99 for all three datasets indicates that the model was the best fit for the data. XGBoost and SVR are also proven to be suitable ML models for the prediction of anchorage tensile force. However, their accuracy is less than MLP, with R2 values of 0.98 and 0.97, respectively, in all three datasets. Interestingly, the performance of SVR is better than XGB in the case of testing, whereas the performance of XGB is better than SVR in training and validation. Further, it was evident from the grid search that the RMSE values are consistent for all three datasets, indicating model suitability for the selected hyperparameters. Nevertheless, the RF (bootstrap aggregation) algorithm is proven suitable for classification and regression. Although its RMSE is marginally higher than the other three models, the performance is consistent in the training, validation and testing datasets. This could be attributed to the decision-making in the RF model, which is based on multiple trees formed on sub-samples of the data. LWR requires fairly large and densely populated data to produce acceptable results as it depends on local data structure while performing local fitting. The dataset is limited in this study; the LWR algorithm cannot fit the data properly, resulting in an overfit with considerable errors compared to other models. All the values of the performance indicators were populated by averaging the errors obtained from the different folds for each model. The influence of model performance based on data quality at each fold and the corresponding RMSE can be observed from Figure 12. The best fit ML model is identified based on the scores and corresponding ranks, as summarised in Table 4. All the models were assigned ranks based on the scores obtained for training, validation and testing. It can be noted that a lower score corresponds to a lower rank, indicating the better performance of the model.

A comparative analysis has been made between actual and predicted values to verify the suitability of the model. Figure 13 presents a comparison of the models used in this study. MLP fits the data more accurately than all the other models, with minimal error. It is observed that models such as XGBoost and SVR also accurately fit the data with lower error. The external validation scores of the ML models are summarised in Table 5. It can be noted that among the four external validation criteria (see Equations 24–27), at least two must be fulfilled to assert precise predictions (Golbraikh and Tropsha 2000). With the exception of LWR, which did not fulfil the C3 and C4 criteria, all the models in the present study fulfilled the external validation criteria. Nevertheless, the LWR model predicted the Ta values with considerable accuracy.

The main purpose of any ML model is to be trained from the given data in order to understand the relationships within the provided dataset. In this study, a linear regression model is formulated to determine the linear relationship within the given data. An optimal regression model was chosen in such a way that it is neither biased nor shows high variance. In general, ML models have good prediction accuracy if provided with adequate population data and proper tuning of hyperparameters. The accuracy of the assessment of GM tensile force (Ta) can be understood from the test dataset given in Figure 14. It can be noted that the developed MLP model has shown high accuracy with the minimum error between actual and predicted data. Each neuron in a hidden layer is connected to all other neurons in the successive hidden layer. As MLP can detect complex nonlinear relations between independent and dependent variables, it could have predicted more accurately compared to other ML models.
independent and dependent variables, it could have

Table 3. Statistical descriptors indicating the performance of training, validation and testing datasets

Training
Model    RMSE     MAPE      R2       R2adj    Rank (RMSE, MAPE, R2)    Score    Total rank
MLP      0.293    2.817     0.999    0.999    1, 1, 1                  3        1
XGB      1.356    8.891     0.982    0.982    2, 3, 2                  7        2
SVR      1.509    4.289     0.978    0.978    3, 2, 3                  8        3
RF       2.417    14.537    0.943    0.942    4, 4, 4                  12       4
LWR      3.959    29.313    0.847    0.845    5, 5, 5                  15       5

Validation
Model    RMSE     MAPE      R2       R2adj    Rank (RMSE, MAPE, R2)    Score    Total rank
MLP      0.417    3.540     0.998    0.998    1, 1, 1                  3        1
XGB      1.280    9.143     0.984    0.983    2, 3, 2                  7        2
SVR      1.519    4.311     0.977    0.976    3, 2, 3                  8        3
RF       2.475    14.586    0.939    0.937    4, 4, 4                  12       4
LWR      5.318    41.139    0.722    0.714    5, 5, 5                  15       5

Testing
Model    RMSE     MAPE      R2       R2adj    Rank (RMSE, MAPE, R2)    Score    Total rank
MLP      0.425    3.876     0.998    0.998    1, 1, 1                  3        1
XGB      1.365    10.361    0.981    0.979    3, 3, 3                  9        3
SVR      1.244    4.345     0.984    0.983    2, 2, 2                  6        2
RF       2.458    17.765    0.937    0.934    4, 4, 4                  12       4
LWR      5.018    43.55     0.739    0.724    5, 5, 5                  15       5


Figure 12. Variation of RMSE with 5-fold CV for: (a) training; (b) validation; (c) testing datasets

Table 4. Rankings and scores for training, validation and testing datasets of ML models

Model    Training score    Validation score    Testing score    Total score    Rank
MLP      3                 3                   3                9              1
XGB      7                 7                   9                23             3
SVR      8                 8                   6                22             2
RF       12                12                  12               36             4
LWR      15                15                  15               45             5

The error between actual and predicted data is significant for RF and LWR. The deviation of the predicted values from the actual values at the tails in the case of RF and LWR can be attributed to the high RMSE.

Figure 13. Scatter plot showing the comparison of actual and predicted values of Ta for (a) MLP, (b) XGB, (c) SVR, (d) RF and (e) LWR models

5.1. Sensitivity analysis

The uncertainty associated with the output and its relationship with the uncertainty among the input variables is studied from the sensitivity analysis. Outcomes of the sensitivity analysis have significant implications at different phases of the developed model, in error finding, model advising, parameter adjustment and examining the input–output relations. In the present study, sensitivity analysis is carried out for all the variables that could influence GM tensile force.


Table 5. External validation of ML models

Model    R2       k        k′       Ro2      Ro′2     Rs2      C1    C2    C3    C4    Level
MLP      0.998    0.998    1.001    0.999    0.999    0.955    Y     Y     Y     Y     4
XGB      0.981    1.004    0.992    0.999    0.999    0.844    Y     Y     Y     Y     4
SVR      0.984    1.034    0.963    0.996    0.996    0.859    Y     Y     Y     Y     4
RF       0.937    1.018    0.965    0.995    0.996    0.711    Y     Y     Y     Y     4
LWR      0.739    1.072    0.857    0.963    0.946    0.389    Y     Y     N     N     2

Figure 14. Test set comparison of actual and predicted values of Ta for MLP, XGB, SVR, RF and LWR

Figures 15(a)–15(g) present the influence of various soil properties and trench geometry on the actual and predicted values of Ta for the various ML models. It can be noted that Figure 15 reports the effect of interface friction angle and unit weight of soil on Ta.

It is observed that the interface friction angle has more influence on Ta compared to all other soil properties and geometrical parameters as well. As shown in Figure 15(a), an overall variation of 30 kN/m can be observed in the values of Ta with a change in δ up to 30°. This can be attributed to the strong positive correlation of 0.58 between δ and Ta. The interface friction would be mobilised between the GM liner and soil even at low magnitudes of applied tensile force. The displacement or pull-out failure in the liner occurs after full mobilisation of interface friction (Raviteja and Basha 2018). It can be observed that except for RF and SVR, all the other models can accurately predict Ta. However, MLP is superior, with a minimum deviation from the true values. The unit weight (γ) of cover/backfill soil has a trivial influence on Ta, as shown in Figure 15(b). Although the unit weight of soil exerts normal stress on the liner, a major portion of the mobilised tensile forces is resisted by the interface friction that is developed throughout the length of the GM liner. It can also be noted that the unit weight of soil has a very weak correlation with Ta, with a coefficient of 0.087.

The influence of the geometrical parameters, namely angle of side slope (α), depth of cover soil (dcs), runout length (Lro), trench length (Lat) and trench depth (dat), on the allowable GM tensile force (Ta) is presented in Figures 15(c)–15(f). A close association with the reference line can be observed for the MLP, XGB and SVR models. This can be attributed to the moderate correlation of Lro, dcs and Lat with Ta, with coefficients of 0.27, 0.39 and 0.42, respectively. It can be noted that trench length has more influence on the GM tensile force than the other geometrical parameters of the anchorage. As a higher trench length contributes to a greater portion of GM in contact with backfill soil, the interface frictional resistance against pull-out failure will be higher.


Figure 15. Influence of (a) side slope angle, (b) trench depth, (c) depth of cover soil, (d) unit weight of soil, (e) interface friction, (f) trench length and (g) runout length on the actual and predicted values of Ta for various ML models


In addition, with a higher trench length, the GM liner is subjected to normal and lateral stresses over a greater length from the backfill soil, resulting in higher resistance against liner displacement. It can be noted that a V-shaped anchorage was chosen for analysis in the present study. In the case of simple runout and rectangular trenches, Lro would be the more influential geometrical parameter, wherein the trench length has no effect on Ta. Irrespective of the anchorage shape, dcs always has a constant impact on GM tensile force.

Figure 15(c) presents the influence of side slope angle (α) on GM tensile force. Although α is weakly correlated with Ta, with a coefficient of 0.16, it is a significant statistical parameter in the anchorage geometry. The weak correlation indicates the presence of other, more important geometrical parameters. As shown in Figure 15(g), the depth of anchor trench (dat) is inversely correlated with Ta, with a coefficient of −0.019. Further, the trench slope (ψ) is also inversely correlated with Ta, with a coefficient of −0.34 (see Figure 9). From the correlations, it is evident that steeper slopes with higher trench depths contribute to greater tensile force and eventually lead to pull-out failure in the GM liner.

It can be concluded that, for all the parameters reported in Figure 15, MLP is found to be the most suitable model with high accuracy, while LWR is found to be the least suitable model with a maximum difference between true and predicted values. For all the parameters, the SVR model reported adequately reasonable predictions after MLP. The XGB predictions are erroneous for all the parameters, except for δ and dcs. The error associated with the RF and LWR models is considerably high for all the parameters. The results show that MLP is the most suitable model among all the five ML models employed in this study.
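For readers who wish to quantify such input influences numerically rather than visually, the sketch below uses scikit-learn's permutation importance on a stand-in model; this is a complementary technique, not the parameter-by-parameter comparison used in this section, and the data and model here are placeholders.

import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)
X, y = rng.uniform(size=(300, 8)), rng.uniform(size=300)       # placeholder data
names = ['delta', 'gamma', 'Lro', 'dat', 'dcs', 'psi', 'Lat', 'alpha']

model = RandomForestRegressor(random_state=0).fit(X, y)
result = permutation_importance(model, X, y, n_repeats=10, random_state=0)
for name, score in sorted(zip(names, result.importances_mean), key=lambda t: -t[1]):
    print(f'{name}: {score:.3f}')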


6. CONCLUSIONS

This paper reports the application of the MLP, SVR, XGB, RF and LWR ML techniques for the prediction of tensile force in anchored GM liners. The performance of all ML models was compared with a combination database to find the best model for the prediction of tensile force in GM liners. The existing mathematical predictive model requires the unit weight and friction angle to be determined experimentally. The use of ML techniques minimises the amount of physical testing and thereby saves time and human effort. Further, the application of ML allows more complex relationships between the input variables and the tensile force in liners to be explored than could be captured by the existing mathematical model. The use of ML techniques is a valuable contribution to the field and highlights the importance of incorporating both experimental data and validated mathematical model predictions in engineering research. The following conclusions can be drawn from the present study.

(a) The hyperparameters of the ML models were tuned using the grid search hyperparameter optimisation technique. This ensures that each algorithm uses the optimal hyperparameters to attain the best prediction accuracy.

(b) Three performance metrics, namely RMSE, MAPE and R2, were used to evaluate the performance of the five models. The performance of MLP was superior compared to the four other models, having RMSE = 0.29 and R2 = 0.99 for the training data, RMSE = 0.42 and R2 = 0.998 for validation, and RMSE = 0.43 and R2 = 0.99 for the testing datasets. This shows the model was neither overfitting nor underfitting the data, because of the minimum difference in metric values for training, validation and testing. Also, all four external validation conditions were satisfied by MLP, which implies that the predictions made by MLP are accurate.

(c) On the other hand, based on their performance metrics, it is reasonable to conclude that XGB and SVR can make precise estimates. These two models' training and testing error differences satisfy the four external validation criteria. However, their RMSE values were slightly higher than the RMSE values of MLP. Lower RMSE values will be preferred when predicting the tensile force of a GM liner.

(d) RF and LWR were less accurate, as their RMSE values for testing (RMSE = 2.46 and R2 = 0.937 for RF, and RMSE = 5.02 and R2 = 0.74 for LWR) were higher than those of the other models when comparing training and validation metrics. The RF model is neither overfitting nor underfitting, while the LWR model results (R2 = 0.847 for training and R2 = 0.739 for testing) demonstrate that the testing error is more than the training error, implying that the model is overfitting.

ACKNOWLEDGEMENT

The first author acknowledges the financial support provided by the Science and Engineering Research Board (SERB), Govt. of India, in the form of SERB International Research Experience Fellowship (Award No: SIR/2022/000374).

NOTATION

Basic SI units are given in parentheses.

a activation function (dimensionless)
b constant variable (dimensionless)
C hyperparameter (dimensionless)
Cn random forest regression model output (dimensionless)
D random subset of features (dimensionless)
dat depth of anchor trench (m)
dcs depth of cover soil (m)
E loss function (dimensionless)
F number of features in random forest (dimensionless)
f resisting forces along liner (N/m)
hi trained function to residual prediction (dimensionless)


ACKNOWLEDGEMENT

The first author acknowledges the financial support provided by the Science and Engineering Research Board (SERB), Govt. of India, in the form of SERB International Research Experience Fellowship (Award No: SIR/2022/000374).

NOTATION

Basic SI units are given in parentheses.

a activation function (dimensionless)
b constant variable (dimensionless)
C hyperparameter (dimensionless)
Cn random forest regression model output (dimensionless)
D random subset of features (dimensionless)
dat depth of anchor trench (m)
dcs depth of cover soil (m)
E loss function (dimensionless)
F number of features in random forest (dimensionless)
f resisting forces along liner (N/m)
hi trained function to residual prediction (dimensionless)
K constant (varies)
k slope of regression line for observed data (varies)
k′ slope of regression line for predicted data (varies)
L(Y, T(X)) differential loss function (dimensionless)
Lat length of anchor trench (m)
Lmax upper limit (units of variables)
Lmin lower limit (units of variables)
Lro length of runout (m)
MSE mean square error (dimensionless)
m total number of samples (count)
N total sample size (count)
n number of outputs in random forest (count)
Oi output of random trees in random forest (varies)
p number of independent variables (count)
Q1, Q3 1st and 3rd quartiles, respectively (dimensionless)
RMSE root mean square error (dimensionless)
R2 coefficient of determination (dimensionless)
Radj2 adjusted coefficient of determination (dimensionless)
Ro2 coefficient of determination for observed values (dimensionless)
Ro′2 coefficient of determination for predicted values (dimensionless)
Rs2 stabilisation criterion (dimensionless)
ri computed residuals with ith tree (count)
T tensile force against pull-out failure (N/m)
Ta allowable tensile force against pull-out failure (N/m)
Tm(X) mth tree output (dimensionless)
vh weight vector for output layer (unit of variable)
vhT transpose of weight vector for output layer (unit of variable)
w Euclidean form of weight vector (varies)
wh weight vector for activation of hidden layer (varies)
whT transpose of weight vector for hidden layer (varies)
wi weighting function (dimensionless)
x, y input and output vectors (varies)
ȳ mean of actual output (unit of variables)
ŷ predicted output (varies)
zh activation of hidden layer (dimensionless)
α side slope angle (°)
αr regularisation parameter (dimensionless)
γ unit weight of soil (N/m3)
δ friction angle at soil–liner interface (°)
ε insensitive region (dimensionless)
η learning factor (dimensionless)
Θ parameter of learning algorithm (dimensionless)
μ mean of samples (units of variable)
ξ, ξi* slack variables (unit of variable)
ρ Pearson's correlation coefficient (dimensionless)
σ standard deviation of samples (unit of variable)
τ bandwidth parameter (unit of variable)
Φ kernel function (dimensionless)
ϕ angle of internal friction (°)
ψ slope angle of trench (°)

ABBREVIATIONS

AI artificial intelligence
ANN artificial neural networks
C1–C4 external validation criteria
CCL compacted clay liner
CV cross-validation
GCL geosynthetic clay liner
GM geomembrane
GP genetic programming
IQR inter-quartile range
LWR locally weighted regression
MAPE mean absolute percentage error
ML machine learning
MLP multilayer perceptron
MSW municipal solid waste
RF random forest
RFR random forest regression
SVM support vector machines
SVR support vector regression
XGB extreme gradient boosting
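For readers who wish to reproduce the statistical checks behind the notation above, the sketch below computes the error metrics (MSE, RMSE, R2, Radj2) and the regression-through-origin slopes k and k′ used in external validation. The acceptance threshold shown follows the commonly cited Golbraikh-Tropsha style criteria and is an illustrative assumption, not necessarily the exact C1–C4 limits adopted in this study.

# Hedged sketch: error metrics and external-validation statistics from the notation above.
# The slope threshold is the usual literature value and is shown only for illustration.
import numpy as np

def validation_report(y_obs, y_pred, p):
    """y_obs, y_pred: 1-D arrays of observed and predicted tensile force; p: number of independent variables."""
    y_obs = np.asarray(y_obs, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    N = y_obs.size

    mse = np.mean((y_obs - y_pred) ** 2)
    rmse = np.sqrt(mse)
    ss_res = np.sum((y_obs - y_pred) ** 2)
    ss_tot = np.sum((y_obs - y_obs.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    r2_adj = 1.0 - (1.0 - r2) * (N - 1) / (N - p - 1)

    # Regression-through-origin slopes: observed on predicted (k) and predicted on observed (k').
    k = np.sum(y_obs * y_pred) / np.sum(y_pred ** 2)
    k_dash = np.sum(y_obs * y_pred) / np.sum(y_obs ** 2)

    return {
        "MSE": mse, "RMSE": rmse, "R2": r2, "Radj2": r2_adj,
        "k": k, "k_dash": k_dash,
        "slope_ok": (0.85 <= k <= 1.15) or (0.85 <= k_dash <= 1.15),  # illustrative threshold
    }

# Example with placeholder arrays standing in for measured and predicted tensile forces (N/m).
rng = np.random.default_rng(2)
y_obs = rng.uniform(10.0, 60.0, 152)
y_pred = y_obs + rng.normal(0.0, 3.0, 152)
print(validation_report(y_obs, y_pred, p=6))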