
Prediction of marshall design parameters of asphalt mixtures via machine learning algorithms based on literature data


Road Materials and Pavement Design

ISSN: (Print) (Online) Journal homepage: www.tandfonline.com/journals/trmp20

Prediction of Marshall design parameters of
asphalt mixtures via machine learning algorithms
based on literature data

Mert Atakan & Kürşat Yıldız

To cite this article: Mert Atakan & Kürşat Yıldız (2024) Prediction of Marshall design
parameters of asphalt mixtures via machine learning algorithms based on literature data,
Road Materials and Pavement Design, 25:3, 454-473, DOI: 10.1080/14680629.2023.2213774
Published online: 23 May 2023.

ROAD MATERIALS AND PAVEMENT DESIGN
2024, VOL. 25, NO. 3, 454–473
Prediction of Marshall design parameters of asphalt mixtures via
machine learning algorithms based on literature data

Mert Atakan and Kürşat Yıldız

Department of Civil Engineering, Faculty of Technology, Gazi University, Ankara, Turkey



ABSTRACT
Previous studies have achieved accurate predictions for Marshall design parameters (MDPs), but
their limited data and input variables might restrict generalisation. In this study, machine
learning (ML) was used to predict MDPs with more generalised models. To achieve this, a dataset
was collected from six different papers. Inputs were material properties and their ratios in the
mixture, while target features were six MDPs used in mixture design. Four ML algorithms were
used: linear regression, polynomial regression, k nearest neighbour (KNN) and support vector
regression (SVR). The cross-validation (CV) method was also used to assess the generalisation
capability of the models. The accuracy of SVR was the highest; however, its performance dropped
considerably in nested CV. Therefore, KNN was recommended due to its second highest performance.
The results demonstrated that prediction of MDPs from material properties alone is possible and
promising for use in mixture design.

ARTICLE HISTORY
Received 21 November 2021
Accepted 9 May 2023

KEYWORDS
Asphalt mixture design; machine learning; Marshall design; prediction model; virtual design

Abbreviations: ANN: artificial neural network; BC: bitumen content; BP:
bitumen penetration (1/10 mm); CV: cross-validation; DEM: discrete ele-
ment method; GA: genetic algorithm; Gmb: Bulk specific gravity of mixture;
Gmm: Maximum specific gravity of mixture; Gsb: bulk specific gravity of
aggregate; KNN: k nearest neighbour; LA: Los Angeles abrasion; LR: linear
regression; MARS: multivariate adaptive regression spline; MDP: Marshall
design parameter; MF: Marshall flow; MQ: Marshall quotient (kN/mm); MS:
Marshall stability; NMAS: nominal maximum aggregate size; NoB: number
of blows; PI: penetration index; PR: polynomial regression; R2: coefficient
of determination; SP: softening point (°C); SVR: support vector regression;
UPVT: ultrasonic pulse velocity–time; Va: air voids percentage; VFA: voids
filled with asphalt; VMA: voids in mineral aggregate; WA: water absorption.


1. Introduction

The durability and performance of the asphalt pavement are highly affected by the mechanical and
volumetric properties of the mixture (Sebaaly et al., 2018). In order to provide required mechanical and
volumetric characteristics, asphalt mixture design is carried out. In other words, mixture design is the
most significant factor that affects road performance.

Today, Marshall, Hveem, and Superpave methods are commonly used design methods in the world
(Jiang et al., 2018). There are some differences between these methods such as compaction style, size
of the specimen and mechanical tests applied to the specimen. But basically, mixture design starts
by producing various asphalt mixture specimens in different binder content and gradation. Then, it


© 2023 Informa UK Limited, trading as Taylor & Francis Group


is determined which of the specimens meet the necessary performance criteria. These methods are
very time-consuming, demanding and expensive. For example, a Superpave mixture design might take
approximately 7.5 working days (Ozturk & Emin Kutay, 2014). That is why a prediction-based approach
to mixture design is vitally important. Accordingly, predicting the physical and mechanical properties
of the asphalt mixture from its material characteristics without excessive laboratory work is essential.
In this regard, researchers have employed two basic approaches: numerical simulations (e.g. Discrete
element method (DEM), finite element method, and user defined algorithms) and soft computing
methods like machine learning (ML) (Liu et al., 2022).

There have been many studies using numerical methods to predict characteristics of asphalt concrete
such as air voids, density and rutting. Li and Wang (2017) performed a Marshall design by predicting
the Marshall characteristics of virtual specimens produced in a DEM simulation. Similarly, Shen and
Yu (2011) used DEM to predict the voids in mineral aggregate (VMA) of asphalt concrete. Jin et al.
(2022) used a user-defined algorithm to generate the internal structure of asphalt concrete based on
aggregate contacts. A large and growing body of literature has also investigated physics engine
simulation to produce virtual asphalt specimens. Garcia-Hernandez et al. (2021) produced virtual
Marshall specimens via a physics engine called Nvidia PhysX to predict the air void content of the
asphalt specimens. Likewise, Komaragiri et al. (2021) used the Bullet physics engine to simulate
gyratory compaction and predict the density of the asphalt specimens.

ML basically means extracting knowledge from data (Müller & Guido, 2016). To achieve this, the data
are arranged as a table where columns represent features and each row represents a single observation
or case (Theobald, 2017). Then, input and target variables are selected. After that, the data are split
into training and test sets. Once the data are split, a prediction model is trained on the training
data via one of various learning algorithms. This model can predict target values from the input
values. Finally, the performance of the model is measured on the test data by comparing the predicted
values with the real values.

Many studies have predicted the mechanical and physical properties of asphalt mixtures using either
ML algorithms or soft computing techniques such as artificial neural network (ANN), genetic algorithm
(GA) and fuzzy logic. Majidifard et al. (2019, 2020) built a gene expression model and a deep learning
ML model to predict the rut depth and the fracture energy, and were thus able to make asphalt mixture
designs based on these predictions. Miani et al. (2021) created an ANN model to predict basic
characteristics of the asphalt such as stiffness modulus, Marshall stability (MS), Marshall flow (MF)
and air voids percentage (Va). Some other previous studies are listed in Table 1. In a considerable
amount of literature, volumetric characteristics of asphalt specimens (e.g. Va, VMA, voids filled with
asphalt (VFA)) have been used to predict mechanical properties such as MS, MF and stiffness. For
instance, Aksoy et al. (2012) used some Marshall design parameters (MDPs) such as Va, density and VMA
as inputs to predict MS, MF and Marshall quotient (MQ). These parameters are well correlated with
Marshall test results; however, they are obtained by experiment. Unless Marshall specimens are
produced, MS, MF and MQ cannot be predicted with this type of model. In other words, although these
kinds of models have made successful predictions, they are not sufficient to reduce laboratory labour
and time. Thus, it is necessary to build a prediction model that does not require any experimental
input variable such as Va, VMA or VFA. More specifically, a better prediction model should use
material properties such as bitumen type and aggregate gradation as inputs. In this way, it is possible
to predict all MDPs without producing any specimen. For example, Azarhoosh and Pouresmaeil (2020),
Nguyen et al. (2019), Sebaaly et al. (2018), Khuntia et al. (2014) and Ozgan (2009) did not use
volumetric parameters as inputs. Therefore, their models could be used to predict design parameters
with high accuracy and without any laboratory work. However, these studies might not generalise with
high accuracy due to limited input features, limited feature range (e.g. only a couple of bitumen or
aggregate types) or small dataset size. To sum up, some of the previous studies used MDPs as input
variables, which cannot reduce laboratory labour. Others created high-performance prediction models,
but their models might not generalise. In other words, they may not work properly with other bitumen
and aggregate types or in another laboratory environment.

Table 1. Summary of the previous studies.

References: Ozgan (2009); Tapkin et al. (2009); Tapkin et al. (2010); Mirzahosseini et al. (2011);
Ozgan (2011); Gandomi et al. (2011); Aksoy et al. (2012); Khuntia et al. (2014); Sebaaly et al. (2018);
Baldo et al. (2018); Nguyen et al. (2019); Ghanizadeh et al. (2020); Azarhoosh and Pouresmaeil (2020);
Shah et al. (2020).

Input variables: Coarse/fine aggregate percentage; Filler percentage; Aggregate type; Aggregate shape
and texture; Gsb; Filler/bitumen ratio; Binder type; Binder ratio; Marshall test temperature; Exposure
time to test temperature; Ultrasonic pulse velocity–time of sample; Sample volume; Sample height;
Sample production method; Number of blows in compaction; Saturated surface dry specific gravity; Gmb;
MS; MF; MQ; Va; VMA; VFA; Other additives type/ratio; Repeated creep test properties.

[The marks showing which study used which input variable are not recoverable from this copy.]


Table 1. Continued.

Predicted variables: Gmm; MS; MF; MQ; Va; VMA; VFA; Indirect tensile strength; Stiffness modulus;
Flow number.

Method: GA; Particle swarm optimisation; Support vector machine; ANN; KNN; Multiple LR; Multivariate
adaptive regression spline; Fuzzy inference systems.

References: Shah et al. (2020); Azarhoosh and Pouresmaeil (2020); Ghanizadeh et al. (2020); Nguyen
et al. (2019); Baldo et al. (2018); Sebaaly et al. (2018); Khuntia et al. (2014); Aksoy et al. (2012);
Gandomi et al. (2011); Ozgan (2011); Mirzahosseini et al. (2011); Tapkin et al. (2010); Tapkin et al.
(2009); Ozgan (2009).

[The marks showing which predicted variables and methods belong to each study are not recoverable
from this copy.]


In this study, one of the aims is to establish a prediction model whose inputs consist solely
of material properties, so that virtual Marshall specimens can be produced without any laboratory
effort in the future. Therefore, more input variables are considered simultaneously than in previous
studies, and some of them are used for the first time, such as Los Angeles abrasion (LA), penetration
index (PI), softening point (SP) and bitumen penetration (BP). The other aim is to obtain a more
generalised model. To achieve this, datasets from different studies were combined. In this
way, the constructed dataset contains data from different laboratories and is more
representative.

2. Methodology

In this study, the Python programming language and its libraries such as scikit-learn were used. The
methodology followed is shown as a flow chart in Figure 1. First, data from different studies were
collected, then modified and combined into one dataset. After that, two columns were added to the
dataset. The first was the PI, calculated from the SP and BP values. The second was the MQ, calculated
as the ratio of MS to MF. Once these two columns were added, missing values were imputed using the
DataWig library, which is explained in detail in Section 2.1.2. After that, another column named
VFA was added to the dataset. VFA was calculated from the VMA and Va values. Because some of the Va
and VMA values were missing at the beginning, the VFA column could not be added before imputing the
missing values; it was added after imputation instead.
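
The derived columns can be illustrated with a minimal sketch. The MQ ratio follows the text. The VFA
expression is the standard volumetric relation VFA = 100 (VMA − Va) / VMA, which is assumed here
because the paper does not print the formula; the PI calculation is left out for the same reason.
The function names and the numbers are hypothetical.

```python
def marshall_quotient(ms_kn: float, mf_mm: float) -> float:
    """MQ (kN/mm) as the ratio of Marshall stability to Marshall flow."""
    if mf_mm <= 0:
        raise ValueError("Marshall flow must be positive")
    return ms_kn / mf_mm


def voids_filled_with_asphalt(vma_pct: float, va_pct: float) -> float:
    """VFA (%) from VMA (%) and Va (%), assuming VFA = 100 * (VMA - Va) / VMA."""
    if vma_pct <= 0:
        raise ValueError("VMA must be positive")
    return 100.0 * (vma_pct - va_pct) / vma_pct


# Illustrative values only (not rows from the paper's dataset).
mq = marshall_quotient(ms_kn=12.0, mf_mm=3.0)
vfa = voids_filled_with_asphalt(vma_pct=16.0, va_pct=4.0)
print(f"MQ = {mq:.2f} kN/mm, VFA = {vfa:.1f} %")
```

Because VFA needs both VMA and Va, a row with either value missing cannot be filled this way, which
is consistent with adding the column only after imputation.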

Once the dataset was completed, input and target features were determined. Next, the data were
divided into train and test sets without any scaling, and linear regression (LR) was applied to the
training data. For the other models, however, the data were scaled using the standard scaler function
in the scikit-learn library before training. In the training process, two different approaches were
employed. In the first, the dataset was randomly divided into train and test sets with the train–test
split function, and the models were trained with the train set. In the second, the dataset was divided
into more than one train and test group using k-fold cross-validation (k-fold CV), and coefficient of
determination (R2) values were calculated for each model. The differences between these approaches
are explained further in the related sections.

2.1. Dataset

2.1.1. Data collection
Data from six different studies were used to create a bigger dataset (Table 2). In total, there were 407
rows, namely specimens, in the dataset. The steps of creating the dataset are as follows:

1. Choosing features
2. Getting data from each study for the chosen features
3. Joining the data from all studies into one dataset

First, we determined 14 specimen features in total, considering previous studies and important
features that might affect the mixture design results. In other words, basic material characteristics
that can change MDPs such as Va, MS or MF were used as input features. When choosing input features,
we took particular care not to choose an MDP as an input. Although MDPs like Va might correlate
highly with MS and MF, the real aim of this study was to predict all MDPs without producing any
Marshall specimens. That is why all six MDPs were chosen as target features. Once all features were
determined, three additional features (i.e. PI, VFA and MQ) were generated from the existing column
values. These features are listed in Table 3.


Figure 1. Workflow diagram of the current study.

The challenges we encountered when creating the data set are given below:

• Some of the researchers shared the gradation of aggregates only as a graph. We used a software
package named GetData Graph Digitizer to obtain the percentages of the coarse and fine aggregates.
This software can scale the image and redraw a readable graph over it. In this way, the necessary
values can be read from the graphs.

• While some researchers defined coarse aggregate as larger than 4.75 mm, others defined it as
larger than 2.36 mm. We assumed coarse aggregate to be larger than 2.36 mm. Accordingly, when
creating the dataset, all data taken from the studies were modified to match this assumption.


Table 2. Number of specimens of the studies that comprise this study’s dataset.

Reference                            Number of specimens
Mirzahosseini et al. (2011)          118
Azarhoosh and Pouresmaeil (2020)     90
Nguyen et al. (2019)                 60
Baldo et al. (2018)                  60
Aksoy et al. (2012)                  63
Tapkin et al. (2010)                 16
Total                                407

Table 3. Features of the created dataset.

Aggregate                                         Bitumen               Mixture
1. Nominal maximum aggregate size (NMAS) (mm)     6. BP (1/10 mm)       10. BC (%)
2. Coarse agg. (%)                                7. SP (°C)            11. Va (%)
3. Filler (%)                                                           12. VMA (%)
4. LA (%)                                         8. Number of blows    13. VFA (%)ᵃ
5. WA (%)                                         9. PIᵃ                14. MS (kN)
                                                                        15. MF (mm)
                                                                        16. MQ (kN/mm)ᵃ
                                                                        17. Gsb (g/cm3)

ᵃ Features generated from other columns.

• In some studies, bitumen content (BC) was reported by weight of the mixture (e.g. Baldo
et al., 2018). These values were converted to a by-weight-of-aggregate basis to provide equivalency
among the different studies.

• In Baldo et al. (2018), the number of blows used in compaction is not reported; it is only
stated that specimens were prepared according to EN 12697-30. This standard states that the
number of blows should be between 25 and 100, and that 50 blows are generally used.
Therefore, 50 blows were assumed for the compaction in Baldo et al. (2018).

• Units were converted to the same unit for each feature.
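
The bitumen content conversion mentioned in the list above can be sketched as follows. The paper does
not print the formula; the mass-balance relation P_agg = 100 · P_mix / (100 − P_mix) is assumed here,
and the function name is hypothetical.

```python
def bc_mixture_to_aggregate_basis(bc_by_mixture_pct: float) -> float:
    """Convert bitumen content from % by weight of mixture to % by weight of aggregate.

    With binder mass b and aggregate mass a:
        P_mix = 100 * b / (a + b)  and  P_agg = 100 * b / a,
    which gives P_agg = 100 * P_mix / (100 - P_mix).
    """
    if not 0.0 <= bc_by_mixture_pct < 100.0:
        raise ValueError("bitumen content must be in [0, 100) percent")
    return 100.0 * bc_by_mixture_pct / (100.0 - bc_by_mixture_pct)


# Illustrative value only: 5 % by weight of mixture.
bc_agg = bc_mixture_to_aggregate_basis(5.0)
print(f"{bc_agg:.3f} % by weight of aggregate")
```

The converted value is always slightly larger than the by-mixture value, since the same binder mass
is divided by a smaller reference mass.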

2.1.2. Handling missing values
There were some missing values in the dataset because some features were not reported in the studies
we used. The DataWig library was used to impute the missing values. This approach employs automatic
hyperparameter tuning in deep learning feature extraction. Therefore, even users without a deep
learning background can benefit from the library (Biessmann et al., 2019).

We used one of DataWig's functions, ‘SimpleImputer.complete’, to impute the missing values.
This function fits an imputation model for each column, using all other columns as inputs. The
statistical descriptions of the data with missing values and after imputation are presented in
Tables 4 and 5, respectively.
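
The idea of imputing each column from all the others can be illustrated with a small sketch. DataWig
itself is not used here: as a stand-in, the example relies on scikit-learn's IterativeImputer, which
also models each incomplete column from the remaining columns but with a regression model rather than
DataWig's deep learning approach. The synthetic columns and all variable names are assumptions, and
the output is not expected to reproduce Table 5.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401  (activates IterativeImputer)
from sklearn.impute import IterativeImputer

rng = np.random.default_rng(0)

# Synthetic stand-in for three correlated numeric columns (not the paper's data).
n_rows = 200
col_a = rng.normal(16.0, 1.5, n_rows)
col_b = 0.3 * col_a + rng.normal(0.0, 0.2, n_rows)
col_c = rng.normal(5.5, 0.8, n_rows)
table = np.column_stack([col_a, col_b, col_c])

# Blank out roughly 15 % of the second column to mimic unreported values.
missing_rows = rng.random(n_rows) < 0.15
table_with_gaps = table.copy()
table_with_gaps[missing_rows, 1] = np.nan

# Each column with gaps is regressed on the remaining columns.
imputer = IterativeImputer(random_state=0, max_iter=10)
table_imputed = imputer.fit_transform(table_with_gaps)

print("missing before:", int(np.isnan(table_with_gaps).sum()))
print("missing after: ", int(np.isnan(table_imputed).sum()))
```

Observed entries are left untouched; only the blanked cells receive model-based estimates.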

Once the missing values were imputed, the dataset was complete. However, some features, such as
water absorption (WA) and bulk specific gravity of aggregate (Gsb), were not used in the models
because they had a high number of missing values in the first place; using them might have led to
high bias in the models.

2.1.3. Splitting the data as train and test sets
The ‘train_test_split’ function in the scikit-learn library was used to split the data into train and
test sets. Train and test sizes were chosen as 0.67 and 0.33, respectively. The random state number,
which makes the split identical every time the code runs, was set to 0.
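
A minimal sketch of this split is given below. The arrays are placeholders with the same number of
rows as the dataset (407); the real feature columns are not reproduced, and the variable names are
assumptions.

```python
import numpy as np
from sklearn.model_selection import train_test_split

n_specimens = 407
# Placeholder inputs and target with the dataset's row count (not the real data).
X = np.arange(n_specimens * 3, dtype=float).reshape(n_specimens, 3)
y = np.arange(n_specimens, dtype=float)

# test_size=0.33 leaves the remaining rows for training; random_state=0 fixes the shuffle.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=0
)
print(len(X_train), len(X_test))
```

With 407 rows this gives 272 training and 135 test rows, because scikit-learn rounds the test share
up to a whole number of rows.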

Table 4. Statistical description of the data with missing values.

Feature           Count   Mean    Std     Min     25%     50%     75%     Max
NMAS (mm)         407     13.84   3.63    9.50    12.50   12.50   19.00   19.00
Coarse agg. (%)   407     63.70   11.98   33.00   56.00   67.00   68.00   83.00
Filler (%)        407     6.28    2.41    1.00    5.00    6.00    7.00    10.50
BC (%)            407     5.50    0.81    3.50    5.00    5.40    6.00    7.50
Va (%)            347     4.83    1.60    0.40    3.73    4.60    6.03    9.44
VMA (%)           347     15.96   1.37    12.10   14.90   16.00   17.03   19.04
MS (kN)           407     11.15   2.85    2.73    8.98    10.90   13.22   18.33
MF (mm)           407     3.19    0.77    1.60    2.66    3.10    3.60    6.90
LA (%)            391     20.24   5.54    12.00   16.22   25.00   25.00   26.00
WA (%)            289     1.35    0.73    0.57    0.80    0.90    2.20    2.40
Gsb (g/cm3)       257     2.58    0.09    2.49    2.49    2.62    2.66    2.71
BP (1/10 mm)      407     63.78   11.39   45.00   62.00   63.00   65.00   91.00
SP (°C)           407     52.82   8.55    45.60   49.00   49.00   52.05   78.80
NoB               407     64.32   12.06   45.00   50.00   75.00   75.00   75.00
MQ                407     3.72    1.34    0.61    2.72    3.44    4.53    7.35
PI                407     −0.02   0.06    −0.12   −0.06   −0.05   0.01    0.13


Table 5. Statistical description of the data with imputed values.

Feature           Count   Mean    Std     Min     25%     50%     75%     Max
NMAS (mm)         407     13.84   3.63    9.50    12.50   12.50   19.00   19.00
Coarse agg. (%)   407     63.70   11.98   33.00   56.00   67.00   68.00   83.00
Filler (%)        407     6.28    2.41    1.00    5.00    6.00    7.00    10.50
BC (%)            407     5.50    0.81    3.50    5.00    5.40    6.00    7.50
Va (%)            407     4.83    1.50    0.40    3.80    4.70    5.78    9.44
VMA (%)           407     16.30   1.54    12.10   15.12   16.26   17.47   19.77
MS (kN)           407     11.15   2.85    2.73    8.98    10.90   13.22   18.33
MF (mm)           407     3.19    0.77    1.60    2.66    3.10    3.60    6.90
LA (%)            407     20.12   5.47    12.00   16.22   25.00   25.00   26.00
WA (%)            407     1.92    1.10    0.57    0.80    2.20    2.96    4.07
Gsb (g/cm3)       407     2.61    0.09    2.49    2.49    2.62    2.71    2.73
BP (1/10 mm)      407     63.78   11.39   45.00   62.00   63.00   65.00   91.00
SP (°C)           407     52.82   8.55    45.60   49.00   49.00   52.05   78.80
NoB               407     64.32   12.06   45.00   50.00   75.00   75.00   75.00
MQ                407     3.72    1.34    0.61    2.72    3.44    4.53    7.35
PI                407     −0.02   0.06    −0.12   −0.06   −0.05   0.01    0.13


2.1.4. Scaling the data
The data were scaled before training for all models except LR. We used the standard scaler in the
scikit-learn library. It standardises each feature so that it has a mean of 0 and a standard
deviation of 1. The standardised value z of a sample x was calculated as in Equation (1),
where x̄ is the mean of the samples and σ is the standard deviation of the samples.

\[ z = \frac{x - \bar{x}}{\sigma} \tag{1} \]
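
As a quick check of Equation (1), the sketch below standardises one illustrative column by hand and
with scikit-learn's StandardScaler. The numbers are arbitrary and the variable names are assumptions.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# One illustrative feature column (arbitrary values, not a column of the dataset).
values = np.array([[45.0], [62.0], [63.0], [65.0], [91.0]])

# Equation (1) applied by hand; np.std defaults to the population standard deviation (ddof=0).
z_manual = (values - values.mean(axis=0)) / values.std(axis=0)

# The same transformation with the library scaler.
z_sklearn = StandardScaler().fit_transform(values)

print(np.allclose(z_manual, z_sklearn))
print(z_sklearn.mean(), z_sklearn.std())
```

The two results agree because StandardScaler also divides by the population standard deviation.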

2.2. Model performance assessment

2.2.1. Prediction performance

In order to assess the prediction performance of the models, we used the score function in the
scikit-learn library. This function returns the coefficient of determination, namely the R2 value.
It was calculated as in Equation (4), where RSS is the residual sum of squares and TSS is the total
sum of squares. In Equation (2), yi is the ith value to be predicted, f(Xi) is the predicted value
of yi, and n is the upper limit of the summation. In Equation (3), yi is the ith value in the sample,
ȳ is the mean value of the sample and n is the upper limit of the summation.

The value of R2 can be as high as 1.00 at maximum. It can be also negative when the prediction
performance is poor. Therefore, the closer this value is to 1.00, the higher the prediction performance
of the model.

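\[ \mathrm{RSS} = \sum_{i=1}^{n} \left( y_i - f(X_i) \right)^2 \tag{2} \]

\[ \mathrm{TSS} = \sum_{i=1}^{n} \left( y_i - \bar{y} \right)^2 \tag{3} \]

\[ R^2 = 1 - \frac{\mathrm{RSS}}{\mathrm{TSS}} \tag{4} \]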
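
Equations (2)–(4) can be written out directly, which also shows why R2 can become negative. This is
a plain-Python sketch with made-up numbers; the function name is hypothetical.

```python
def r2_from_sums(y_true, y_pred):
    """Coefficient of determination as 1 - RSS / TSS, following Equations (2)-(4)."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    y_mean = sum(y_true) / n
    rss = sum((y - f) ** 2 for y, f in zip(y_true, y_pred))  # Equation (2)
    tss = sum((y - y_mean) ** 2 for y in y_true)             # Equation (3)
    if tss == 0:
        raise ValueError("TSS is zero: all observed values are identical")
    return 1.0 - rss / tss                                   # Equation (4)


y_true = [2.0, 4.0, 6.0, 8.0]
close_fit = r2_from_sums(y_true, [2.5, 3.5, 6.5, 7.5])   # small residuals
mean_only = r2_from_sums(y_true, [5.0, 5.0, 5.0, 5.0])   # always predicts the mean
reversed_fit = r2_from_sums(y_true, [8.0, 6.0, 4.0, 2.0])  # worse than the mean
print(close_fit, mean_only, reversed_fit)
```

A model that always predicts the mean scores 0, and one whose residual sum of squares exceeds the
total sum of squares scores below 0.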

2.2.2. k-Fold CV
CV is used to assess the generalisation performance of the prediction model. It is more stable and
comprehensive than the basic train/test split method. One of the most common CV methods is k-fold
CV. The number k is chosen by the user and is commonly set to 5 or 10 (Müller & Guido, 2016). We
used the cross_val_score function in the scikit-learn library to perform CV. The number k was
selected as 5 except for the support vector regression (SVR) model. In addition, the random state
was 1, and the shuffle option was true. Once the parameters were decided, this function divided the
data into five parts, which are called folds. Then, five separate training runs, called splits, were
carried out (Figure 2). For instance, in split 1, the training data were composed of folds 2–5 and
the test data were composed of fold 1. Finally, the CV score was calculated as the average of the
accuracy scores of the five models established in the splits.
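
The procedure described above can be sketched as follows. The shuffle and random state settings are
passed through a KFold object, which is one way to obtain the stated configuration and is an
assumption about how the function was called; the data are synthetic and the model is a plain linear
regression chosen only for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score

rng = np.random.default_rng(1)
# Synthetic regression data (not the paper's dataset).
X = rng.normal(size=(100, 4))
y = X @ np.array([1.5, -2.0, 0.5, 0.0]) + rng.normal(scale=0.1, size=100)

# Five folds, shuffled, with random state 1 as stated in the text.
cv = KFold(n_splits=5, shuffle=True, random_state=1)
scores = cross_val_score(LinearRegression(), X, y, cv=cv, scoring="r2")

print("R2 per split:", scores.round(3))
print("CV score (mean):", round(float(scores.mean()), 3))
```

Each entry of `scores` corresponds to one split, and the mean is the quantity reported as the CV
score.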

2.2.3. Grid search in SVR
Grid search tries the candidate parameter values one by one and returns the best combination. We
used the GridSearchCV function in the scikit-learn library to perform the grid search. This function
also uses CV to reduce overfitting and bias; we selected the five-fold CV option. First, the data
were scaled and divided into test and training sections. The training set was used in the grid search,
and the test set was used to calculate the test set score. Next, a parameter grid was created as shown
in Table 6. Six values were tried for each of C and γ, both of which are hyperparameters of the SVR
algorithm, giving 6 × 6 = 36 combinations. Since five-fold CV was also used, 36 × 5 = 180 models
were built to choose the best parameters. Then, the model accuracy was calculated for every model and

Figure 2. Illustration of k-fold CV when k = 5.

Table 6. Parameters used in grid search.

#    C          γ
1    1          0.0001
2    10         0.001
3    100        0.01
4    1000       0.1
5    10,000     1
6    100,000    10

the best parameters were found. Finally, test set scores were calculated on the test set data using
the best parameters. This process was repeated for every target variable.
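
A minimal sketch of this grid search is shown below, using the C and γ values of Table 6. The data
are synthetic, the kernel is scikit-learn's default (the paper does not state it here), and the
`max_iter` cap is added only to keep the sketch quick; none of these are taken from the paper.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(0)
# Synthetic data standing in for inputs and one target (not the paper's dataset).
X = rng.uniform(-2.0, 2.0, size=(90, 2))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] + rng.normal(scale=0.05, size=90)

# Scale first, then split, following the order described in the text.
X_scaled = StandardScaler().fit_transform(X)
X_train, X_test, y_train, y_test = train_test_split(
    X_scaled, y, test_size=0.33, random_state=0
)

# Grid from Table 6: six values of C times six values of gamma.
param_grid = {
    "C": [1, 10, 100, 1000, 10_000, 100_000],
    "gamma": [0.0001, 0.001, 0.01, 0.1, 1, 10],
}

# max_iter is an assumption of this sketch (speed guard), not a setting from the paper.
search = GridSearchCV(SVR(max_iter=20_000), param_grid, cv=5, scoring="r2")
search.fit(X_train, y_train)

n_fits = len(search.cv_results_["params"]) * search.n_splits_
print("models fitted during the search:", n_fits)
print("best parameters:", search.best_params_)
print("best mean CV score:", round(search.best_score_, 3))
print("test set score:", round(search.score(X_test, y_test), 3))
```

The fit count printed by the sketch corresponds to the 36 × 5 models described in the text; the
final refit on the whole training set is not included in that count.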

2.2.4. Nested CV in SVR
As mentioned before, when the GridSearchCV function was used, we first divided the data into training
and test sections and then used the test set to obtain the model performance. However, the results of
this method relied too heavily on that single split, and the actual score of a more generalised model
might be quite different. In order to obtain a more generalised accuracy score, we used CV at two
levels, which is called nested CV. In this method, the data were first divided with k-fold CV, and
grid search CV was then applied in every fold. Since k was chosen as 5, five accuracy scores were
obtained for every target variable. Therefore, while there were 180 models in the normal grid search
CV, there were 180 × 5 = 900 models for every target variable in nested CV.
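
Nested CV can be sketched by passing a grid search object to an outer cross-validation loop. The
data are synthetic, the fold settings of the inner and outer loops are assumptions, and the
`max_iter` cap is only a speed guard for the sketch.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score
from sklearn.svm import SVR

rng = np.random.default_rng(0)
# Synthetic data (not the paper's dataset).
X = rng.uniform(-2.0, 2.0, size=(80, 2))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] + rng.normal(scale=0.05, size=80)

param_grid = {
    "C": [1, 10, 100, 1000, 10_000, 100_000],
    "gamma": [0.0001, 0.001, 0.01, 0.1, 1, 10],
}

# Inner loop: five-fold grid search that picks C and gamma.
inner_cv = KFold(n_splits=5, shuffle=True, random_state=1)
inner_search = GridSearchCV(
    SVR(max_iter=20_000),  # iteration cap is an assumption of this sketch
    param_grid,
    cv=inner_cv,
    scoring="r2",
)

# Outer loop: five folds, each running its own grid search on its training part.
outer_cv = KFold(n_splits=5, shuffle=True, random_state=1)
nested_scores = cross_val_score(inner_search, X, y, cv=outer_cv, scoring="r2")

print("R2 per outer split:", nested_scores.round(3))
print("nested CV mean:", round(float(nested_scores.mean()), 3))
```

Each outer split selects its parameters without seeing its own test fold, so the five scores
estimate how the whole tuning procedure generalises rather than how one tuned model performs on one
split.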

2.3. Algorithms

We used four different techniques to create prediction models, namely, LR, polynomial regression (PR),
k nearest neighbour (KNN) and SVR. These algorithms were chosen based on previous studies. The
model performance was measured with the coefficient of determination.

2.3.1. LR
The ‘linear_model.LinearRegression’ method in the scikit-learn library was used to carry out LR. The
model was trained with the training set obtained by the train/test split method. Unlike the other
algorithms, LR used the data without scaling, because scaling was found to be unnecessary for LR.
Lastly, model performance was evaluated according to R2 values.

2.3.2. PR
First, the data were scaled and divided into training and test sections. Next, the polynomial
features method (PolynomialFeatures) in the scikit-learn library was used to generate polynomial terms
of the input variables up to a given degree. Various degrees up to 10 were tried for all target
variables, and the second degree was selected because it provided the best results. After that, the
linear_model.LinearRegression function was applied to the train set in order to create the model.
Once the model was trained, model accuracy was determined using the test data. In addition, the
k-fold CV score was calculated to assess the generalisation performance of the second-degree PR.
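
The degree comparison can be sketched with a small pipeline. The data are synthetic with a known
second-degree structure, only degrees 1 to 3 are tried, and the pipeline fits the scaler on the
training part only, which is a choice of this sketch rather than the order described in the text.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

rng = np.random.default_rng(0)
# Synthetic target with squared and interaction terms (not the paper's dataset).
X = rng.uniform(-1.0, 1.0, size=(150, 3))
y = (
    1.0
    + 2.0 * X[:, 0]
    - X[:, 1] ** 2
    + 0.5 * X[:, 0] * X[:, 2]
    + rng.normal(scale=0.05, size=150)
)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=0
)

test_scores = {}
for degree in (1, 2, 3):
    # Scale, expand to polynomial terms of the given degree, then fit ordinary least squares.
    model = make_pipeline(
        StandardScaler(), PolynomialFeatures(degree=degree), LinearRegression()
    )
    model.fit(X_train, y_train)
    test_scores[degree] = model.score(X_test, y_test)  # R2 on the held-out part

for degree, score in test_scores.items():
    print(f"degree {degree}: test R2 = {score:.3f}")
```

On data like these the second-degree model captures the curvature that the first-degree model
misses, which mirrors the kind of comparison used to pick the degree.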

2.3.3. KNN
First, the data were scaled and then split into train and test sections. The KNeighborsRegressor
function was used to train the model. KNN has two important parameters. The first is the number of
neighbours and the second is the method of measuring the distance between data points (Müller
& Guido, 2016). We tried 10 different neighbour numbers from 1 to 10, and the distance function was
left as Minkowski, which is the default. In addition, k-fold CV scores were calculated with five
splits, shuffle set to true and a random state of 1.
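
The neighbour sweep can be sketched as below. The data are synthetic and the variable names are
assumptions; the settings follow the text (neighbour numbers 1 to 10, default Minkowski distance,
five shuffled folds with random state 1).

```python
import numpy as np
from sklearn.model_selection import KFold, cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Synthetic data (not the paper's dataset).
X = rng.uniform(0.0, 1.0, size=(200, 3))
y = np.sin(4.0 * X[:, 0]) + X[:, 1] * X[:, 2] + rng.normal(scale=0.05, size=200)

X_scaled = StandardScaler().fit_transform(X)
X_train, X_test, y_train, y_test = train_test_split(
    X_scaled, y, test_size=0.33, random_state=0
)
cv = KFold(n_splits=5, shuffle=True, random_state=1)

results = {}
for k in range(1, 11):
    knn = KNeighborsRegressor(n_neighbors=k)  # default metric: Minkowski
    knn.fit(X_train, y_train)
    train_r2 = knn.score(X_train, y_train)
    test_r2 = knn.score(X_test, y_test)
    cv_r2 = cross_val_score(knn, X_scaled, y, cv=cv, scoring="r2").mean()
    results[k] = (train_r2, test_r2, cv_r2)

best_k = max(results, key=lambda k: results[k][1])  # neighbour number with the best test R2
for k, (train_r2, test_r2, cv_r2) in results.items():
    print(f"k={k:2d}  train={train_r2:.3f}  test={test_r2:.3f}  cv={cv_r2:.3f}")
print("best neighbour number on the test set:", best_k)
```

With one neighbour the training score is perfect because every training point is its own nearest
neighbour, which is why the training curve alone cannot be used to choose the neighbour number.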

2.3.4. SVR
Because SVR is sensitive to the chosen parameters, grid search was applied to the SVR model. First,
the data were scaled and then split into train and test sections. Then, grid search was carried out
on the train section of the data, and the best parameters were chosen. Using the best parameters,
model performance was determined on the test data. Nested CV was also applied to the SVR model. In
this method, the k-fold CV and grid search functions were nested.


3. Results and discussion

3.1. Limitations

There are two major limitations in this study that could be addressed in future research. First, we used
data from previous papers in this study. Therefore, the data are not well structured in terms of the dis-
tribution of the values and the variety of the variables. Second, there were some missing values in the
dataset because these values were not presented in the past studies. That is why we used data impu-
tation to handle the missing values which might create some bias in the prediction models. Future
research can focus on using a more organised dataset by producing the specimens all together in
the laboratories. As an alternative, the data imputation method used in this study can be verified
experimentally in future research.

3.2. LR

Regression analysis was used to predict the target parameters of the asphalt mixture. The performance
of the LR model on the test set is presented in Figure 3. What stands out in the graphs is that the
accuracy score for the prediction of MF is 0.36, which is the lowest score among the predicted
parameters. Similarly, the accuracy of the predicted MS values is not very high, at 0.54.
Interestingly, the accuracy score of the predicted MQ is notably higher than those of both MS and MF.
Moving on to volumetric parameters, VMA and VFA were predicted more accurately than the Marshall
parameters. On the other hand, the prediction performance for Va is similar to that of the Marshall
parameters. Overall, the accuracy of LR was not very high. However, it may give an idea about the
relationship between input and output variables, especially when it is applied to each input
parameter separately.


Figure 3. Performance of first-degree LR.

3.3. PR


Second-degree PR was used to predict the target parameters of the asphalt mixture. The performance
of the PR model on the test set is presented in Figure 4. The most significant aspect of these results is
that prediction performance is much better than LR for all parameters. Besides, accuracy scores of all
parameters are higher than 0.80 except for MS. It can be considered a high score.

Furthermore, the k-fold CV method was used to measure the performance of the PR model. The
results are shown in Table 7. There were five folds, and their R2 values are presented in the table
for each target variable. A closer inspection of the table shows that some of the R2 values are high
and some are low. Therefore, it is better to consider the mean value for each target parameter. When
these mean R2 values are examined, they are seen to be in line with the normal test data scores.
However, unlike the other output variables, MS showed an apparent decrease in model accuracy.


Figure 4. Performance of second-degree PR.

3.4. KNNs

The KNN algorithm was used to predict the target parameters of the asphalt mixture. Using a random
train–test split, the relationship between model accuracy and the neighbour number was plotted for
all target variables in Figure 5. As expected, the accuracy on the train set was clearly higher than
that on the test set for all output variables. As the neighbour number increased, model accuracy
increased at first, but there was a steady decrease beyond a certain point. This point can be
considered the optimum neighbour number. The optimum neighbour numbers according to the test sets and
their R2 values are presented in Table 8. The prediction accuracy of most of the target parameters
was best with two neighbours; therefore, the performance of all the output variables was evaluated
using the two-neighbour option (Figure 6).



Figure 5. KNN model accuracy according to neighbours numbers.

Moreover, CV was applied to the model at different neighbour numbers (Figure 7). Except for MS
and MF, the best performance was seen with the two-neighbour option. MS and MF, on the other hand,
showed the best performance with 1 and 4 neighbours, respectively. The CV accuracy scores resemble
the test set scores calculated from the normal train–test split. Therefore, these scores
might generalise, and using these KNN models in mixture design could be beneficial.

3.5. Support vector regressor
SVR was used to predict the target parameters of the asphalt mixture. First, grid search CV was carried
out to determine the best parameters of the SVR model. Accuracy scores obtained from grid search
CV are illustrated in Figure 8 for a range of C and γ parameters. In the figure, the best score represents
the mean value of the splits in the CV scores obtained from training data. Parameters that provide the


Table 7. Second-degree regression scores with k-fold CV.

Target       k-fold regression scores (R2)
parameter    1      2      3      4      5      Mean
MS           0.70   0.90   0.19   0.76   0.77   0.66
MF           0.79   0.86   0.81   0.79   0.78   0.81
MQ           0.76   0.91   0.65   0.85   0.87   0.81
Va           0.90   0.89   0.69   0.81   0.74   0.80
VMA          0.88   0.92   0.80   0.88   0.78   0.85
VFA          0.93   0.92   0.77   0.87   0.82   0.86

Table 8. Prediction accuracy of KNN models at their best neighbour
numbers.

Target parameter Best neighbour number Test set score (R2)

MS 2 0.86

MF 4 0.83

MQ 2 0.89

Va 2 0.87

VMA 3 0.93

VFA 2 0.87

Table 9. Nested CV scores for SVR.

Target Scores (R2) of each split

parameter 1 2 3 4 5 Mean

MS 0.79 0.88 0.4 0.76 0.81 0.73

0.32
MF 0.86 0.12 0.18 0.23 0.2 0.71
0.78
MQ 0.61 0.89 0.58 0.57 0.88 0.89
0.86
Va 0.52 0.86 0.95 0.65 0.9

VMA 0.9 0.91 0.94 0.86 0.86

VFA 0.93 0.92 0.96 0.81 0.66

best scores are selected as best parameters which can be seen in the figure as lighter tones. These are
the values closest to 1 and written as the best score above the color maps. The other two accuracy
scores were calculated using the best parameters. The first one, namely train set score, is the model
accuracy in the train set, and the second one, namely test set score, is model accuracy for the test set
which is not used in the grid search. Also, nested CV scores for each target parameter are presented
in Table 9. The difference of this method from the former one is that CV and grid search CV were used
as nested; in other words one within the other. The accuracy scores in the table were slightly lower
than the normal grid search CV scores for all target variables except for MF. A major drop was seen in
the accuracy score of MF. Thus, the prediction performance of SVR in MF cannot be generalised with
this dataset.
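
The grid search CV and nested CV procedures described in this section can be sketched as follows. This is an illustration under assumptions rather than the authors' implementation: scikit-learn is assumed, the data are synthetic, and the C and γ grids, the RBF kernel, the scaling step and the five-fold splits are hypothetical choices.

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import (
    GridSearchCV, KFold, cross_val_score, train_test_split
)
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in for the literature dataset (assumption).
X, y = make_regression(n_samples=200, n_features=6, noise=5.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Scaling and the RBF kernel are assumptions; the grids are hypothetical.
pipe = make_pipeline(StandardScaler(), SVR(kernel="rbf"))
param_grid = {"svr__C": [1, 10, 100, 1000], "svr__gamma": [0.001, 0.01, 0.1, 1]}

# Plain grid search CV: tune on the train set, then score on the untouched test set.
inner_cv = KFold(n_splits=5, shuffle=True, random_state=0)
search = GridSearchCV(pipe, param_grid, cv=inner_cv, scoring="r2")
search.fit(X_train, y_train)
best_score = search.best_score_                # mean CV R² of the best (C, gamma)
train_score = search.score(X_train, y_train)   # R² on the train set
test_score = search.score(X_test, y_test)      # R² on the held-out test set

# Nested CV: the whole grid search is re-run inside every outer fold, so each
# outer score comes from data never seen during parameter selection.
outer_cv = KFold(n_splits=5, shuffle=True, random_state=1)
nested_scores = cross_val_score(search, X, y, cv=outer_cv, scoring="r2")

print(search.best_params_)
print(f"best={best_score:.2f} train={train_score:.2f} test={test_score:.2f}")
print(f"nested mean={nested_scores.mean():.2f}")
```

The nested estimate is usually somewhat lower than the best grid search score, because the latter is selected as the maximum over the grid and is therefore optimistic.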

3.6. Comparisons of the algorithms

The prediction performance of all the models for the six target variables is presented in Figure 9.
Overall, LR showed the worst performance for all targets. PR performed far better than LR, and
KNN and SVR performed clearly better than both LR and PR. Between KNN and SVR, the latter had
a slightly better accuracy score. However, the nested CV scores for SVR were lower than the scores
of both KNN and SVR with grid search, which means that SVR generalised worse than KNN.
Therefore, KNN might be the best option to predict the MDPs of asphalt specimens.
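
A side-by-side comparison of the four algorithms under the same CV splits could be set up as in the sketch below. The data are synthetic (make_friedman1, a nonlinear benchmark), and the SVR parameters, the scaling steps and the fold settings are hypothetical, so the ranking it prints says nothing about the ranking reported in this study.

```python
from sklearn.datasets import make_friedman1
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler
from sklearn.svm import SVR

# Synthetic nonlinear data as a stand-in for the literature dataset (assumption).
X, y = make_friedman1(n_samples=300, n_features=6, noise=1.0, random_state=0)

models = {
    "LR": LinearRegression(),
    "PR": make_pipeline(PolynomialFeatures(degree=2), LinearRegression()),
    "KNN": make_pipeline(StandardScaler(), KNeighborsRegressor(n_neighbors=2)),
    # C and gamma are hypothetical values, not the tuned values of the study.
    "SVR": make_pipeline(StandardScaler(), SVR(C=100, gamma=0.1)),
}

# The same folds are used for every model so that the scores are comparable.
cv = KFold(n_splits=5, shuffle=True, random_state=0)
mean_r2 = {
    name: cross_val_score(model, X, y, cv=cv, scoring="r2").mean()
    for name, model in models.items()
}

for name, score in sorted(mean_r2.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: mean R² = {score:.2f}")
```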



Figure 6. KNN model accuracy when neighbour number is 2 (train–test split, random size = 0).

4. Conclusions
The purpose of the current study was to create a more generalised model to predict MDPs of the
asphalt mixture. The following conclusions can be drawn from the present study:

• Prediction accuracy of the LR was not high, but it may be used to get a general idea about the data.
• Second-degree PR had better prediction performance than LR, but it was still low in comparison to
KNN.
• In the KNN algorithm, the two-neighbour option gave the best results for most of the target
variables.
• The KNN algorithm showed a good correlation between predicted and real values even in the CV
method. Therefore, KNN might be used in mixture design to reduce laboratory work.
• Although SVR with grid search CV had the best prediction score, KNN might be preferred instead
of SVR. Because SVR is a parameter-sensitive method and its prediction performance was reduced
in nested CV, its generalisation performance was worse than that of KNN.
• The KNN algorithm is a better choice than SVR for predicting MDPs in datasets collected from
different papers, as in this study. However, SVR might be useful for datasets produced in the same
laboratory.
• Performing CV is important to reduce the bias and overfitting that stem from the selection of the
train and test groups.

Figure 7. k-Fold CV scores according to neighbour number.
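
The last conclusion, on the dependence of a single score on the chosen train and test groups, can be illustrated with a short sketch. The data are synthetic, and the ten random seeds, the 80/20 split and the five folds are assumptions chosen for illustration only.

```python
from sklearn.datasets import make_friedman1
from sklearn.model_selection import KFold, cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsRegressor

# Small synthetic dataset (assumption); small samples make the effect visible.
X, y = make_friedman1(n_samples=150, n_features=6, noise=1.0, random_state=0)
model = KNeighborsRegressor(n_neighbors=2)

# R² from ten different single train-test splits of the same data.
split_r2 = []
for seed in range(10):
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.2, random_state=seed
    )
    split_r2.append(model.fit(X_tr, y_tr).score(X_te, y_te))

# Five-fold CV uses every sample for testing exactly once.
cv = KFold(n_splits=5, shuffle=True, random_state=0)
cv_r2 = cross_val_score(model, X, y, cv=cv, scoring="r2")

print(f"single splits: min = {min(split_r2):.2f}, max = {max(split_r2):.2f}")
print(f"5-fold CV: mean = {cv_r2.mean():.2f}, std = {cv_r2.std():.2f}")
```

The spread between the minimum and maximum single-split scores shows how much a reported accuracy can depend on one random seed, which is the risk that CV is meant to reduce.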

Figure 8. Grid search CV scores of SVR.

In spite of its limitations, the study adds to our understanding of the prediction of the physical
and mechanical properties of the asphalt mixture. The study showed that it is possible to make accurate
predictions from material properties alone with a generalised model. In other words, once an ML model
that can predict MDPs is built, it is possible to produce virtual Marshall specimens on a computer.
These specimens can then be used for asphalt mixture design. However, this kind of design method
should be verified with an experimental study. Future studies are also required to determine the
necessary model inputs and their importance using a well-organised dataset. In addition, the efficiency
of the data imputation method used in this study should be evaluated with laboratory work in further
research.
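
The idea of virtual Marshall specimens can be sketched as follows: a fitted model predicts a design parameter for a series of candidate mixtures, and the candidate closest to a target value is selected. Everything specific in this sketch is hypothetical: the two input features, the synthetic linear trend of air voids (Va) with bitumen content, the fixed aggregate property of 2.65 and the target Va of 4.0 % are assumptions for illustration, not values or methods taken from this study.

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(0)

# Hypothetical training data: bitumen content (%) and one aggregate property.
bitumen = rng.uniform(3.5, 6.5, size=200)
agg_prop = rng.uniform(2.5, 2.8, size=200)
# Made-up trend: air voids (%) fall as bitumen content rises, plus noise.
va = 12.0 - 1.6 * bitumen + rng.normal(0.0, 0.2, size=200)

X = np.column_stack([bitumen, agg_prop])
model = KNeighborsRegressor(n_neighbors=2).fit(X, va)

# Virtual specimens: a sweep of bitumen contents at a fixed aggregate property.
candidates = np.arange(3.5, 6.51, 0.25)
virtual = np.column_stack([candidates, np.full_like(candidates, 2.65)])
va_pred = model.predict(virtual)

# Select the candidate whose predicted Va is closest to the assumed target.
target_va = 4.0
best_bitumen = candidates[np.argmin(np.abs(va_pred - target_va))]
print(f"selected bitumen content: {best_bitumen:.2f} %")
```

In practice, the other MDPs would be predicted and checked against specification limits in the same way, and, as noted above, any design obtained like this should be verified experimentally.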


Figure 9. Comparisons of the model scores for all the target parameters.

Disclosure statement

No potential conflict of interest was reported by the authors.

ORCID

Mert Atakan
Kürşat Yıldız


References

Aksoy, A., Iskender, E., & Tolga Kahraman, H. (2012). Application of the intuitive k-NN estimator for prediction of
the Marshall test (ASTM D1559) results for asphalt mixtures. Construction and Building Materials, 34, 561–569.
/>
Azarhoosh, A., & Pouresmaeil, S. (2020). Prediction of Marshall mix design parameters in flexible pavements using genetic
programming. Arabian Journal for Science and Engineering, 45(10), 8427–8441. /> 776-0

Baldo, N., Manthos, E., & Pasetto, M. (2018). Analysis of the mechanical behaviour of asphalt concretes using artificial neural
networks. Advances in Civil Engineering, 2018(1), 1–17. />
Biessmann, F., Rukat, T., Schmidt, P., Naidu, P., Schelter, S., Taptunov, A., Lange, D., & Salinas, D. (2019). Datawig: Missing
value imputation for tables. Journal of Machine Learning Research, 20(175), 1–6. />
Gandomi, A. H., Alavi, A. H., Mirzahosseini, M. R., & Nejad, F. M. (2011). Nonlinear genetic-based models for prediction of
flow number of asphalt mixtures. Journal of Materials in Civil Engineering, 23(3), 248–263. /> MT.1943-5533.0000154

Garcia-Hernandez, A., Wan, L., & Dopazo-Hilario, S. (2021). In-silico manufacturing of asphalt concrete. Powder Technology,
386, 399–410. />
Ghanizadeh, A. R., Jahanshahi, F. S., Khalifeh, V., & Jalali, F. (2020). Predicting flow number of asphalt mixtures based
on the Marshall mix design parameters using multivariate adaptive regression spline (MARS). International Journal of
Transportation Engineering, 7(4), 433–448. />
Jiang, Y., Deng, C., Xue, J., & Chen, Z. (2018). Investigation into the performance of asphalt mixture designed using different
methods. Construction and Building Materials, 177, 378–387. />
Jin, C., Feng, Y., Yang, X., Liu, P., Ding, Z., & Oeser, M. (2022). Virtual design of asphalt mixtures using a
growth and contact model based on realistic aggregates. Construction and Building Materials, 320, 126322.
/>
