Tải bản đầy đủ (.doc) (47 trang)

An efficient on demand charging for wrsns using fuzzy logic and q learning

Bạn đang xem bản rút gọn của tài liệu. Xem và tải ngay bản đầy đủ của tài liệu tại đây (716.37 KB, 47 trang )

HANOI UNIVERSITY OF SCIENCE AND
TECHNOLOGY

Master’s Thesis in Data Science and
Artificial Intelligence
An Efficient, On-demand Charging for
WRSNs Using Fuzzy Logic and QLearning
La Van Quan

Supervisor:

Dr. Nguyen Phi Le

Department:

Department of Software engineering

Institute:

School of Information and Communication Technology

Hanoi, 2022


Declaration of Authorship and Topic Sentences
1. Personal information Full
name: La Van Quan Phone
number: 039 721 1659
Email:
Major: Data Science and Artificial Intelligence
2. Topic


An Efficient, On-demand Charging for WRSNs
Using Fuzzy Logic and Q-Learning
3. Contributions
• Propose a Fuzzy logic-based algorithm that determines the energy
level to be charged to the sensors.
• Introduce a new method that optimizes the optimal charging time at
each charging location to maximize the number of alive sensors.
• Propose Fuzzy Q-charging, which uses Q-learning in its charging
scheme to guarantee the target coverage and connectivity.
4. Declaration of Authorship
I hereby declare that my thesis, titled An Efficient, On-demand Charging
for WRSNs Using Fuzzy Logic and Q-Learning , is the work of myself and
my supervisor Dr. Nguyen Phi Le. All papers, sources, tables, and so on
used in this thesis have been thoroughly cited.
5. Supervisor confirmation
.............................................................................
.............................................................................
.............................................................................
Ha Noi, April 2022
Supervisor

Dr. Nguyen Phi Le

i


Acknowledgments
I would like to thank my supervisor, Dr. Nguyen Phi Le, for her continued
support and guidance throughout the course of my Masters’ studies. She has
been a great teacher and mentor for me since my undergraduate years, and I

am proud to have completed this thesis under her supervision.
I want to thank my family and my friends, who have given me their
unconditional love and support to finish my Masters’ studies.
Finally, I would like to again thank Vingroup and the Vingroup Innovation
Foundation, who have supported my studies through their Domestic
Master/Ph.D Scholarship program.
Parts of this work were published in the paper Q-learning-based, Optimized
On-demand Charging Algorithm in WRSN by La Van Quan, Phi Le Nguyen,
Thanh-Hung Nguyen, and Kien Nguyen in the Proceedings of the 19th IEEE
International Symposium on Network Computing and Applications, 2020.
La Van Quan was funded by Vingroup Joint Stock Company and supported
by the Domestic Master/Ph.D. Scholarship Programme of Vingroup Innovation
Foun-dation (VINIF), Vingroup Big Data Institute, code VINIF.2020.ThS.BK.03.

ii


Abstract
In recent years, Wireless Sensor Networks (WSNs) have attracted great attention
worldwide. WSNs consist of sensor nodes deployed on an surveillance area to monitor
and control the physical environment. In WSNs, every sensor node needs to perform
several important tasks, two of which are sensing and communication. Every time the
above tasks are performed, the sensor’s energy will be lost over time. Therefore some
sensor nodes may die. A sensor node is considered dead when it runs out of energy.
Correspondingly, the lifetime of WSNs is defined as the time from the start of operation
until a sensor dies [1]. Thus, one of the important issues to improve the quality of
WSNs is to maximize the life of the network.
In classical WSNs, sensor nodes have fixed energy and always degrade over time.
The limited battery capacity of the sensor is always a "bottleneck" that greatly af-fects
the life of the network. To solve this problem, Wireless Rechargeable Sensor Networks

(WRSNs) were born. WRSNs include sensors equipped with battery charg-ers and one
or more mobile chargers (Mobile Chargers (MC)) responsible for adding power to the
sensors. In WRSNs, MCs move around the network, stopping at spe-cific locations
(called charging sites) and charging the sensors. Thus, it is necessary to find a
charging route for MC to improve the lifetime of WRSNs [2], [3].

Keywords: Wireless Rechargeable Sensor Network,
Reinforcement Learning, Q-Learning, Network Lifetime

Fuzzy
Author

La Van Quan

iii

Logic,


Contents
List of Figures
List of Tables

vi
vii

1 Introduction

1


1.1 Problem overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
1.2 Thesis contributions . . . . . . . . . . . . . . . . . . . . . . . . . . .
1.3 Thesis structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
2 Theoretical Basis

1
2
3
4

2.1 Wireless Rechargeable Sensor Networks . . . . . . . . . . . . . . . . .
2.2 Q-learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
2.3 Fuzzy Logic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
3 Literature Review

4
7
8
10

3.1 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
3.2 Problem definition . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4 Fuzzy Q-charging algorithm

10
11
13

4.1
4.2

4.3
4.4

Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
State space, action space and Q table . . . . . . . . . . . . . . . . . .
Charging time determination . . . . . . . . . . . . . . . . . . . . . . .
Fuzzy logic-based safe energy level determination . . . . . . . . . . .
4.4.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4.4.2 Fuzzification . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4.4.3 Fuzzy controller . . . . . . . . . . . . . . . . . . . . . . . . . .
4.4.4 Defuzzification . . . . . . . . . . . . . . . . . . . . . . . . . .
4.5 Reward function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4.6 Q table update . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
5 Experimental Results

13
13
15
16
16
17
17
18
21
22
24

5.1 Impacts of parameters . . . . . . . . . . . . . . . . . . . . . . . . . .
5.1.1 Impacts of . . . . . . . . . . . . . . . . . . . . . . . . . . . .
5.1.2 Impacts of . . . . . . . . . . . . . . . . . . . . . . . . . . . .


25
25
26

iv


5.2 Comparison with existing algorithms . . . . . . . . . . . . . . . . . .
5.2.1
Impacts of the number of sensors . . . . . . . . . . . . . . . .
5.2.2
Impacts of the number of targets . . . . . . . . . . . . . . . .
5.2.3
Impacts of the packet generation frequency . . . . . . . . . . .
5.2.4 Non-monitored targets and dead sensors over time . . . . . . .
Bibliography

v

27
27
28
28
30
34


List of Figures
2.1

2.2
2.3
2.4
3.1

A wireless sensor network . . . . . . . . . . . . . . . . . . . . . . . .
A sensor structure . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Network model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Q-learning overview . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Network model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

5
5
6
8
12

4.1 The flow of Fuzzy Q-learning-based charging algorithm . . . . . . . .

14

4.2
4.3
4.4
5.1

Illustration of the Q-table . . . . . . . . . . . . . . . . . . . . . . . .
Fuzzy input membership functions . . . . . . . . . . . . . . . . . . .
Fuzzy output membership function . . . . . . . . . . . . . . . . . . .
Impact of on the network lifetime . . . . . . . . . . . . . . . . . . .


14
18
19
25

5.2
5.3
5.4
5.5
5.6
5.7

Impact of on the network lifetime . . . . . . . . . . . . . . . . . . .
Network lifetime vs. the number of sensors . . . . . . . . . . . . . . .
Network lifetime vs. the number of targets . . . . . . . . . . . . . . .
Network lifetime vs. the packet generation frequency . . . . . . . . .
Comparison of non-monitored targets over time . . . . . . . . . . . .
Comparison of dead sensors over time . . . . . . . . . . . . . . . . . .

26
27
29
29
30
31

vi



List of Tables
4.1 Input variables with their linguistic values and corresponding membership function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4.2 Output variable with its linguistic values and membership function .
4.3 Fuzzy rules for safe energy level determination . . . . . . . . . . . . .
4.4 Inputs of linguistic variables . . . . . . . . . . . . . . . . . . . . . . .
4.5 Fuzzy rules evaluation . . . . . . . . . . . . . . . . . . . . . . . . . .
5.1 System parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . .

vii

18
18
19
19
20
25


Chapter 1
Introduction
1.1

Problem overview

Wireless Sensor Networks (WSNs) have found various applications, such as air
quality monitoring, environmental management, etc., [4, 5]. A WSN typically includes many battery-powered sensor nodes, monitoring several targets, and
sending sensed data to a base station for further processing. In the WSNs, it is
necessary to provide sufficient monitoring quality surrounding the targets (i.e.,
guarantee-ing target coverage). Moreover, the WSNs need to have adequate
capacity for the communication between the sensors and base station (i.e.,

ensuring connectivity) [6][7][8]. The target coverage and connectivity are severely
affected by the depletion of the battery on sensor nodes. When a node runs out of
battery, it becomes a dead node without sensing and communication capability,
damaging the whole network in consequence. Wireless Rechargeable Sensor
Networks (WRSNs) leverages the advan-tages of wireless power transferring
technology to solve that critical issue in WSNs. A WRSN uses a mobile charger
(MC) to wirelessly compensate for a rechargeable battery’s energy consumption on
a sensor node, aiming to guarantee both the target coverage and connectivity.
In a normal operation, the MC moves around the networks and performs charging strategies, which can be classified into the periodic [9][1][10][11][12] or ondemand charging [13][2][14][15] [16][17][18]. In the former, the MC, with a
predefined trajec-tory, stops at charging locations to charge the nearby sensors’
batteries. In the latter, the MC will move and charge upon receiving requests from
the sensors, which have the remaining energy below a threshold. The periodic
strategy is limited since it can-not adapt to the sensors’ energy consumption rate
dynamic. On the contrary, the on-demand charging approach potentially deals with
the uncertainty of the energy consumption rate. Since a sensor with a draining
battery triggers the on-demand op-eration, the MC’s charging strategy faces a new
time constraint challenge. The MC needs to handle two crucial issues: deciding the
next charging location and staying period at the location.
1


Although many, the existing on-demand charging schemes in the literature face two
serious problems. The first one is the consideration of the same role for the sen-sor
nodes in WRSNs. That is somewhat unrealistic since, intuitively, several sensors,
depending on their locations, significantly impact the target coverage and the connectivity than others. Hence, the existing charging schemes may enrich unnecessary
sensors’ power while letting necessary ones run out of energy, leading to charging
algorithms’ inefficiency. It is of great importance to take into account the target
coverage and connectivity simultaneously. The second problem is about the MC’s
charging amount, which is either a full capacity (of sensor battery) or a fixed amount of
energy. The former case may cause: 1) a long waiting time of other sensors stay-ing

near the charging location; 2) quick exhaustion of the MC’s energy. In contrast,
charging a too small amount to a node may lead to its lack of power to operate until the
next charging round. Therefore, the charging strategy should adjust the transferred
energy level dynamically following the network condition.

1.2

Thesis contributions

Motivated by the above, this thesis propose a novel on-demand charging scheme
for WRSN that assures the target coverage and connectivity and adjusts the energy
level charged to the sensors dynamically. My proposal, named Fuzzy Q-charging, aims
to maximize the network lifetime, which is the time until the first target is not monitored.
First, this work exploit Fuzzy logic in an optimization algorithm that determines the
optimal charging time at each charging location, aiming to maximize the numbers of
alive sensors and monitoring targets. Fuzzy logic is used to cope with network
dynamics by taking various network parameters into account during the determination
process of optimal charging time. Second, this thesis leverage the Q-learning technique
in a new algorithm that selects the next charging location to maximize the network
lifetime. The MC maintains a Q-table containing the charging locations’ Q-values
representing the charging locations’ goodness. The Q-values will be updated in a realtime manner whenever there is a new charging request from a sensor. I design the Qvalue to prioritize charging locations at which the MC can charge a node depending on
its critical role. After finishing tasks in one place, the MC chooses the next one, which
has the highest Q-value, and determines an optimal charging time. The main
contributions of the paper are as follows.

• This thesis propose a Fuzzy logic-based algorithm that determines the
energy level to be charged to the sensors. The energy level is adjusted
dynamically following the network condition.
• Based on the above algorithm, this thesis introduce a new method that optimizes the optimal charging time at each charging location. It considers sev2



eral parameters (i.e., remaining energy, energy consumption rate, sensorto-charging location’s distance) to maximize the number of alive sensors.
• This thesis propose Fuzzy Q-charging, which uses Q-learning in its
charging scheme to guarantee the target coverage and connectivity. Fuzzy
Q-charging’s reward function is designed to maximize the charged amount
to essential sen-sors and the number of monitored targets.

1.3

Thesis structure

The rest of this thesis is constructed as follows.
• Chapter 3 introduces the related knowledge of this study, including the
overview of the WSN and the WRSN, the previous works of the charging
scheme opti-mization problem and some optimization algorithms.
• Chapter 2 describes concepts related to wireless sensor networks, qlearning, and fuzzy logic.
• Chapter 4 presents the proposed algorithms, which are comprised of the
fuzzy logic Q-learning approach.
• Chapter 5 evaluates and compares the performance of the proposed
algorithms with existing research.
• Chapter 5.2.4 concludes the thesis and discusses about future works.

3


Chapter 2
Theoretical Basis
2.1

Wireless Rechargeable Sensor Networks


A Wireless Sensor Network (WSN) is a network that consists of several spatially
distributed and specialized sensors connected by a communications infrastructure to
monitor and record physical conditions in a variety of situations. The sensors in
WRSNs will collectively convey the sensing data to the Base Station (BS), also known
as a sink, where it will be gathered, processed, and/or multiple actions done as
needed. A typical WSN connecting with end-users is seen in Fig. 2.1.

Sensors play an important role in a sensor network. To monitor the physical
environments and communicate with others efficiently, sensors have a lot of requirements. They not only need to record surroundings accurately and
precisely, be capable of computing, analyzing, and storing the sensing data, but
also have to be small in space, low in cost, and effective in power consumption.
Sensors commonly comprise four fundamental units: a sensing unit, which
mon-itors environments and converts the analog signal into a digital signal; a
processing unit, which processes the digital data and stores in memory; a
transceiver unit, which provides communication capability; and a power unit, which
supplies energy to the sensor [1]. In addition, some sensors also have a navigation
system to deter-mine positions of themselves, other sensors and sensing targets, a
mobilizer to add the mobile capability, etc. A sensor structure is shown in Fig. 2.2.
WSNs have a wide range of applications in a variety of fields. They were first
deployed in the 1950s as part of a sound surveillance system designed by the US
Navy to detect and track Soviet submarines. WSNs are now used in a variety of
civilian applications, including environmental monitoring, health monitoring, smart
agriculture, and so on. A WSN can be used in a forest, for example, to alert authorities to the risk of a forest fire. Furthermore, WSN can track the location of a fire
before it has a chance to expand out of control. WSNs have a lot of potential in the
area of health monitoring. Scientists, for example, have developed a WSN-based
sugar monitoring device. The system can record the fluctuation rate of glucose in
4



Figure 2.1: A wireless sensor network

Figure 2.2: A sensor structure
diabetes patients’ blood and warn them in time.
Despite its many benefits, a WSN has several limits. Because a WSN must
maintain its low-cost characteristic, some features must be eliminated. A sensor in
a WSN, for example, has a low-capacity battery that is often non-rechargeable. Battery replacements are impossible to obtain if WSNs are installed in remote terrain.
When a sensor’s battery dies, it can no longer record, send data, or communicate,
causing the network to malfunction. Furthermore, WSN sensors are tiny, resulting
in limited computing and storage capabilities. Although there are many challenges
with WSNs such as communicating range, signal attenuation, security, and so on,
this thesis focuses on the energy difficulty as one of the most important. As a result, the sensor nodes’ energy depletion avoidance has gotten a lot of interest from
researchers and network users all around the world.
5


Figure 2.3: Network model
Many efforts have been made to reduce the energy usage of WSNs. They have
attempted to optimize radio signals using cognitive radio standardization, lower
data rate using data aggregation, save more energy using sleep/wake-up schemes,
and pick efficient energy routing protocols. However, none of them completely
solved the energy problem of the sensor node in WSNs. The battery will ultimately
run out if there is no external source of electricity for the sensors. Gathering energy
from the environment is another way to overcome the sensor’s energy depletion
problem. Each sensor has an energy harvester that converts power from external
sources such as solar, thermal, wind, kinetic, and other forms of energy into
electrical power. Sensors can use the converted power to recharge their batteries,
extending the network’s lifespan. However, this strategy is overly reliant on an
unstable and uncontrolled ambient energy supply.
In recent years, thanks to advancements in wireless energy transfer and rechargeable battery technology, a recharging device can be used to recharge the battery of

sensors in WSNs. As a result, WRSNs, a new generation of sensor networks, was born
(Fig. 2.3). The sensor nodes in WRSNs are equipped with a wireless energy receiver
via wireless transfer radio waves based on electromagnetic radiation and magnetic
resonant coupling technology, giving them an edge over standard WSNs. WRSNs use
one or more chargers to recharge sensor nodes on a regular basis. As a result, the
lifetime of the network is optimally prolonged for eternal operations. WRSNs are easier
to maintain long-term and reliable operations than typical WSNs because they give a
more flexible, customizable, and dependable energy replenishment.
A new network generation, on the other hand, would provide new issues that have
never been faced before. WRSNs, in particular, necessitate a charger employment
approach. Charging terminals and MCs are the two types of chargers available. A
charging terminal is a device that has a fixed placement in the network and can

6


recharge many sensors. Because the network scale is normally large, a
significant number of charging terminals would be required. The network is
trying to figure out how many charging terminals are needed to refresh all of the
sensors, which is analogous to the coverage issue with WSNs. Furthermore, a
charging terminal does not have an infinite energy supply, thus it will need to be
recharged after some time. As a result, the charging terminal is not a
dependable device for long-term operation. It’s a better idea to use MCs to
recharge sensors. A MC can travel via the network, allowing it to cover a large
region. If it runs out of power, it will return to the BS to replenish its battery. As a
result, the only issue is that we need to figure out the MC’s charging structure.

2.2

Q-learning


Q-Learning [19] is one of the most often used Reinforcement Learning (RL) algorithms. It learns to predict the quality, in terms of expected cumulative reward, of
an action in a specific state (Q-value) [19]. Moreover, as it is a model-free reinforcement learning algorithm [20], [18], the agent does not have a model representation
of the environment, it simply learns and acts without knowing the changes being
caused in the environment. The methods in which an environment model is known
are called model-based. In this case, the agent knows approximately how the environment is going to evolve. This is the reason why model-based methods focus on
planning while model-free ones focus on learning [19].

The standard Q-learning framework consists of four components: an
environ-ment, one or more agents, a state space, and an action space, as
shown in Fig. 2.4. The Q-value represents the approximate goodness of the
action concerning the agent’s goal. An agent chooses actions according to the
policy and the Q-value. After performing an action, the agent modifies its policy
to attain its goal. The Q-value is updated using the Bellman equation as follows:
Q(St; At) (1

(2.1)

)Q(St; At) + [Rt + max Q(St+1; a)];
a

where Q(St; At) is the Q-value of action A t at a given sate St. Rt is the reward
obtained if performing action At where in the state St. Moreover, max Q(St+1; a)
a

is the maximum possible Q-value in the next state S t+1 for all possible actions a.
and are the learning rate and the future reward discount factor. Their values are
set between 0 and 1.
An explicit procedure to implement the Q-learning algorithm is provided in
Algorithm 1.


7


Figure 2.4: Q-learning overview
Algorithm 1: Q-Learning Algorithm
Initialize Q(s; a);
for each episode do
3
Get initial state s;
4
Select a using policy derived from Q;
0
5
Take action a, observe next state s and obtain reward r;
6
Update Q(s; a) by equation 2.1;
7
0
s
s;
8 end
1
2

2.3

Fuzzy Logic

Nowadays, decision-making is a daily activity of our lives. However, it is hard to

decide if a statement is certainly true or false in some cases. In such situations,
fuzzy logic can be used as a flexible method for reasoning, given the uncertainty.

In logic Boolean, a classic logical statement is a declarative sentence that
deliv-ers factual information. If the information is correct, the statement is true; if
the information is erroneous, the statement is false. However, sometimes, true
or false values are not enough.
Lotfi et al. [7] coined the term "fuzzy logic" in the 1960s to describe a type of
logic processing that contains more than two true values. The fact that some
assertions contain imprecise or non-numerical information influences fuzzy
logic. The term "fuzzy" was also used to describe ambiguity and unclear
information. As a result, fuzzy logic can describe and manipulate ambiguous
and uncertain data, and it has been used in a variety of industries.
Following the fuzzy method, fuzzy logic uses particular input values, such as
multi-numeric values or linguistic variables, to produce a specific output. The
fuzzy technique will determine if an object fully or partially contains a property,
even if the property is ambiguous. For example, the term "extremely strong
engine" is based on the fuzzy method. There are hidden degrees of intensity
("very") of the trait in question ("strong").

8


A fuzzy logic system consists of three components: fuzzification, fuzzy logic
con-troller, and defuzzification. The first component converts the crisp values of
the variable into their fuzzy form using some membership functions. The second
one is responsible for simulating the human reasoning process by making fuzzy
inference based on inputs and a set of defined IF-THEN rules. The module itself
can be separated into two subcomponents, namely Knowledge Base and
Inference Engine. Knowledge Base is a set of specifically designed rules so that

together with the in-put states of variables, they will produce consistent results.
Each rule’s form is "IF {set of input} THEN {output}". More explicitly, a fuzzy rule
Ri with k-inputs and 1-output has the following form.
Ri : IF (I1 is Ai1) (I2 is Ai2)

: : : (Ik is Aik)

(2.2)

THEN (O is Bi);

where fI1; : : : ; I kg represents the crisp inputs to the rule. fA i1; : : : ; Aikg and Bi
are linguistic variables. The operator can be AND, OR, or NOT. Inference
Engine is in charge of the estimation of the Fuzzy output set. It calculates the
membership degree ( ) of the output for all linguistic variables by applying the
rule set described in Knowledge Base. For Fuzzy rules with lots of inputs, the
output calculation depends on the operators used inside it, i.e., AND, OR, or
NOT. The calculation for each type of operator is described as follows:
(Ii is Ai AND Ij is Aj) :
Ai\Aj (Iij)

= min ( Ai (Ii); Aj (Ij)); (Ii is

Ai OR Ij is Aj) :
Ai[Aj (Iij)

= max ( Ai (Ii); Aj (Ij));

(NOT Ii is Ai) :
(Ii) = 1

Ai

A

i

(Ii):

The last component helps to convert the fuzzy output set from the linguistic
variables into a crisp value. The most popular fuzzy solution is a methodology
called the centroid technique, described as follows:
Center of Gravity of B (CoGB) = R
R

where

B(z)

+1
+1

B(z)zdz

is the output membership function of the linguistic variable B.

9

B(z)dz

(2.3)

;


Chapter 3
Literature Review
3.1

Related Work

Initially, This thesis introduces the existing works related to periodic charging in
WRSNs. In [9], the authors leverage PSO and GA to propose a charging path
determination algorithm that minimizes the docking time during which the MC
recharges itself at the depot. [1] jointly considers charging path planning and depot
positioning to minimize the number of MCs while ensuring no sensor runs out of
energy before being recharged. The work in [10] determines a charging path to
maximize the MC’s accumulative charging utility gain or minimize the MC’s energy
consumption during traveling. The authors then propose approximation algorithms
with constant ratios for the maximization and minimization problems. Arguing that
an MC can not fulfill all sensors’ demand in dense networks, W. Xu et al. in [11]
introduce a multi-chargers approximation model to increase the charging speed. In
[12], C. Lin et al. derive a new energy transfer model with distance and angle
factors. They also consider the problem of minimizing the total charging delay for all
nodes. They use linear programming and obtain the optimal solution. As the
charging schedule is always fixed, the periodic scheme fails to adapt to the
dynamic of sensors’ energy consumption.
Regarding the on-demand charging, the authors in [17] address the node failure
problem. They first propose to choose the next charging node based on the charging
probability. Second, they introduce a charging node selected method to minimize the
number of other requesting nodes suffering from energy depletion. In [2, 14], aiming to
maximize the charging throughput, they propose a double warning threshold charging

scheme. Two dynamic warning thresholds are triggered depending on the residual
energy of sensors. The authors in [18] studied how to optimize the serving order of the
charging requests waiting in the queue using the gravitational search algorithm. In [16],
X. Cao et al. introduce a new metric (i.e., charging reward), which quantifies the
charging scheme’s quality. The authors then address the problem of maximizing the
total reward in each charging tour under the constraint of the MC’s
10


energy and sensors’ charging time windows. They use a deep reinforcement
learning-based on-demand charging algorithm to solve the addressed problem.
The existing charging algorithms have two serious problems. First, the charging
time problem has not been thoroughly considered. Most of the charging schemes
leverage either the fully charging approach [1, 2, 9, 10, 13, 14, 17], or the partial
charging one [21]. I want to emphasize that the charging time is an essential factor
that decides how much the charging algorithm can prolong the network lifetime.
Moreover, there is no existing work considering the target coverage and
connectivity constraints concurrently. Most previous works treat all sensors in
WRSNs evenly; hence, the MC may charge unnecessary sensors while necessary
ones may run out of energy. Unlike them, this work addresses the target coverage
and connectivity constraints in charging schedule optimization. This thesis uniquely
considers the optimization of charging time and charging location simultaneously. I
use Fuzzy logic and Q-learning in my proposal.
Fuzzy logic has been applied in many fields such as signal processing [22, 23],
robotics [24], embedded controllers [25]. In WSNs, Fuzzy logic is a promising technique in dealing with various problems, including localization, routing [26, 27], clustering [19], and data aggregation [28, 29]. R. M. Al-Kiyumi et al. in [26] propose a Fuzzy
logic-based routing for lifetime enhancement in WSNs, which maps network status into
corresponding cost values to calculate the shortest path. In [20], the authors also
leverage Fuzzy logic and Q-learning, but in a cooperative multi-agent system for
controlling the energy of a microgrid. In [30], Fuzzy and Q-learning are combined to
address the problem of thermal unit commitment. Specifically, each in-put state vector

is mapped with the Fuzzy rules to determine all the possible actions with corresponding
Q-values. The main idea is exploiting Fuzzy logic to map net-work status into
corresponding cost values to calculate the shortest path. Recently, the authors in [15]
use Fuzzy logic in an algorithm for adaptively determining the charging threshold and
deciding the charging schedule. Different from the others, I use Fuzzy logic and Qlearning in my unique Fuzzy Q-charging proposal. The earlier version of this work has
been published in [31], which considers only Q-charging.

3.2

Problem definition

Figure 3.1 shows the considered network model, in which a WRSN monitors
several targets. The network has three main components: an MC, sensor nodes, and a
base station. The MC is a robot that can move and carry a wireless power charger. The
sensor nodes can receive charged energy from the MC via a wireless medium. The
base station is static and responsible for gathering sensing information. We assume
that there are n sensors S j (j = 1; : : : ; n) and m targets T k (k = 1; : : : ; m). We call a
sensor a target-covering sensor if it covers at least one target. Moreover,

11


Figure 3.1: Network model
if there exists an alive routing path between a sensor and the base station, it is
connected to the base station. The target is defined as to be monitored when at
least one sensor connected to the base station covers it.
A sensor node that has its remaining energy below E th (i.e., a predefined
thresh-old) will send a charging request to the MC. We target a non-preemptive
charging schedule, in which charging requests from sensors are queued at the MC.
We as-sume that there are k charging locations denoted by D 1; : : : ; Dk in the

network. When the MC completes its tasks at a charging location, it runs the
proposed al-gorithm to select the next optimal charging location from D 1; : : : ; Dk.
Moreover, the MC also determines the optimal charging time at that charging
location. When the energy of the MC goes below a threshold, it returns to the depot
to recharge itself. Besides gathering the sensing information, the base station is
also responsible for collecting information about the remaining energy sensors.
Based on that, the MC estimates every sensor’s energy consumption rate using the
weighted averag-ing method. Given all sensors and the targets’ locations, the ondemand charging algorithm aims to maximize the number of monitored targets.

12



×