Artificial Neural Network Identification And Control Of The Inverted Pendulum

Tim Callinan
August 2003

Acknowledgements

I would like to thank my supervisor Jennifer Bruton for her help, guidance and support
throughout the project.
Thank you to Conor Maguire for helping me with the inverted pendulum rig and lending me
many of his manuals and books.
Thank you to Anthony Holohan for allowing me to experiment on the inverted pendulum rig.

Declaration
I hereby declare that, except where otherwise indicated, this document is entirely my own
work and has not been submitted in whole or in part to any other university.

Signed:…………………………………………………….

Date:………………………

Abstract
This project takes the area of Artificial Neural Networks (ANN) and applies it to the inverted
pendulum control problem. The inverted pendulum is typically used to benchmark new control
techniques, as it’s a highly non-linear unstable system. Neural networks have unique
characteristics, which enable them to control non-linear systems. Feedforward and recurrent
neural networks are used to model the inverted pendulum. Multi-output online identification
was also researched. A neuro-controller for the inverted pendulum was developed. Traditional
control methods were utilized to develop a control law to stabilize the inverted pendulum. A
feedforward network was trained to mimic the control law. The neuro-control shows that if a
disturbance occurs in the system, the neural network learns to counteract this disturbance.
Finally the knowledge learned in identification and control was applied to the real time
inverted pendulum rig. An online adaptive neural network was developed to model the real
time system.

Table of Contents
1 Introduction
   Outline of the document
2 Inverted Pendulum
3 Artificial Neural Networks
   Advantages of ANN’s
   Types of Learning
   Neural network structures
   Multi-layered perceptrons
4 System Identification
   System identification procedure
   Linear identification of the system
   Non-Linear identification of the system
   Non-linear Identification using neural networks
   Multi-output identification
5 Neural control of the inverted pendulum
   Neural-control in Simulink
6 Real-time identification and control
7 Conclusions
   Summary
   Scope for future work
8 Bibliography

1 Introduction
The process used in this project is the inverted pendulum system. The inverted pendulum is a
highly nonlinear and open-loop unstable system. This means that standard linear techniques
cannot model the nonlinear dynamics of the system. When the system is simulated the
pendulum falls over quickly. The characteristics of the inverted pendulum make identification
and control more challenging. There are two main aims of the project. The first is to develop
an accurate model of the inverted pendulum system using neural networks. The second aim is
to develop a neural network controller which determines the correct control action to stabilize
the system, but can also learn from experience.

System identification is the procedure that develops models of a dynamic system based on the
input and output signals from the system. The input and output data must show some of the
dynamics of the process. The parameters of the model are adjusted until the output from the
model is similar to the output of the real system. In order to develop an accurate model of the
inverted pendulum, different methods (linear and nonlinear) of identification will be tested.
One of the problems encountered early in the project is collecting experimental data from the
inverted pendulum system. The output data from the unstable system does not show enough
information or dynamics of the system. Feedback controllers are developed which stabilize the
system before identification can take place.

Neural networks have shown great progress in identification of nonlinear systems. There are
certain characteristics in ANN which assist them in identifying complex nonlinear systems.
ANN are made up of many nonlinear elements and this gives them an advantage over linear
techniques in modelling nonlinear systems. ANN are trained by adaptive learning: the network
‘learns’ how to perform tasks and functions from the data given for training. The
knowledge learned during training is stored in the synaptic weights. The standard ANN
structures (feedforward and recurrent) are both used to model the inverted pendulum.

The main task of this project is to design a neural network controller which keeps the
pendulum system stabilized. There are 3 main types of neural control – supervised, direct
inverse and unsupervised.
Supervised learning uses an existing controller or human feedback in training the neural
network. In order to train the neural network to imitate an existing controller a vector of inputs
and control targets from the controller must be collected. With supervised control, a neural
network could be trained to imitate a robust controller. The robust controller can operate
correctly, if the process operates around a certain point. The neuro-controller operates
similarly to the robust controller but can also adapt if any disturbance occurs in the system.
Direct inverse control does not require an existing controller in training. A neural network is
trained to model the inverse of the process. The neural network is cascaded with the process.
Theoretically if the inverse model is very accurate, the nonlinearities in the ANN will cancel
out the nonlinearities in the process.

Outline of the document
Chapter 2 details the research on the inverted pendulum system. The dynamic system
equations (linear and nonlinear) are derived. The Simulink models of the linear and nonlinear
systems are developed. The development of the feedback controllers to stabilize the system is
also discussed. Chapter 3 covers the theory, structure and operation of artificial neural
networks. Chapter 4 covers the whole area of system identification. The procedure of system
identification is discussed first. Linear identification techniques are applied to the linear
system. Nonlinear identification using neural networks is then reported. Chapter 5 details the
development of the neuro-controller. Chapter 6 discusses the real time identification and
control using the inverted pendulum rig. Finally, Chapter 7 provides a summary of the work, a
discussion of the results and the scope for future work.

2 Inverted Pendulum
The inverted pendulum system is a classic control problem that is used in universities around
the world. It is a suitable process to test prototype controllers due to its strong non-linearity
and lack of stability. The system consists of an inverted pole hinged on a cart which is free to
move in the x direction. In this chapter, the dynamical equations of the system are derived,
the model is developed in Simulink and basic controllers are developed. The aim of
developing an inverted pendulum model in Simulink is that the model will have the same
characteristics as the actual process, so that each of the prototype controllers can be tested
in the Simulink environment. Before the inverted pendulum model can be developed in
Simulink, the system dynamical equations are derived using the Lagrange equations [1]. The
Lagrange equations are one of many methods of determining the system equations. Using
this method it is possible to derive dynamical system equations for a complicated mechanical
system such as the inverted pendulum. Figure 1 is a free-body diagram of the pendulum
system.

M – Mass of the cart
m – mass of the pole
l – length of the pole
f – control force

Fig. 1: Free body diagram of the inverted pendulum system

The Lagrange equations use the kinetic and potential energy in the system to determine the
dynamical equations of the cart-pole system.

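The kinetic energy of the system is the sum of the kinetic energies of each mass. The kinetic
energy, T1, of the cart is

$$T_1 = \frac{1}{2} M \dot{y}^{2} \qquad \text{(Eq.1)}$$

The pole can move in both the horizontal and vertical directions, so the pole kinetic energy is

$$T_2 = \frac{1}{2} m \left( \dot{y}_2^{2} + \dot{z}_2^{2} \right) \qquad \text{(Eq.2)}$$

From the free body diagram, y2 and z2 and their time derivatives are equal to

$$y_2 = y + l \sin\theta \qquad \text{(Eq.3)}$$

$$\dot{y}_2 = \dot{y} + l\,\dot{\theta} \cos\theta \qquad \text{(Eq.4)}$$

$$z_2 = l \cos\theta \qquad \text{(Eq.5)}$$

$$\dot{z}_2 = -l\,\dot{\theta} \sin\theta \qquad \text{(Eq.6)}$$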

The total kinetic energy, T, of the system is equal to

$$T = T_1 + T_2 = \frac{1}{2} M \dot{y}^{2} + \frac{1}{2} m \left( \dot{y}_2^{2} + \dot{z}_2^{2} \right) \qquad \text{(Eq.7)}$$

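Substituting the pole velocities (the time derivatives of Equations 3 and 5) into Equation 7
gives Equation 8.

$$T = \frac{1}{2} M \dot{y}^{2} + \frac{1}{2} m \left( \dot{y}^{2} + 2\,\dot{y}\,\dot{\theta}\, l \cos\theta + l^{2} \dot{\theta}^{2} \right) \qquad \text{(Eq.8)}$$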
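
The step from Equation 7 to Equation 8 can be written out explicitly. The following is a short sketch of the expansion, using only the pole velocities obtained by differentiating y2 = y + l sin θ and z2 = l cos θ, together with the identity cos²θ + sin²θ = 1:

```latex
\begin{aligned}
\dot{y}_2^{2} + \dot{z}_2^{2}
  &= \left(\dot{y} + l\,\dot{\theta}\cos\theta\right)^{2} + \left(-l\,\dot{\theta}\sin\theta\right)^{2} \\
  &= \dot{y}^{2} + 2\,\dot{y}\,\dot{\theta}\,l\cos\theta
     + l^{2}\dot{\theta}^{2}\left(\cos^{2}\theta + \sin^{2}\theta\right) \\
  &= \dot{y}^{2} + 2\,\dot{y}\,\dot{\theta}\,l\cos\theta + l^{2}\dot{\theta}^{2}
\end{aligned}
```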

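The potential energy, V, of the system is stored in the pendulum so

$$V = m g z_2 = m g l \cos\theta \qquad \text{(Eq.9)}$$

The Lagrangian function is

$$L = T - V = \frac{1}{2}(M+m)\,\dot{y}^{2} + m l \cos\theta\,\dot{y}\,\dot{\theta} + \frac{1}{2} m l^{2} \dot{\theta}^{2} - m g l \cos\theta \qquad \text{(Eq.10)}$$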

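The generalized coordinates of the system are y and θ, and the control force f acts along y, so
the Lagrange equations are

$$\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{y}}\right) - \frac{\partial L}{\partial y} = f \qquad \text{(Eq.11)}$$

$$\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{\theta}}\right) - \frac{\partial L}{\partial \theta} = 0 \qquad \text{(Eq.12)}$$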

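But,

$$\frac{\partial L}{\partial \dot{y}} = (M+m)\,\dot{y} + m l \cos\theta\,\dot{\theta} \qquad \text{(Eq.13)}$$

$$\frac{\partial L}{\partial y} = 0 \qquad \text{(Eq.14)}$$

$$\frac{\partial L}{\partial \dot{\theta}} = m l \cos\theta\,\dot{y} + m l^{2}\,\dot{\theta} \qquad \text{(Eq.15)}$$

$$\frac{\partial L}{\partial \theta} = -m l \sin\theta\,\dot{y}\,\dot{\theta} + m g l \sin\theta \qquad \text{(Eq.16)}$$

The above equations (Eq. 13-16) are substituted into the Lagrange equations (Eq. 11-12), with
the control force f acting as the generalized force on the cart coordinate y. The cross terms in
the product of the cart and pole velocities cancel in the pole equation. This results in the
non-linear dynamical equations for the inverted pendulum system, which are shown below.

$$(M+m)\,\ddot{y} + m l \cos\theta\,\ddot{\theta} - m l\,\dot{\theta}^{2} \sin\theta = f \qquad \text{(Eq.17)}$$

$$m l \cos\theta\,\ddot{y} + m l^{2}\,\ddot{\theta} - m g l \sin\theta = 0 \qquad \text{(Eq.18)}$$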
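
As a rough illustration of how the non-linear equations can be integrated numerically, the following Python sketch solves the cart equation (Eq. 17) and the pole equation for the two accelerations and steps the open-loop system forward in time. The pole equation is used in the form that follows directly from the Lagrangian of Eq. 10, in which the cross terms in the cart and pole velocities cancel. The values of M, m and l are the rig values used for the simulations in this report; the function names, the fixed-step Euler integrator, the step size and the initial angle are assumptions made for illustration and are not taken from the Simulink model.

```python
import math

# Rig parameters used in the report's simulations (g is standard gravity).
M, m, l, g = 1.2, 0.11, 0.4, 9.81


def accelerations(theta, theta_dot, f):
    """Solve the two coupled equations of motion for (y_ddot, theta_ddot).

    (M + m) y'' + m l cos(th) th'' - m l th'^2 sin(th) = f
    cos(th) y'' + l th'' - g sin(th) = 0   (pole equation divided by m l)
    """
    s, c = math.sin(theta), math.cos(theta)
    y_ddot = (f + m * l * theta_dot ** 2 * s - m * g * s * c) / (M + m * s * s)
    theta_ddot = (g * s - c * y_ddot) / l
    return y_ddot, theta_ddot


def simulate_open_loop(theta0=0.01, t_end=2.0, dt=1e-3):
    """Euler integration with zero control force; returns the angle history."""
    y, y_dot, theta, theta_dot = 0.0, 0.0, theta0, 0.0
    history = [theta]
    for _ in range(int(round(t_end / dt))):
        y_ddot, theta_ddot = accelerations(theta, theta_dot, 0.0)
        y += dt * y_dot
        y_dot += dt * y_ddot
        theta += dt * theta_dot
        theta_dot += dt * theta_ddot
        history.append(theta)
    return history


if __name__ == "__main__":
    angles = simulate_open_loop()
    largest = max(abs(a) for a in angles)
    print("largest |angle| reached (deg):", math.degrees(largest))
```

Starting from a small initial angle with no control force, the angle grows away from the upright position, which is the open-loop instability described in this chapter. The sketch has no ground or end stops, so it does not reproduce the pendulum coming to rest after it falls.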

Some of the modelling and control techniques involved in the project are linear so these
equations must be linearized. Assuming that θ is kept small, the equations are linearized by
approximating cos θ ≈ 1 and sin θ ≈ θ. The quadratic terms are also
negligible. Therefore the two linear system equations are

$$\ddot{y} = \frac{f}{M} - \frac{m g}{M}\,\theta \qquad \text{(Eq.19)}$$

$$\ddot{\theta} = -\frac{f}{M l} + \left(\frac{M+m}{M l}\right) g\,\theta \qquad \text{(Eq.20)}$$
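
The intermediate algebra can be sketched as follows. With cos θ ≈ 1, sin θ ≈ θ and the products of small quantities dropped, the two non-linear equations reduce to a pair of linear equations in the two accelerations, which are then solved simultaneously:

```latex
\begin{aligned}
(M+m)\,\ddot{y} + m l\,\ddot{\theta} &= f \\
m l\,\ddot{y} + m l^{2}\,\ddot{\theta} - m g l\,\theta &= 0
  \quad\Rightarrow\quad \ddot{y} = g\,\theta - l\,\ddot{\theta} \\
(M+m)\left(g\,\theta - l\,\ddot{\theta}\right) + m l\,\ddot{\theta} &= f
  \quad\Rightarrow\quad \ddot{\theta} = -\frac{f}{M l} + \frac{M+m}{M l}\, g\,\theta \\
\ddot{y} = g\,\theta - l\,\ddot{\theta} &= \frac{f}{M} - \frac{m g}{M}\,\theta
\end{aligned}
```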

At this stage, a set of equations (linear and non-linear) describing the inverted pendulum has
been developed. The next stage is constructing a Simulink model of the inverted pendulum
system. There is no fixed procedure for developing Simulink models from dynamical state
equations. The diagram below is the linear pendulum model. This model is constructed using
integrators, gain blocks, etc. The model (Fig. 2) is simply a Simulink representation of the
linear state equations.

Fig. 2 : Simulink model of the linear pendulum system

The non-linear pendulum system (Fig. 4) is shown further below. The non-linear system,
even though it is more complicated, is developed in a similar manner. Both models are large, so
they are encapsulated in the subsystem blocks shown below (Fig. 3). Both models
are set up using a mask. The mask makes it possible to change the values of m, l, g, etc. for
different simulations. The mass of the cart, M, is set to 1.2 kg, the mass of the pendulum is set
to 0.11 kg and the length of the pendulum is 0.4 metres. These figures are taken from the real
time inverted pendulum rig.

Fig. 3 : Simulink blocks of the pendulum systems

The following Simulink diagram is the non-linear pendulum model.

Fig. 4 : Simulink model of the nonlinear pendulum system

Both pendulum models are simulated in Simulink. The angle of the pendulum is shown below
(Fig. 5). The simulation shows that the pendulum becomes unstable and falls over.

Fig. 5: Open loop response of the inverted pendulum (pendulum angle in degrees versus time)

One of the requirements in system identification is the collection of ‘information rich’
input/output data. The graph above (Fig. 5) of the pendulum angle does not give us enough
information on the pendulum system, because the pendulum falls over too quickly. In order to
adequately model the inverted pendulum it is necessary to stabilize it using a feedback
controller. Using a feedback controller, the output data will contain more information
describing the process. [2]

A full-state feedback controller is developed to stabilize the linear pendulum system. The
linear system could have been stabilized using many different methods (PID, etc.). The
full-state feedback controller stabilizes the system by positioning the closed loop poles in the
stable region. The Simulink model with controller is shown below (Fig. 6).

Fig. 6 : Simulink diagram of Linear Pendulum and controller
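
The pole-placement idea can be sketched in a few lines of Python. For the linear model of Eq. 19 and Eq. 20 with the control law f = -(k1·y + k2·ẏ + k3·θ + k4·θ̇), the closed-loop characteristic polynomial can be matched coefficient by coefficient to a desired polynomial, which gives the four gains in closed form. The desired pole locations (all four at s = -2), the gain names, the Euler integration and the initial condition are assumptions chosen for illustration; the report does not state the gains used in the Simulink model.

```python
# Rig parameters used in the report's simulations.
M, m, l, g = 1.2, 0.11, 0.4, 9.81


def place_poles(a3, a2, a1, a0):
    """Gains for f = -(k1*y + k2*y_dot + k3*theta + k4*theta_dot).

    With Eq. 19 and Eq. 20 the closed-loop characteristic polynomial is
      s^4 + (k2/M - k4/(M*l)) s^3 + (k1/M - k3/(M*l) - b) s^2
          - (g/(M*l)) k2 s - (g/(M*l)) k1,   with b = (M + m) g / (M l),
    which is matched to s^4 + a3 s^3 + a2 s^2 + a1 s + a0.
    """
    k1 = -a0 * M * l / g
    k2 = -a1 * M * l / g
    k3 = l * k1 - M * l * a2 - (M + m) * g
    k4 = l * k2 - M * l * a3
    return k1, k2, k3, k4


def simulate_closed_loop(gains, theta0=0.1, t_end=10.0, dt=1e-3):
    """Euler simulation of the linear model under full-state feedback."""
    k1, k2, k3, k4 = gains
    y, y_dot, theta, theta_dot = 0.0, 0.0, theta0, 0.0
    for _ in range(int(round(t_end / dt))):
        f = -(k1 * y + k2 * y_dot + k3 * theta + k4 * theta_dot)
        y_ddot = f / M - m * g / M * theta                          # Eq. 19
        theta_ddot = -f / (M * l) + (M + m) * g / (M * l) * theta   # Eq. 20
        y, y_dot = y + dt * y_dot, y_dot + dt * y_ddot
        theta, theta_dot = theta + dt * theta_dot, theta_dot + dt * theta_ddot
    return y, theta


# Hypothetical choice: all four closed-loop poles at s = -2,
# i.e. (s + 2)^4 = s^4 + 8 s^3 + 24 s^2 + 32 s + 16.
gains = place_poles(8.0, 24.0, 32.0, 16.0)

if __name__ == "__main__":
    print("gains (k1, k2, k3, k4):", gains)
    print("final cart position and angle:", simulate_closed_loop(gains))
```

Faster pole locations give a quicker response at the cost of larger control forces; the values here only demonstrate that moving all poles into the left half-plane returns both the cart position and the angle to zero.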

The linear pendulum system is simulated, and the angle of the pendulum is shown below (Fig. 7).
The controller keeps the pendulum angle stable, so the pendulum can be
simulated for longer times. The data is also of better quality for system identification
purposes.

Fig. 7: Closed loop response of the inverted pendulum with controller (pendulum angle in degrees versus time)

Developing a controller for the non-linear pendulum is more difficult. Linear control
techniques such as PID and full-state feedback were tested but had no success in controlling
the non-linear pendulum. A feedback linearisation controller was developed to control the
non-linear pendulum system. Feedback linearisation cancels the non-linearities in the
pendulum system so that the closed loop system is more linear.

The following equations are a control law developed for the inverted pendulum controller. The
first four equations (Eq. 21-24) are entered into the main equation. The main equation (Eq. 25)
calculates the required force, U to keep the pendulum stable.

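$$h_1 = \frac{3}{4l}\, g \sin\theta \qquad \text{(Eq.21)}$$

$$h_2 = \frac{3}{4l} \cos\theta \qquad \text{(Eq.22)}$$

$$f_1 = m\left( l \sin\theta\,\dot{\theta}^{2} - \frac{3}{8}\, g \sin 2\theta \right) - f_x\,\dot{x} \qquad \text{(Eq.23)}$$

$$f_2 = M + m\left( 1 - \frac{3}{4} \cos^{2}\theta \right) \qquad \text{(Eq.24)}$$

$$u = \frac{f_2}{h_2}\left[ h_1 + k_1(\theta - \theta_d) + k_2\,\dot{\theta} + c_1(x - x_d) + c_2\,\dot{x} \right] - f_1 \qquad \text{(Eq.25)}$$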

For the simulations M, m, l, g are set to the values of the pendulum model. The following
numeric values are used: M = 1.2 kg, m = 0.1 kg, l = 0.4 m, g = 9.81 m/s², k1 = 25, k2 = 10,
c1 = 1, c2 = 2.6. Also xd = 0 metres and θd = 0 rad, which are the desired position of the cart
and angle of the pendulum respectively. For details on all the parameters see [4]. A Simulink
model of the above control law was developed and is shown in Figure 8.
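
To make the structure of the control law concrete, the following Python sketch evaluates Eq. 21-25 for one set of measured states, using the numeric values quoted above. The coefficient f_x that multiplies the cart velocity in Eq. 23 is not given a value in this chapter, so the value 0.0 used here is a placeholder assumption; the function and variable names are likewise hypothetical.

```python
import math

# Values quoted in the text.
M, m, l, g = 1.2, 0.1, 0.4, 9.81
k1, k2, c1, c2 = 25.0, 10.0, 1.0, 2.6
x_d, theta_d = 0.0, 0.0
f_x = 0.0  # assumption: no value is stated in the report for this coefficient


def control_force(x, x_dot, theta, theta_dot):
    """Evaluate the control law (Eq. 21-25) for the four measured states."""
    h1 = 3.0 / (4.0 * l) * g * math.sin(theta)                  # Eq. 21
    h2 = 3.0 / (4.0 * l) * math.cos(theta)                      # Eq. 22
    f1 = (m * (l * math.sin(theta) * theta_dot ** 2
               - 3.0 / 8.0 * g * math.sin(2.0 * theta))
          - f_x * x_dot)                                        # Eq. 23
    f2 = M + m * (1.0 - 3.0 / 4.0 * math.cos(theta) ** 2)       # Eq. 24
    bracket = (h1 + k1 * (theta - theta_d) + k2 * theta_dot
               + c1 * (x - x_d) + c2 * x_dot)
    return f2 / h2 * bracket - f1                               # Eq. 25


if __name__ == "__main__":
    # Cart at the origin, pendulum tilted by 0.1 rad, everything at rest.
    print("control force:", control_force(0.0, 0.0, 0.1, 0.0))
```

Because Eq. 25 divides by h2, which is proportional to cos θ, the expression is not defined when the pendulum is horizontal; the sketch does not guard against that case.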

The inputs to this controller are the 4 output states of the non-linear pendulum model. The
control law calculates the magnitude and direction of the force needed to keep the pendulum stable.

Fig. 8 : Simulink model of the nonlinear control law.

The following diagram (Fig.9) shows the set-up of the non-linear pendulum with control law.

Fig. 9: Simulink diagram of the nonlinear system with control law.

Figure 10 is the closed loop pendulum angle plotted by MATLAB. The closed loop response is
stable and shows that the control law is working.

Fig. 10: Closed loop response of the nonlinear pendulum with controller (pendulum angle in degrees, axis scaled by 10^-3, versus time)

The linear and nonlinear models of the cart-pole system have been developed and simulated. It
was found that the system is open loop unstable. For accurate system identification the process
must be stable, so standard feedback controllers were developed and tested. The
next chapter in the report discusses the theory and operation of artificial neural networks.

3 Artificial Neural Networks
The science of artificial neural networks is based on the neuron. In order to understand the
structure of artificial networks, the basic elements of the neuron should be understood.
Neurons are the fundamental elements in the central nervous system. The diagram below (Fig.
11) shows the components of a neuron. [5]

Fig. 11: The diagram shows the basic elements of a neuron

A neuron is made up of 3 main parts: dendrites, cell body and axon. The dendrites receive
signals coming from the neighbouring neurons. The dendrites send their signals to the body of
the cell. The cell body contains the nucleus of the neuron. If the sum of the received signals is
greater than a threshold value, the neuron fires by sending an electrical pulse along the axon to
the next neuron. The following model is based on the components of the biological neuron
(Fig. 12). The inputs X0-X3 represent the dendrites. Each input is multiplied by one of the
weights W0-W3. The output of the neuron model, Y, is a function, F, of the weighted sum of
the input signals.

Fig. 12: Diagram of neuron model

Advantages of ANN’s
1. The main advantage of neural networks is that it is possible to train a neural network to
   perform a particular function by adjusting the values of connections (weights) between
   elements. For example, if we wanted to train a neuron model to approximate a specific
   function, the weights that multiply each input signal would be updated until the output
   from the neuron is similar to the function.

2. Neural networks are composed of elements operating in parallel. Parallel processing
   allows increased speed of calculation compared to slower sequential processing.

Fig. 13: Diagram shows the parallelism of neural networks

3. Artificial neural networks (ANN) have memory. The memory in neural networks
   corresponds to the weights in the neurons. Neural networks can be trained offline and
   then transferred into a process where adaptive learning takes place. In our case, a neural
   network controller could be trained to control an inverted pendulum system offline, say
   in the Simulink environment. After training, the network weights are set. The ANN is
   placed in a feedback loop with the actual process. The network will adapt the weights to
   improve performance as it controls the pendulum system.

The main disadvantage of ANN is they operate as black boxes. The rules of operation in
neural networks are completely unknown. It is not possible to convert the neural structure into
known model structures such as ARMAX, etc. Another disadvantage is the amount of time
taken to train networks. It can take considerable time to train an ANN for certain functions.

Types of Learning
Neural networks have 3 main modes of learning: supervised, reinforcement and unsupervised
learning [6]. In supervised learning the output from the neural network is compared with a set
of targets, and the error signal is used to update the weights in the neural network. Reinforcement
learning is similar to supervised learning, but no targets are given; instead the algorithm is
given a grade of the ANN performance. Unsupervised learning updates the weights based on
the input data only. The ANN learns to cluster different input patterns into different classes.

Neural network structures
There are 3 main types of ANN structures: single-layer feedforward networks, multi-layer
feedforward networks and recurrent networks [7]. The most common type of single-layer
feedforward network is the perceptron. Other types of single-layer networks are based on the
perceptron model. The details of the perceptron are shown below (Fig. 14).

Fig. 14: Diagram of the perceptron model (inputs x0, x1, x2)

Inputs to the perceptron are individually weighted and then summed. The perceptron computes
the output as a function F of the sum. The activation function, F is needed to introduce nonlinearities into the network. This makes multi-layer networks powerful in representing
nonlinear functions.

There are 3 main types of activation function: tan-sigmoid, log-sigmoid and linear [8].
Different activation functions affect the performance of an ANN.

(Plots of the log-sigmoid function, the tan-sigmoid function and the linear function.)
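
For reference, the three activation functions can be written out as a short Python sketch. The function names follow the corresponding MATLAB Neural Network Toolbox transfer functions (logsig, tansig, purelin); the sample inputs are arbitrary.

```python
import math


def logsig(n):
    """Log-sigmoid: squashes any real input into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-n))


def tansig(n):
    """Tan-sigmoid: squashes any real input into (-1, 1); equal to tanh(n)."""
    return math.tanh(n)


def purelin(n):
    """Linear: passes the input through unchanged."""
    return n


if __name__ == "__main__":
    for n in (-2.0, 0.0, 2.0):
        print(n, logsig(n), tansig(n), purelin(n))
```

The two sigmoid functions saturate for large inputs, which is what introduces the non-linearity, while the linear function is typically used in an output layer so that the network output is not restricted to a fixed range.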

The output from the perceptron is

$$y[k] = f\left( w^{T}[k]\, x[k] \right) \qquad \text{(Eq.26)}$$

The weights are dynamically updated using the back propagation algorithm. The difference
between the target output and the actual output (error) is calculated.

$$e[k] = T[k] - y[k] \qquad \text{(Eq.27)}$$

The errors are back propagated through the layers and the weight changes are made. The
formula for adjusting the weights is

$$w[k+1] = w[k] + \mu\, e[k]\, x[k] \qquad \text{(Eq.28)}$$

Once the weights are adjusted, the feed-forward process is repeated. The weights are adapted
until the error between the target and actual output is low. The approximation of the function
improves as the error decreases. Single-layer feedforward networks are useful when the data
to be trained is linearly separable. If the data we are trying to model is not linearly separable or
the function has complex mappings, the simple perceptron will have trouble trying to model
the function adequately.
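
The update loop of Eq. 26-28 can be sketched for a single neuron as follows. A linear activation is assumed here, so that f in Eq. 26 is the identity; the target function, the learning rate µ = 0.1, the number of epochs and the function name are hypothetical choices made only to give a small, reproducible example.

```python
def train_linear_neuron(samples, mu=0.1, epochs=200):
    """Train y = w . x with the update w[k+1] = w[k] + mu * e[k] * x[k].

    Each sample is (inputs, target); inputs[0] is a constant 1.0 that acts
    as a bias input. A linear activation is assumed (f is the identity).
    """
    w = [0.0] * len(samples[0][0])
    for _ in range(epochs):
        for x, target in samples:
            y = sum(wi * xi for wi, xi in zip(w, x))         # Eq. 26
            e = target - y                                    # Eq. 27
            w = [wi + mu * e * xi for wi, xi in zip(w, x)]    # Eq. 28
    return w


# Hypothetical target function: T = 0.5 + 2*x1 - x2 on a small grid of inputs.
grid = [-1.0, 0.0, 1.0]
samples = [([1.0, x1, x2], 0.5 + 2.0 * x1 - x2) for x1 in grid for x2 in grid]
weights = train_linear_neuron(samples)

if __name__ == "__main__":
    print("learned weights:", weights)
```

Because the target here is itself a linear function of the inputs, a single neuron can represent it exactly and the weights settle on the coefficients of the target. For a target that is not linear in the inputs the same loop would only reach the best linear fit, which is the limitation described above.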

Multi-layered perceptrons
Neural networks can have several layers. There are 2 main types of multi-layer networks:
feedforward and recurrent. In feedforward networks the direction of signals is from input to
output; there is no feedback in the layers. The diagram below (Fig. 15) shows a 3-layered
feedforward network.

Fig. 15: Diagram of a multi-layered perceptron (input layer, hidden layer, output layer)

Increasing the number of neurons in the hidden layer or adding more hidden layers to the
network allows the network to deal with more complex functions. Cybenko’s theorem states
that, “A feedforward neural network with a sufficiently large number of hidden neurons with
continuous and differentiable transfer functions can approximate any continuous function over
a closed interval.” [9] The weights in MLPs are updated using the backpropagation learning
algorithm [10]. There are two passes before the weights are updated.
In the first pass (forward pass) the outputs of all neurons are calculated by multiplying the
input vector by the weights. The error is calculated for each of the output layer neurons.
In the backward pass, the error is passed back through the network layer by layer. The weights
are adjusted according to the gradient descent rule, so that the actual output of the MLP moves
closer to the desired output. A momentum term can be added, which allows a larger learning
rate to be used without losing stability.
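
The forward and backward passes can be illustrated with a minimal Python sketch of a network with one hidden layer. It assumes tan-sigmoid hidden neurons, a linear output neuron, per-sample gradient descent and no momentum term; the network size, learning rate, random seed and training data (samples of a sine wave) are hypothetical and are not the settings used in this project.

```python
import math
import random


def train_mlp(samples, hidden=5, lr=0.05, epochs=500, seed=0):
    """Train a 1-input, 1-output MLP (tanh hidden layer, linear output).

    Returns (predict, mse_before, mse_after).
    """
    rng = random.Random(seed)
    w1 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]  # input -> hidden weights
    b1 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]  # hidden biases
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]  # hidden -> output weights
    b2 = [0.0]                                            # output bias

    def forward(x):
        h = [math.tanh(w1[j] * x + b1[j]) for j in range(hidden)]
        return h, sum(w2[j] * h[j] for j in range(hidden)) + b2[0]

    def predict(x):
        return forward(x)[1]

    def mse():
        return sum((t - predict(x)) ** 2 for x, t in samples) / len(samples)

    mse_before = mse()
    for _ in range(epochs):
        for x, t in samples:
            h, y = forward(x)                    # forward pass
            e = t - y                            # error at the output neuron
            for j in range(hidden):              # backward pass
                # error sent back through hidden neuron j (uses the old w2[j])
                delta = e * w2[j] * (1.0 - h[j] ** 2)
                w2[j] += lr * e * h[j]
                w1[j] += lr * delta * x
                b1[j] += lr * delta
            b2[0] += lr * e
    return predict, mse_before, mse()


# Hypothetical training set: 21 samples of sin(pi * x) on [-1, 1].
samples = [(i / 10.0, math.sin(math.pi * i / 10.0)) for i in range(-10, 11)]
predict, mse_before, mse_after = train_mlp(samples)

if __name__ == "__main__":
    print("MSE before training:", mse_before)
    print("MSE after training: ", mse_after)
```

The term (1 - h²) is the derivative of the tan-sigmoid function, which is why the theorem quoted above requires differentiable transfer functions for this kind of training.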

The second type of multi-layer network is recurrent (Fig. 16). Recurrent networks have at
least one feedback loop. This means an output of a layer feeds back to a preceding layer.

Fig. 16: Diagram of a recurrent neural network (input layer, hidden layer, output layer)

This gives the network partial memory due to the fact that the hidden layer receives data at
time t but also at time t-1. This makes recurrent networks powerful in approximating functions
depending on time. [11] The Simulink model for the nonlinear inverted pendulum shows that
there are many feedback loops. This means the next state of the model depends on previous
states. It is expected that to accurately model this type of dynamic system, a recurrent neural
network with feedback loops will perform better than a static feedforward network.
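
The effect of the feedback loop can be seen in a small Python sketch of a recurrent layer in which the hidden outputs from time t-1 are fed back as extra inputs at time t. The layer size, the fixed weights and the class name are hypothetical and chosen only to make the example concrete; no training is performed.

```python
import math


class TinyRecurrentLayer:
    """One input, two tanh hidden neurons with feedback, one linear output."""

    def __init__(self):
        # Hypothetical fixed weights, for illustration only.
        self.w_in = [0.5, -0.3]                   # input -> hidden
        self.w_rec = [[0.2, 0.1], [-0.4, 0.3]]    # hidden(t-1) -> hidden(t)
        self.w_out = [1.0, -1.0]                  # hidden -> output
        self.h = [0.0, 0.0]                       # hidden state kept between steps

    def reset(self):
        self.h = [0.0, 0.0]

    def step(self, x):
        new_h = []
        for j in range(2):
            feedback = sum(self.w_rec[j][i] * self.h[i] for i in range(2))
            new_h.append(math.tanh(self.w_in[j] * x + feedback))
        self.h = new_h
        return sum(wo * hj for wo, hj in zip(self.w_out, self.h))


if __name__ == "__main__":
    net = TinyRecurrentLayer()
    # The same input applied twice gives two different outputs,
    # because the second step also sees the hidden state from the first.
    print(net.step(1.0), net.step(1.0))
```

A static feedforward network would return the same output for the same input at every step; the stored hidden state is what lets the recurrent layer represent a system whose next state depends on previous states.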

4 System Identification
System identification is the process of developing a mathematical model of a dynamic system
based on the input and output data from the actual process. [12] This means it is possible to
sample the input and output signals of a system and using this data generate a mathematical
model. An important stage in control system design is the development of a mathematical
model of the system to be controlled. In order to develop a controller, it must be possible to
analyse the system to be controlled and this is done using a mathematical model. Another
advantage of system identification is evident if the process is changed or modified. System
identification allows the real system to be altered without having to calculate the dynamical
equations and model the parameters again.

System identification is concerned with developing models. The diagram below (Fig. 17)
shows the inputs and output of a system.

Fig. 17: System showing input, disturbance and output signals

The mathematical model in this case is the black box; it describes the relationship between the
input and output signals. The inverted pendulum system is a non-linear process. To adequately
model it, non-linear methods using neural networks must be used. Previous studies in system
identification have demonstrated that neural networks are successful in modelling many
non-linear systems [13]. Before neural networks are investigated for identification, linear
techniques such as auto regressive with exogenous input (ARX) and auto regressive moving
average with exogenous input (ARMAX) will be applied to the linear inverted pendulum
model.

System identification procedure
Basically system identification is achieved by adjusting the parameters of the model until the
model output is similar to the output of the real system. Below is a diagram (Fig. 18)
explaining the system identification procedure. [14]

Fig. 18: Diagram of the system identification procedure

There are three main steps in the system identification procedure.
1. The first step is to generate some experimental input/output data from the process we
   are trying to model. In the case of the inverted pendulum system this would be the
   input force on the cart and the output pendulum angle.

2. The next step is to choose a model structure to use. For example the following model
   structure is the ARX.

   $$A\,y(t) = B\,u(t) + e(t) \qquad \text{(Eq.29)}$$

3. The parameters A and B will be adjusted until this model output is similar to the output
   of the process. In identification, there is no perfect model structure to use. Models can
   be developed using engineering intuition or a priori knowledge of the process we are
   trying to model.
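
The three steps above can be sketched in Python for the simplest possible case. The example assumes a first-order ARX structure, y[t] = a·y[t-1] + b·u[t-1] + e[t], and estimates a and b by least squares from recorded input/output data. The "true" process values, the input signal and the function name are hypothetical and serve only to generate noise-free test data; they are unrelated to the pendulum rig.

```python
import math


def fit_first_order_arx(u, y):
    """Least-squares fit of y[t] = a*y[t-1] + b*u[t-1] + e[t].

    In the notation of Eq. 29 this corresponds to A = 1 - a*q^-1 and
    B = b*q^-1, where q^-1 is a one-sample delay (an assumed structure).
    """
    syy = syu = suu = sy = su = 0.0
    for t in range(1, len(y)):
        syy += y[t - 1] * y[t - 1]
        syu += y[t - 1] * u[t - 1]
        suu += u[t - 1] * u[t - 1]
        sy += y[t - 1] * y[t]
        su += u[t - 1] * y[t]
    det = syy * suu - syu * syu  # determinant of the 2x2 normal equations
    a = (sy * suu - su * syu) / det
    b = (su * syy - sy * syu) / det
    return a, b


# Step 1: generate input/output data from a hypothetical known process.
a_true, b_true = 0.9, 0.2
u = [math.sin(0.3 * t) + 0.5 * math.sin(1.1 * t) for t in range(200)]
y = [0.0]
for t in range(1, 200):
    y.append(a_true * y[t - 1] + b_true * u[t - 1])

# Steps 2 and 3: choose the ARX structure and adjust its parameters.
a_hat, b_hat = fit_first_order_arx(u, y)

if __name__ == "__main__":
    print("estimated a, b:", a_hat, b_hat)
```

The input is a sum of two sinusoids so that the data excite the process sufficiently; a constant input would make the two regressors nearly indistinguishable and the estimate unreliable, which is the ‘information rich’ data requirement in a small form.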