
FRAMEWORK FOR JOINT DATA RECONCILIATION AND
PARAMETER ESTIMATION

JOE YEN YEN
(B.Eng.(Hons.), NUS)

A THESIS SUBMITTED
FOR THE DEGREE OF MASTER OF ENGINEERING
DEPARTMENT OF ELECTRICAL
AND COMPUTER ENGINEERING
NATIONAL UNIVERSITY OF SINGAPORE
2004


ACKNOWLEDGMENTS

I would like to express my gratitude to my supervisors, Dr Arthur Tay and A/Professor
Ho Weng Khuen of the NUS ECE Department, and Professor Ching Chi Bun of the
Institute of Chemical and Engineering Sciences (ICES), for their advice and guidance,
and for their confidence in me throughout this research.

The support of ICES in financing the research and providing the research environment is
gratefully acknowledged. The guidance and assistance of Dr Liu Jun of ICES is also
appreciated.

Most of the research work was done in the Process Systems Engineering (PSE) Group of
the Department of Chemical Engineering at the University of Sydney. The guidance of
Professor Jose Romagnoli and Dr David Wang was invaluable in directing and improving the
quality of this work. I would also like to thank them for accommodating me with such
hospitality that my stay in the group was not only fruitful but also enjoyable. Thanks also
go to the other members of the PSE Group for sharing their research knowledge and experience
and for making me feel part of the group.

My friend Zhao Sumin saw to my well-being in Sydney and, most importantly, was a
like-minded confidante from whom I drew inspiration in carrying out my research work.
I would like to dedicate this thesis to her; with her persistent encouragement, I feel
that the effort in completing this thesis is partly hers.



TABLE OF CONTENTS

ACKNOWLEDGMENTS............................................................................................................................. I
TABLE OF CONTENTS.............................................................................................................................II
SUMMARY ................................................................................................................................................ IV
LIST OF TABLES ..................................................................................................................................... VI
LIST OF FIGURES ..................................................................................................................................VII
CHAPTER 1: INTRODUCTION ................................................................................................................1
1.1. MOTIVATION ........................................................................................................................................1
1.2. CONTRIBUTION .....................................................................................................................................3
1.3. THESIS ORGANIZATION ........................................................................................................................4
CHAPTER 2: THEORY & LITERATURE REVIEW..............................................................................6
2.2. DATA RECONCILIATION (DR)...............................................................................................................6
2.3. JOINT DATA RECONCILIATION – PARAMETER ESTIMATION (DRPE) ..................................................11
2.4. ROBUST ESTIMATION .........................................................................................................................15
2.5. PARTIALLY ADAPTIVE ESTIMATION ...................................................................................................22
2.6. CONCLUSION ......................................................................................................................................24
CHAPTER 3: JOINT DRPE BASED ON THE GENERALIZED T DISTRIBUTION .......................26
3.1. INTRODUCTION ...................................................................................................................................26
3.2. THE GENERALIZED T (GT) DISTRIBUTION..........................................................................................26

3.3. ROBUSTNESS OF THE GT DISTRIBUTION .............................................................................................29
3.4. PARTIALLY ADAPTIVE GT-BASED ESTIMATOR...................................................................................30
3.5. THE GENERAL ALGORITHM ................................................................................................................32
3.6. CONCLUSION ......................................................................................................................................35
CHAPTER 4: CASE STUDY .....................................................................................................................36
4.1. INTRODUCTION ...................................................................................................................................36
4.2. THE GENERAL-PURPOSE CHEMICAL PLANT .......................................................................................36
4.3. VARIABLES AND MEASUREMENTS ......................................................................................................40
4.4. SYSTEM DECOMPOSITION: VARIABLE CLASSIFICATION .....................................................................43
4.4.1. Derivation of Reduced Equations ..............................................................................................43
4.4.2. Classification of Measurements .................................................................................................45
4.5. DATA GENERATION ............................................................................................................................47
4.5.1. Simulation Data .........................................................................................................................47
4.5.2. Real Data ...................................................................................................................................48
4.6. METHODS COMPARED ........................................................................................................................48
4.7. PERFORMANCE MEASURES .................................................................................................................50
4.8. CONCLUSION ......................................................................................................................................51
CHAPTER 5: RESULTS & DISCUSSION ..............................................................................................53
5.1. INTRODUCTION ...................................................................................................................................53
5.2. PARTIAL ADAPTIVENESS & EFFICIENCY .............................................................................................53
5.3. EFFECT OF OUTLIERS ON EFFICIENCY .................................................................................................64
5.4. EFFECT OF DATA SIZE ON EFFICIENCY ...............................................................................................67
5.5. ESTIMATION OF ADAPTIVE ESTIMATOR PARAMETERS: PRELIMINARY ESTIMATORS AND ITERATIONS
..................................................................................................................................................................71
5.6. REAL DATA APPLICATION ..................................................................................................................76
5.7. CONCLUSION ......................................................................................................................................79




CHAPTER 6: CONCLUSION ...................................................................................................................80
6.1. FINDINGS ............................................................................................................................................80
6.2. FUTURE WORKS .................................................................................................................................82
REFERENCES ............................................................................................................................................83
AUTHOR’S PUBLICATIONS...................................................................................................................85
APPENDIX A ..............................................................................................................................................86
APPENDIX B...............................................................................................................................................95



SUMMARY

Objective knowledge about a process is essential for process monitoring, optimization,
identification and other general management planning. Since measurements of process
states always contain some type of error, it is necessary to correct these measurement
data to obtain more accurate information about the process. Data reconciliation is such an
error correction procedure: it uses estimation theory and the conservation laws
within the process to improve the accuracy of the measurement data and to estimate the
values of unmeasured variables, such that reliable and complete information about the
process is obtained.

Conventional data reconciliation, and other procedures that involve estimation, have
relied on the assumption that observation errors are normally distributed. The inevitable
presence of gross errors and outliers violates this assumption. In addition, the actual
underlying distribution is not known exactly and may not be normal. Various robust
approaches such as the M-estimators have been proposed, but most assume, a priori,
yet other forms of distribution, albeit with thicker tails than the normal distribution
in order to suppress gross errors and outliers. To address the question of how well the
assumed distribution suits the actual one, a posteriori estimation of the actual distribution,
based on non-parametric methods such as kernel, wavelet and elliptical basis function
estimators, has then been proposed. However, these fully adaptive methods are complex and
computationally demanding. An alternative is to strike a balance between the simplicity of
the parametric approach and the flexibility of the non-parametric approach, i.e. by adopting a
generalized objective function that covers a wide variety of distributions. The parameters
of the generalized distribution can be estimated a posteriori to ensure its suitability to the
data.

This thesis proposes the use of a generalized distribution, namely the Generalized T (GT)
distribution, in the joint estimation of process states and model parameters. The desirable
properties of the GT-based estimator are its robustness, simplicity, flexibility and
efficiency over the wide range of commonly encountered distributions (including the Box-Tiao
and t-distributions) that belong to the GT distribution family. To achieve estimation
efficiency, the parameters of the GT distribution are adapted to the data through
preliminary estimation. The strategy is applied to data from both the virtual version and a
trial run of a chemical engineering pilot plant. The results confirm the robustness and
efficiency of the estimator.



LIST OF TABLES
Table 5.1. MSE of Measurements..................................................................................... 57
Table 5.2. Estimated parameters of the partially adaptive estimators used to generate
Figure 5.5 .................................................................................................................. 63
Table 5.3. MSE of Measurements with Outliers............................................................... 64

Table 5.4. Reconciled Data and Estimated Parameter Values Using Different DRPE
Methods..................................................................................................................... 78



LIST OF FIGURES
Figure 2.1. Plots of Influence Function for Weighted Least Square Estimator (dashed
line) and the Robust Estimator based on Bivariate Normal Distribution (solid line) ...... 21
Figure 2.2 Partially Adaptive Estimation Scheme............................................................ 24
Figure 3.1. Plot of GT density functions for various settings of distribution parameters p
and q.......................................................................................................................... 27
Figure 3.2. GT Distribution Family Tree, Depicting the Relationships among Some
Special Cases of the GT Distribution........................................................................ 28
Figure 3.3. Plots of Influence Function for GT-based Estimator with different parameter
settings ...................................................................................................................... 30
Figure 3.4. General Algorithm for Joint DRPE using partially adaptive GT-based
estimator.................................................................................................................... 33
Figure 4.1. Flow Diagram of the General Purpose Plant for Application Case Study ..... 37
Figure 4.2. Simulink Model of the General Purpose Plant in Figure 4.1. ........................ 38
Figure 4.3. Configuration of the General Purpose Plant for Trial Run............................. 39
Figure 4.4. Reactor 1 Configuration Details and Measurements...................................... 39
Figure 4.5. Reactor 2 Configuration Details and Measurements...................................... 40
Figure 5.1. MSE Comparison of GT-based with Weighted Least Squares and
Contaminated Normal Estimators............................................................................. 57
Figure 5.2. Percentage of Relative MSE........................................................................... 58
Figure 5.3. Comparison of the Accuracy of Estimates for the overall heat transfer
coefficient of Reactor 1 cooling coil......................................................................... 60
Figure 5.4. Comparison of the Accuracy of Estimates for the overall heat transfer
coefficient of Reactor 2 cooling coil......................................................................... 61

Figure 5.5. Adaptation to Data: Fitting the Relative Frequency of Residuals with GT,
Contaminated Normal, and Normal distributions..................................................... 62
Figure 5.6. MSE of Variable Estimates for Data with Outliers ........................................ 66
Figure 5.7. Comparison of MSE with and without outliers for GT and Contaminated
Normal Estimators .................................................................................................... 66
Figure 5.8. MSE Results of WLS, Contaminated Normal and GT-based estimators for
Different Data Sizes.................................................................................................. 68
Figure 5.9. Improvement in MSE Efficiency when Data Size is increased...................... 70
Figure 5.10. Iterative Joint DRPE with Preliminary Estimation ...................................... 72
Figure 5.11. Final MSE Comparison for GT-based DRPE method Using GT, Median and
WLS as preliminary estimators................................................................................. 74
Figure 5.12. MSE throughout iterations ........................................................................... 75
Figure 5.13. Scaled Histogram of Data and Density Plots of GT, Contaminated Normal
and Normal Distributions.......................................................................................... 77



CHAPTER 1: INTRODUCTION

1.1. Motivation
The continuously increasing demand for higher product quality and stricter compliance with
environmental and safety regulations requires the performance of a process to be
continuously improved through process modifications (Romagnoli and Sanchez, 2000).
Decision making associated with these process modifications requires accurate and
objective knowledge of the process state. This knowledge of the process state is obtained
by interpreting the data generated by the process control systems. The modern-day
Distributed Control System (DCS) is capable of high-frequency sampling, resulting in
vast amounts of data to be interpreted, be it for the purpose of process monitoring,
optimization or other general management planning. Since measurement data always
contain some type of error, it is necessary to correct their values in order to obtain
accurate information about the process.

Data reconciliation (DR) is such an error-correction procedure: it improves the accuracy
of the measurement data and estimates the values of unmeasured variables, such that reliable
and complete information about the process is obtained. It makes use of conservation
equations and other system/model equations to correct the measurement data, i.e. by
adjusting the measurements such that the adjusted data are consistent with these
equations. The conventional data reconciliation approach is least squares
minimization, whereby the squares of the adjustments to the measurements are minimized
while the adjusted values are constrained to satisfy the system/model
equations. The least squares method is simple and reasonably efficient; in fact, it is the
best linear unbiased estimator, the most efficient in terms of minimum variance, and also the
maximum likelihood estimator when the measurement errors are distributed according to
the Normal (Gaussian) distribution.

However, measurement error is made up of random error and gross error. Gross errors are
often present in the measurements, and these large deviations are not accounted for by the
normal distribution. In this case, the least squares method can produce heavily biased
estimates. Attempts to deal with gross errors can be grouped into two classes. The first
includes methods that retain the least squares approach but apply additional
statistical tests to the residuals of either the constraints (which can be done
pre-reconciliation) or the measurements (which must be done post-reconciliation). The
drawback of these approaches is the need for a separate gross-error processing
step. More importantly, normality is still assumed for the data, while the data may
not be best represented by the Normal distribution. Furthermore, the statistical tests are
theoretically valid only for linear system/model equations, which is a rather constricting
restriction in chemical processes, where most relationships are nonlinear.

The second class of gross-error handling approaches comprises the more recent
methods that suppress gross errors by making use of the so-called robust estimators.
These estimators can suppress gross errors while performing reconciliation, so there is no
need for a separate procedure to remove the gross errors. Most of these approaches are
based on the concept of statistical robustness, and they can be further grouped into
parametric and non-parametric approaches. The parametric approach either represents the
data with a certain distribution that has thicker tails to account for gross errors, or uses a
certain form of estimator that does not assume normality and gives small weights to
largely deviating observations. The non-parametric group consists of estimators that do
not assume any fixed form of distribution, but instead adjust their form to the data
distribution through non-parametric density estimation. The resulting estimator will be
efficient as it is fitted to the data. However, these fully flexible estimators are sensitive to
the data size available for the preliminary fitting and often do not perform well for the data
sizes encountered in practice (Butler et al., 1990).

A strategy is proposed to improve the efficiency of parametric estimation by allowing
the parameters of the estimator to vary to suit the data. This is called partially adaptive
estimation. In this thesis, a robust, partially adaptive data reconciliation procedure using
the Generalized T (GT) distribution is studied and applied to the virtual version of a
chemical engineering pilot plant. The strategy is extended to joint DRPE, which gives
both parameter and variable estimates that are consistent with the system/model
equations.

1.2. Contribution

In this thesis, a robust and efficient strategy for joint data reconciliation and parameter
estimation is studied. The strategy makes use of the Generalized T (GT) distribution, a
robust and versatile general distribution family originally proposed in the statistics
literature by McDonald and Newey (1988). The GT distribution was first used in data
reconciliation by Wang and Romagnoli (2003) in a comparative case study.

In the present work, the strategy is extended to incorporate parameter estimation in the
joint data reconciliation and parameter estimation scheme. The properties of the GT-based
partially adaptive estimators are comprehensively studied through various
simulation cases.

A comprehensive literature review of data reconciliation and joint data reconciliation
– parameter estimation, and of the technical aspects associated with them, is conducted.

As an application case study, the virtual version of a real lab-scale general-purpose
chemical plant is developed in Matlab/Simulink. Besides simulation studies, steady-state
experimental data are also obtained from a trial run of the pilot plant. Full system
decomposition based on formal transformation methods and symbolic manipulation is
then conducted on the plant to facilitate accurate and complete estimation of the process
states and parameters by the joint data reconciliation – parameter estimation procedure.

1.3. Thesis Organization
This thesis is organized as follows. The theory on relevant topics in data reconciliation
and parameter estimation, along with existing work in the literature, is given in Chapter
2. Chapter 3 starts with an introduction to the Generalized T (GT) distribution and goes
on to describe the proposed GT-based joint data reconciliation – parameter estimation
strategy. Chapter 4 describes the application case study and gives an overview of some of
the settings used in the case studies of Chapter 5, where the results of the case studies are
presented and discussed in detail. The thesis is then concluded in Chapter 6.



CHAPTER 2: THEORY & LITERATURE REVIEW

2.1. Introduction
Data reconciliation aims to improve the accuracy of measurement data by enforcing data
consistency. It uses estimation theory and subjects the optimization to the model balance
equations. Two important estimation criteria are robustness and efficiency. In this
chapter, an introduction to data reconciliation is provided along with its important
aspects, including the incorporation of robustness and variable classification. Relevant to
the estimation criteria, the concept of robustness in the statistical sense and an approach to
improving estimation efficiency are presented. A section is also devoted to joint data
reconciliation and parameter estimation, a strategy that combines the two estimation
procedures to simultaneously obtain consistent variable and parameter estimates.

2.2. Data Reconciliation (DR)
Measurements always contain some form of error. Errors can be classified into random
errors and gross errors. Random errors are caused by natural fluctuations and variability
inherent in the process; they occur randomly and are typically small in magnitude. Gross
errors, on the other hand, are large in magnitude but occur less frequently; their
occurrence can be attributed to incorrect calibration or malfunction of instruments,
process leaks, and other unnatural causes.


In order to obtain objective knowledge of the actual state of the process, accurate data
must be used, requiring erroneous measurements to be corrected before use. Data
reconciliation is such an error correction technique: it makes use of simple, well-known
and indubitable process relationships that should be satisfied regardless of the
measurement accuracy, i.e. the multicomponent mass and energy balances (Romagnoli
and Sanchez, 2000).

The presence of errors in the measurements of process variables gives rise to discrepancies
in the mass and energy balances. Data reconciliation adjusts, or reconciles, the
measurements to obtain estimates of the corresponding process variables that are more
accurate and consistent with the process mass and energy balances. The measurements are
adjusted such that a certain optimality criterion on the error characteristics is
achieved. Mathematically, the general data reconciliation problem translates into the following
constrained optimization problem:

(reconciled variables) = arg min (optimality criterion)
subject to
(mass and energy balances; variable bounds)

To illustrate this more concretely, the data reconciliation problem using the weighted least
squares method is formulated in the following.

Denote by
y, an (m×1) vector of measurements;
x, an (m×1) vector of the corresponding true values of the variables with measurements y;
u, a (p×1) vector of unmeasured variables;

and g(x, u) = 0, the multicomponent mass and energy balance equations of the process; the
data reconciliation problem for the weighted least squares estimator can then be formulated
as:

$$
\begin{aligned}
\min_{x,\,u}\;& (y - x)^{T}\,\Psi^{-1}\,(y - x) \\
\text{s.t.}\;& g(x, u) = 0 \\
& x_L \le x \le x_U \\
& u_L \le u \le u_U
\end{aligned}
\tag{2.1}
$$

where Ψ is the measurement error covariance matrix, x_L and u_L are the lower bounds on x
and u, respectively, and x_U and u_U are the upper bounds on x and u, respectively.
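As a concrete illustration, the following minimal sketch solves a problem of the form (2.1) numerically with an SQP-type solver (scipy's SLSQP). The single mixing-node balance, the measured values and the standard deviations are hypothetical and are not taken from the case study of this thesis.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical example: a mixing node with total mass balance x1 + x2 - x3 = 0.
# Measurements y and their standard deviations are illustrative values only.
y = np.array([10.3, 4.8, 14.6])          # measured flows
sigma = np.array([0.2, 0.1, 0.3])        # measurement standard deviations
Psi_inv = np.diag(1.0 / sigma**2)        # inverse of the (diagonal) error covariance

def wls_objective(x):
    r = y - x
    return r @ Psi_inv @ r               # (y - x)^T Psi^{-1} (y - x)

constraints = [{"type": "eq", "fun": lambda x: x[0] + x[1] - x[2]}]  # g(x) = 0
bounds = [(0.0, None)] * 3               # lower bounds x_L = 0, no upper bounds

result = minimize(wls_objective, x0=y, method="SLSQP",
                  constraints=constraints, bounds=bounds)
x_reconciled = result.x                  # reconciled flows satisfy the balance exactly
print(x_reconciled, x_reconciled[0] + x_reconciled[1] - x_reconciled[2])
```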

Three features can be observed from the above formulation:
(1) Firstly, the objective function of the optimization is the square of the adjustments
made to the measurements y, weighted by the inverse of the error covariance matrix.
This corresponds to the weighted least squares estimator used in the problem. The
objective function of the data reconciliation optimization problem is in fact
determined by the estimator applied to the problem. The choice of estimator is, in
turn, usually dependent on the assumption made about the error characteristics. For
example, the use of weighted least squares reflects the assumption that the error is
small relative to its standard deviation, so that the measurement must lie
within very few standard deviations of the true value of the variable. In fact, if the
weighted least squares estimator is chosen based on maximum likelihood considerations,
the error is assumed to follow a multivariate normal distribution with mean zero and
covariance Ψ. To demonstrate this, consider the likelihood function of the multivariate
normal distribution

$$
f(\varepsilon) = (2\pi)^{-m/2}\,\det(\Psi)^{-1/2}\exp\!\left(-\tfrac{1}{2}\,\varepsilon^{T}\Psi^{-1}\varepsilon\right),
\tag{2.2}
$$

where ε = y − x; the maximum of f(ε) is obtained by minimizing ε^T Ψ^{-1} ε, i.e. the
weighted least square of the adjustments. As will be discussed in Section 2.4 on
robustness, the adequacy of the assumption regarding the error, and hence the choice
of estimator, plays an important role in ensuring the accuracy of the reconciled data in
all situations.

(2) Secondly, the constraint g(x,u) comprises the mass and energy balances of the
process. Together with the form of the objective function, the constraint equations
determine the difficulty of the DR optimization. If only total mass balances are
considered and the weighted least squares objective function is used, the optimization
problem has a quadratic objective function with linear constraints. This kind of
problem can be solved analytically, i.e. a closed-form solution can be obtained.
However, since using only mass balances limits the reconciliation to flow
measurements, component and energy balances are usually also considered. This
results in a nonlinear optimization problem for which an analytical solution usually does
not exist. Several optimization methods have been proposed for this case, including QR
orthogonal factorization for bilinear systems (Crowe, 1986), successive linearization
(Swartz, 1989), and nonlinear numerical optimization methods such as sequential
quadratic programming (Gill et al., 1986; Tjoa and Biegler, 1991). In this thesis,
sequential quadratic programming (SQP) is used, as it is not restricted to linear or
bilinear systems and is flexible with respect to the form of the objective function. Although
convexity is essential to guarantee convergence to the true optimum, the algorithm
also converges to satisfactory solutions for many non-convex problems, as will be
demonstrated by the estimation results in Chapter 5.

(3) Thirdly, the optimization problem involves measured and unmeasured variables, x
and u, respectively. A very important procedure must be carried out before formulating the
data reconciliation problem: variable classification. Given the
knowledge of which variables are measured, the balance equations can be analysed to
identify which measured variables are redundant or non-redundant, and which
unmeasured variables are determinable or undeterminable. The value of a
determinable variable can be determined from the values of the measured variables
through the model equations, whereas an undeterminable variable is not involved in
such equations, and hence its value cannot be determined from the values of other
variables. A redundant variable is a variable whose value can still be determined from
measurements of other variables through the model equations, even if its own
measurement is deleted. On the contrary, if the measurement of a non-redundant
variable is deleted, its value becomes undeterminable.

In the actual optimization step of data reconciliation, the decision variables consist of
only the measured variables, as adjustments can only be made to measurements. The
calculation of the determinable unmeasured variables is performed using the reconciled
values of the measured variables, and hence this calculation can only be carried out after
the optimization is completed. The problem in equation (2.1) is therefore decomposed into
two steps: the optimization to obtain the reconciled measurements, and the calculation of
the determinable variables.

Various methods have been proposed for this variable classification / problem
decomposition (Romagnoli and Stephanopoulos, 1980; Sanchez et al., 1992; Crowe,
1989; Joris and Kalitventzeff, 1987). However, any formal method is restricted to
linear or linearized model equations.
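For linear balance equations of the form A_x x + A_u u = 0, the classification can be illustrated with a simple projection-based check. The sketch below uses a hypothetical two-balance flow network; it is an illustration of the idea, not the formal method of any of the cited works.

```python
import numpy as np
from scipy.linalg import null_space

# Illustrative linear balances A_x x + A_u u = 0 (not the thesis's plant model).
# Rows are balance equations; columns of A_x are measured, columns of A_u unmeasured.
A_x = np.array([[1.0, -1.0, 0.0],
                [0.0,  1.0, -1.0]])
A_u = np.array([[0.0],
                [1.0]])

# Projecting onto the left null space of A_u removes the unmeasured variables,
# leaving the reduced (redundancy) equations (P @ A_x) x = 0.
P = null_space(A_u.T).T
reduced = P @ A_x

# A measured variable whose column vanishes in the reduced equations is non-redundant:
# deleting its measurement would leave it undeterminable.
redundant = [not np.allclose(reduced[:, j], 0) for j in range(A_x.shape[1])]

# An unmeasured variable is determinable if every null-space direction of A_u has a
# zero entry in its position (its value is fixed once the measured variables are known).
N_u = null_space(A_u)
determinable = [N_u.shape[1] == 0 or np.allclose(N_u[j, :], 0)
                for j in range(A_u.shape[1])]

print("redundant measured variables:    ", redundant)
print("determinable unmeasured variables:", determinable)
```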

2.3. Joint Data Reconciliation – Parameter Estimation (DRPE)
Parameters of a process are often not known and have to be estimated from the
measurements of the process variables. These estimated parameters are often important
for the design, evaluation, optimization and control of the process. As measurements of
process variables are corrupted by errors, the measurements are usually reconciled first
before being used for parameter estimation. This results in two separate estimation steps with
two sets of variable estimates, i.e. the reconciled data satisfying the process constraints
and the data fitted with the model parameter estimates. It is most likely that these two sets
of data are not identical, albeit representing the same physical quantities. In this thesis, the
two estimation steps corresponding to data reconciliation and parameter estimation are
merged into a single joint data reconciliation – parameter estimation (DRPE) step.



The problem formulation, taking the weighted least squares objective function as an
example, is as follows:

$$
\begin{aligned}
\min_{x,\,u,\,\theta}\;& (y - x)^{T}\,\Psi^{-1}\,(y - x) \\
\text{s.t.}\;& g(x, u, \theta) = 0 \\
& x_L \le x \le x_U \\
& u_L \le u \le u_U \\
& \theta_L \le \theta \le \theta_U
\end{aligned}
\tag{2.3}
$$

Here θ is the vector of model parameters to be estimated, and the meanings of the other
symbols are as in equation (2.1). It should, however, be noted that the vector of measurements
y may now contain non-redundant measured variables that appear in the equations for
estimating θ, which are now included in g. Because all measurements, both redundant and
non-redundant, are subject to the constraints, which now comprise both the data reconciliation
process constraints and the parameter estimation model equations, the resulting estimates of
both the variables and the model parameters are consistent with the whole set of constraints.
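As an illustration, with a hypothetical one-parameter model rather than the plant equations of this thesis, the joint problem (2.3) can be posed directly to an SQP solver with (x, θ) as the decision variables:

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical joint DRPE problem: two measured variables x = (x1, x2) related by a
# model x2 = theta * x1 with unknown parameter theta. All values are illustrative only.
y = np.array([5.1, 9.7])                   # noisy measurements of x1, x2
sigma = np.array([0.1, 0.2])
Psi_inv = np.diag(1.0 / sigma**2)

def objective(z):
    x = z[:2]                              # measured variables
    r = y - x
    return r @ Psi_inv @ r                 # theta (z[2]) enters only via the constraint

def model_constraint(z):
    x1, x2, theta = z
    return x2 - theta * x1                 # g(x, theta) = 0

z0 = np.concatenate([y, [9.7 / 5.1]])      # initial guess: measurements and a crude theta
result = minimize(objective, z0, method="SLSQP",
                  constraints=[{"type": "eq", "fun": model_constraint}],
                  bounds=[(0, None), (0, None), (0, None)])  # variable and parameter bounds
x_hat, theta_hat = result.x[:2], result.x[2]
print(x_hat, theta_hat)                    # reconciled x and estimated theta are consistent
```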

The joint data reconciliation – parameter estimation problem is also a general formulation of the
error-in-variables method (EVM) of parameter estimation, in which there is no distinction
between independent and dependent variables and all variables are subject to
measurement error. The main aspects of joint DRPE are the general algorithm for the
solution strategy, the optimization strategy, and the robustness of the estimation. The
optimization and robustness issues are similar to those in data reconciliation and are
therefore not repeated here: the optimization strategy was discussed in Section 2.2, and
estimation robustness is discussed in Section 2.4.

In the error-in-variables method (EVM), the need for an efficient general algorithm for the
solution strategy arises from the large optimization problem that results from the
aggregation of independent and dependent variables in the estimation. From the point of
view of data reconciliation, the computational complexity is due to the addition of the
non-redundant variables and the unknown model parameters to be estimated. The general
algorithms can be distinguished into three main approaches (Romagnoli and Sanchez,
2000):
(1) Simultaneous solution methods
This is the most straightforward approach, i.e. solving the joint estimation of
variables and model parameters simultaneously. It relies on an efficient
optimization method able to handle large-scale problems with a large number of decision
variables: with p model parameters to be estimated and N sets of measurements of m
variables, the number of decision variables is (p + Nm).

(2) Nested EVM
Reilly and Patino-Leal (1981) were the first to propose the idea of nested EVM, i.e.
decoupling the parameter estimation problem from the data reconciliation problem,
with the data reconciliation problem solved at each iteration of the parameter
estimation problem. While they used successive linearization for the constraints, Kim
et al. (1990) later replaced the linearization with more general nonlinear
programming. The algorithm due to Kim et al. is as follows (Romagnoli and Sanchez,
2000):

Step 1: At the first iteration, set x = y and θ = θ_0 (the initial guess of the parameters).

Step 2: Find the minimum of the nested problem for x and θ:

$$
\min_{\theta}\; (y - x^{*}(\theta))^{T}\,\Psi^{-1}\,(y - x^{*}(\theta))
\quad \text{s.t.} \quad \theta_L \le \theta \le \theta_U,
$$

where, for each trial value of θ, x*(θ) solves the inner data reconciliation problem

$$
\min_{x}\; (y - x)^{T}\,\Psi^{-1}\,(y - x)
\quad \text{s.t.} \quad g(x, \theta) = 0,\;\; x_L \le x \le x_U.
$$

The size of the outer optimization problem is thus reduced to the same order as the number
of parameters to be estimated.
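The following is a minimal sketch of this nested structure, assuming a single unknown parameter and the same hypothetical model constraint as in the earlier sketches; it illustrates the idea rather than Kim et al.'s implementation.

```python
import numpy as np
from scipy.optimize import minimize, minimize_scalar

# Sketch of the nested EVM idea: the outer search is over theta alone, and every
# evaluation solves an inner data reconciliation for x with theta fixed.
# The model x2 = theta * x1 and all numbers are illustrative, not from the thesis.
y = np.array([5.1, 9.7])
Psi_inv = np.diag(1.0 / np.array([0.1, 0.2])**2)

def inner_dr(theta):
    """Reconcile x subject to g(x, theta) = 0 for fixed theta; return objective and x."""
    cons = [{"type": "eq", "fun": lambda x: x[1] - theta * x[0]}]
    res = minimize(lambda x: (y - x) @ Psi_inv @ (y - x), y, method="SLSQP",
                   constraints=cons)
    return res.fun, res.x

# Outer problem: minimize the reconciled objective as a function of theta only, so the
# search space has the dimension of the parameter vector (here, 1).
outer = minimize_scalar(lambda theta: inner_dr(theta)[0],
                        bounds=(0.5, 5.0), method="bounded")
theta_hat = outer.x
_, x_hat = inner_dr(theta_hat)
print(theta_hat, x_hat)
```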

(3) Two-stage EVM
Valko and Vajda (1987) proposed an algorithm in which the data reconciliation and
parameter estimation steps are essentially also decoupled. By linearizing the
constraints and solving the weighted least squares optimization analytically, they use
the analytical solution to manipulate the problem so that it can be reformulated into
two optimization stages: the first stage optimizes the model parameters, keeping the
variable estimates fixed at the results of the previous iteration, and the second
optimizes the variable estimates, keeping the parameter values fixed at the values
obtained in the preceding stage. The resulting algorithm is compact yet flexible.
However, the ability to decouple the problem into two stages depends heavily on
whether the optimization problem can be solved analytically; it is therefore restricted
to only a few very simple estimators, such as the weighted least squares, as
illustrated by the sketch below.
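To illustrate the kind of analytical solution this decoupling relies on, the following sketch implements the closed-form weighted least squares reconciliation for linear(ized) constraints A x = b. The splitter balance and the data are hypothetical; this is the reconciliation step only, not Valko and Vajda's full algorithm.

```python
import numpy as np

# Closed-form WLS reconciliation for linear constraints: the two-stage decoupling is
# attractive precisely because this step needs no iterative optimizer.
def wls_reconcile_linear(y, Psi, A, b):
    """Closed-form solution of min (y-x)' Psi^{-1} (y-x) subject to A x = b."""
    K = Psi @ A.T @ np.linalg.inv(A @ Psi @ A.T)
    return y - K @ (A @ y - b)

A = np.array([[1.0, -1.0, -1.0]])          # a single splitter balance: x1 = x2 + x3
b = np.array([0.0])
y = np.array([10.3, 4.8, 5.9])             # inconsistent measurements (4.8 + 5.9 != 10.3)
Psi = np.diag([0.04, 0.01, 0.09])          # measurement error covariance

x_hat = wls_reconcile_linear(y, Psi, A, b)
print(x_hat, A @ x_hat - b)                # reconciled flows satisfy the balance exactly
```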

2.4. Robust Estimation
The conventional and most prevalent form of estimator is the weighted least squares
formulation. It was shown in Section 2.2 that, if maximum likelihood estimation is
considered, the weighted least squares estimates are the maximum likelihood estimates
when the measurement errors follow the multivariate normal (Gaussian) distribution in
equation (2.2). However, the normality assumption is rather restrictive; it assumes that a
measurement will lie within a small range around the true variable value, that is, that the
error consists only of the natural variability of the measurement process. The presence of
gross errors, whose magnitudes are considerably large compared to the standard deviation
of the assumed normal distribution, presents the risk of the weighted least squares estimates
becoming heavily biased. Besides, the natural variability of the measurement process may
also be better characterized by distributions other than the normal.



The usual approach to dealing with departures from normality is to detect the presence of
gross errors through statistical tests. Some typical gross error detection schemes are the
global test, nodal test and measurement test (Serth and Heenan, 1986; Narasimhan and
Mah, 1987). The global and nodal tests are based on the residuals of the model constraints,
while the measurement test is based on Neyman-Pearson hypothesis testing using the residuals
between the measurements and the estimates (Wang, 2003). The limitations of these
complementary tests are as follows. Firstly, the statistical tests are still based on the
assumption that the error is normally distributed; they detect deviations from the normal
distribution. When the empirical character of the error differs significantly from the
normal distribution, the results of the statistical tests might be very misleading. The
second limitation is the restriction of the tests to linear or linearized constraint equations.
In a practical setting, the model equations are usually nonlinear, and linearization will
introduce approximation errors that confound the statistical gross error tests.

An alternative approach to dealing with gross errors is to reformulate the objective function
to account for the presence of gross errors from the outset, such that gross
errors are suppressed while the estimation is performed. In this case, there is no need
for a separate procedure, such as the previously mentioned statistical tests, to recover from
gross errors. A seminal work in this direction is the maximum likelihood
estimator based on the contaminated normal density of Tjoa and Biegler
(1991). Instead of assuming a purely normal error distribution, Tjoa and Biegler combined
two normal distributions: a narrow Gaussian with the same standard deviation as that used in
the weighted least squares method, and a wide Gaussian with a considerably larger
standard deviation to represent gross errors. The density function of the contaminated
normal distribution can be expressed as

$$
f(u) = (1 - p)\,\frac{1}{\sqrt{2\pi}\,\sigma}\exp\!\left(-\frac{u^{2}}{2\sigma^{2}}\right)
     + p\,\frac{1}{\sqrt{2\pi}\,b\sigma}\exp\!\left(-\frac{u^{2}}{2b^{2}\sigma^{2}}\right)
\tag{2.4}
$$

where u is the measurement residual, p the probability of gross errors, b the ratio of the
standard deviation of the wide Gaussian to that of the narrow one, and σ the standard deviation
of the narrow Gaussian. As illustrated in Figure 2.1, this distribution has heavier tails
than the uncontaminated normal distribution, which means that it allows for the
possibility of gross errors occurring (i.e. with probability p, as in equation (2.4)). In their
paper, Tjoa and Biegler showed that the estimator is able to detect gross errors and to
recover from them in most of the cases studied.
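A minimal sketch of how the contaminated normal density (2.4) might be used as a robust reconciliation objective is given below; the values of p, b and σ, the mass balance and the data are illustrative, and the setup is not taken from Tjoa and Biegler's paper.

```python
import numpy as np
from scipy.optimize import minimize

# Robust reconciliation objective built from the contaminated normal density (2.4),
# used as a maximum likelihood criterion in place of weighted least squares.
p, b = 0.1, 10.0                            # gross error probability and width ratio
sigma = np.array([0.2, 0.1, 0.3])
y = np.array([10.3, 4.8, 17.0])             # third measurement carries a gross error

def contaminated_normal_nll(x):
    u = y - x                               # measurement residuals
    narrow = (1 - p) / (np.sqrt(2 * np.pi) * sigma) * np.exp(-u**2 / (2 * sigma**2))
    wide = p / (np.sqrt(2 * np.pi) * b * sigma) * np.exp(-u**2 / (2 * b**2 * sigma**2))
    return -np.sum(np.log(narrow + wide))   # negative log-likelihood over all residuals

cons = [{"type": "eq", "fun": lambda x: x[0] + x[1] - x[2]}]   # mass balance
res = minimize(contaminated_normal_nll, y, method="SLSQP", constraints=cons)
print(res.x)   # most of the adjustment tends to fall on the outlying measurement
```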

To study the robustness of an estimator, a unifying theoretical framework has been
proposed by Huber (1981) and Hampel et al. (1986). The analysis based on Hampel et al.'s
influence function (IF) is adopted here. To simplify the presentation, the derivation of the
formulae is omitted; details can be found in Hampel et al. (1986). The influence
function describes the behaviour of an estimator in the neighbourhood of the
parametric distribution assumed by the estimator. If the residual u is drawn from a
distribution with density f(u), and if T[f(u)] is the unbiased estimate corresponding to u,
then the influence function of a residual u_0 is given by