
VIETNAM ACADEMY OF SCIENCE AND TECHNOLOGY
INSTITUTE OF MATHEMATICS
Hoang Ngoc Tuan
DC ALGORITHMS AND
NONCONVEX QUADRATIC PROGRAMMING
Speciality: Applied Mathematics
Speciality code: 62 46 01 12
SUMMARY OF PH.D. DISSERTATION
SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS
FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY IN MATHEMATICS
Supervisor:
Prof. Dr. Hab. Nguyen Dong Yen
HANOI - 2015
The dissertation was written on the basis of the author's research works carried out at the Institute of Mathematics, Vietnam Academy of Science and Technology.
Supervisor: Prof. Dr. Hab. Nguyen Dong Yen
First referee:

Second referee:

Third referee:

To be defended at the Jury of Institute of Mathematics, Vietnam Academy
of Science and Technology:


on , at o’clock
The dissertation is publicly available at:
• The National Library of Vietnam


• The Library of Institute of Mathematics
Introduction
Convex functions have many nice properties. For instance, a convex function, say ϕ : R^n → R, is continuous, directionally differentiable, and locally Lipschitz at any point u ∈ R^n. In addition, ϕ is Fréchet differentiable almost everywhere on R^n, i.e., the set of points where the gradient ∇ϕ(x) exists is of full Lebesgue measure.
It is also known that the subdifferential
∂ϕ(u) := {x* ∈ R^n : ⟨x*, x − u⟩ ≤ ϕ(x) − ϕ(u) ∀x ∈ R^n}
of a convex function ϕ : R^n → R ∪ {+∞} at
u ∈ dom ϕ := {x ∈ R^n : ϕ(x) < +∞}
is a closed, convex set. If x ∉ dom ϕ then one puts ∂ϕ(x) = ∅. The Fermat Rule for convex optimization problems asserts that x̄ ∈ R^n is a solution of the minimization problem
min{ϕ(x) : x ∈ R^n}
if and only if 0 ∈ ∂ϕ(x̄).
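As a quick numerical illustration (our own example, not from the dissertation), take the convex function ϕ(x) = |x| on R. Its subdifferential at u = 0 is the interval [−1, 1]; since 0 lies in this set, the Fermat Rule confirms that x̄ = 0 is a global minimizer:

```python
import numpy as np

# Hedged sketch: phi(x) = |x| is convex with subdifferential [-1, 1] at u = 0.
# Every s in [-1, 1] satisfies the subgradient inequality
#   s * (x - 0) <= phi(x) - phi(0)  for all x,
# and 0 is in [-1, 1], so x_bar = 0 minimizes phi (Fermat Rule).
phi = abs
grid = np.linspace(-5.0, 5.0, 201)
for s in np.linspace(-1.0, 1.0, 9):          # candidate subgradients at u = 0
    assert all(s * x <= phi(x) + 1e-12 for x in grid)
# A slope outside [-1, 1] violates the inequality at some x:
assert any(1.5 * x > phi(x) for x in grid)
```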
Convex analysis is a powerful machinery for dealing with convex optimization problems. Note that convex programming is an important branch of optimization theory, one which continues to draw the attention of many researchers worldwide.
If f : R^n → R is a given function, and if there exist convex functions g : R^n → R and h : R^n → R such that f(x) = g(x) − h(x) for every x ∈ R^n, then f is called a d.c. function. The abbreviation "d.c." comes from the phrase "Difference (of) Convex functions". More generally, a function f : R^n → R̄, where R̄ = R ∪ {±∞}, is said to be a d.c. function if there are lower semicontinuous, proper, convex functions g, h : R^n → R̄ such that f(x) = g(x) − h(x) for all x ∈ R^n. The convention (+∞) − (+∞) = +∞ is used here. Despite their (possible) nonconvexity, d.c. functions still enjoy some good properties of convex functions.
A minimization problem with a geometrical constraint
min{f(x) = g(x) − h(x) : x ∈ C},   (0.1)
where f, g and h are given as above, and C ⊂ R^n is a nonempty closed convex set, is a typical DC programming problem. Setting
f̃(x) = (g(x) + δ_C(x)) − h(x),
where δ_C, with δ_C(x) = 0 for all x ∈ C and δ_C(x) = +∞ for all x ∉ C, is the indicator function of the set C, one can easily transform (0.1) into
min{f̃(x) : x ∈ R^n},   (0.2)
which is an unconstrained DC programming problem, with f̃(x) being a d.c. function.
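A standard way to exhibit such a decomposition for an indefinite quadratic (a sketch with illustrative data, not taken from the text) is to split f(x) = ½xᵀAx + bᵀx as f = g − h with g(x) = (ρ/2)‖x‖² + bᵀx and h(x) = ½xᵀ(ρI − A)x; both components are convex as soon as ρ is at least the largest eigenvalue of A:

```python
import numpy as np

# A d.c. decomposition of an indefinite quadratic (illustrative data):
#   f(x) = 1/2 x'Ax + b'x = g(x) - h(x), with
#   g(x) = rho/2 ||x||^2 + b'x   (convex for rho >= 0),
#   h(x) = 1/2 x'(rho*I - A)x    (convex when rho >= lambda_max(A)).
A = np.array([[1.0, 2.0], [2.0, -3.0]])      # indefinite symmetric matrix
b = np.array([1.0, -1.0])
rho = np.linalg.eigvalsh(A)[-1]               # largest eigenvalue of A
H = rho * np.eye(2) - A                       # Hessian of h
assert np.linalg.eigvalsh(H)[0] >= -1e-12     # h is convex
x = np.array([0.7, -2.1])
f = 0.5 * x @ A @ x + b @ x
g = 0.5 * rho * x @ x + b @ x
h = 0.5 * x @ H @ x
assert abs(f - (g - h)) < 1e-10               # f = g - h pointwise
```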
DC programming and DC algorithms (DCA, for brevity) treat the problem of minimizing a function f = g − h, with g, h being lower semicontinuous, proper, convex functions on R^n, on the whole space. Usually, g and h are called d.c. components of f. The DCA are constructed on the basis of DC programming theory and the duality theory of J. F. Toland. It was Pham Dinh Tao who suggested a general DCA theory, which has been developed intensively by him and Le Thi Hoai An starting from their fundamental paper "Convex analysis approach to D.C. programming: Theory, algorithms and applications" (Acta Mathematica Vietnamica, Vol. 22, 1997).
Note that DC programming is among the most successful convex analysis approaches to nonconvex programming. One wishes to make an extension of convex programming that is not too wide, so that the powerful tools of convex analysis and convex optimization can still be used, but sufficiently large to cover the most important nonconvex optimization problems. The set of d.c. functions, which is closed under the basic operations usually considered in optimization, serves this purpose well. Note that the convexity of the two components of the objective function has been employed widely in DC programming to obtain essential theoretical results and to construct efficient solution methods. The DC duality scheme of J. F. Toland is an example of such essential theoretical results. To be more precise, Toland's Duality Theorem asserts that, under mild conditions, the dual problem of a DC program is also a DC program, and the two problems have the same optimal value.
Due to their local character, DCA (i.e., DC algorithms) do not ensure the convergence of an iteration sequence to a global solution of the problem in question. However, with the help of a restart procedure, DCA applied to trust-region subproblems can yield a global solution of the problem. In practice, DCA have been successfully applied to many different nonconvex optimization problems, for which they have proved to be more robust and efficient than many standard methods; in particular, DCA work well for large-scale problems. Note also that, with appropriate decompositions of the objective functions, DCA can generate several standard algorithms in convex and nonconvex programming.
This dissertation studies the DCA applied to the minimization of a quadratic function on a Euclidean ball (the so-called trust-region subproblem) and to the minimization of a quadratic function on a polyhedral convex set. These problems play important roles in optimization theory.
Let A ∈ R^{n×n} be a symmetric matrix, b ∈ R^n be a given vector, and r > 0 be a real number. The nonconvex quadratic programming problem with convex constraints
min{f(x) := (1/2)x^T Ax + b^T x : ‖x‖² ≤ r²},
where ‖x‖ = (Σ_{i=1}^n x_i²)^{1/2} denotes the Euclidean norm of x = (x_1, . . . , x_n)^T ∈ R^n and ^T means matrix transposition, is called the trust-region subproblem.
One encounters this problem while applying the trust-region method (see, e.g., A. R. Conn, N. I. M. Gould, and P. L. Toint, "Trust-Region Methods", 2000) to solve the unconstrained problem of finding the minimum of a C²-smooth function ϕ : R^n → R. Having an approximate solution x^k at step k of the trust-region method, to get a better approximate solution x^{k+1} one finds the minimum of ϕ on a ball with center x^k and a radius depending on a ratio defined by some calculations on ϕ and the point x^k. If one replaces ϕ with its second-order Taylor expansion around x^k, an auxiliary problem of the form of the trust-region subproblem appears, and x^{k+1} is a solution of this problem.
Consider a quadratic programming problem under linear constraints of the form
min{f(x) := (1/2)x^T Qx + q^T x : Dx ≥ d},
where Q ∈ R^{n×n} and D ∈ R^{m×n} are given matrices, and q ∈ R^n and d ∈ R^m are given vectors. It is assumed that Q is symmetric. This class of optimization problems is well known and has various applications. Basic qualitative properties related to the solution existence, the structure of the solution set, first-order necessary and sufficient optimality conditions, second-order necessary and sufficient optimality conditions, stability, and differential stability of the problem can be found in the books of B. Bank, J. Guddat, D. Klatte, B. Kummer, and K. Tammer, "Non-Linear Parametric Optimization" (1982), R. W. Cottle, J.-S. Pang, and R. E. Stone, "The Linear Complementarity Problem" (1992), G. M. Lee, N. N. Tam, and N. D. Yen, "Quadratic Programming and Affine Variational Inequalities: A Qualitative Study" (2005), and the references therein.
The structure of the solution set and of the Karush-Kuhn-Tucker point set of this problem is far different from that of the trust-region subproblem, since the constraint set of the trust-region subproblem is convex, compact, and has a smooth boundary.
Our aim is to study the convergence and the convergence rate of DCA applied to the two problems just mentioned. An open question and a conjecture raised in the two papers by H. A. Le Thi, T. Pham Dinh, and N. D. Yen (J. Global Optim., Vol. 49, 2011, pp. 481–495, and Vol. 53, 2012, pp. 317–329) will be completely solved in this dissertation.
By using some advanced tools, we are able to obtain complete results on the convergence of DCA sequences. Moreover, convergence rates of DCA sequences are established for the first time in this dissertation.
The results of this dissertation complement and develop the corresponding published results of T. Pham Dinh and H. A. Le Thi (SIAM J. Optim., Vol. 8, 1998), T. Pham Dinh, H. A. Le Thi, and F. Akoa (Optim. Methods Softw., Vol. 23, 2008), and H. A. Le Thi, T. Pham Dinh, and N. D. Yen (J. Global Optim., Vol. 49, 2011; Vol. 53, 2012).
The dissertation has three chapters and a list of references.
Chapter 1 “Preliminaries” presents basic concepts and results of a general
theory on DC programming and DCA.
Chapter 2 "Minimization of a Quadratic Function on an Euclidean Ball" considers an application of DCA to trust-region subproblems. Here we present in detail a useful restart procedure that allows the algorithm to find a global solution. We also give an answer in the affirmative to the question raised by H. A. Le Thi, T. Pham Dinh, and N. D. Yen (2012) about the convergence of DCA. Furthermore, the convergence rate of DCA is studied.
Chapter 3 "Minimization of a Quadratic Function on a Polyhedral Convex Set" investigates an application of DCA to indefinite quadratic programs under linear constraints. Here we solve in the affirmative a conjecture raised by H. A. Le Thi, T. Pham Dinh, and N. D. Yen (2011) about the boundedness of the DCA sequences. At first, by a direct proof, we obtain the boundedness of the DCA sequences for quadratic programs in R². Then, by using some error bounds for affine variational inequalities, we establish the R-linear convergence rate of the algorithm, hence giving a complete solution to the conjecture.
The results of Chapter 2 were published in Journal of Global Optimization [5] (a joint work with N. D. Yen) and in Journal of Optimization Theory and Applications [2]. Chapter 3 is written on the basis of the paper [3], which was published in Journal of Optimization Theory and Applications, and of the paper [4], which was published in Journal of Mathematical Analysis and Applications.
The above results were reported by the author of this dissertation at the Seminar of the Department of Numerical Analysis and Scientific Computing of the Hanoi Institute of Mathematics, The 8th Vietnam-Korea Workshop "Mathematical Optimization Theory and Applications" (University of Dalat, December 8-10, 2011), The 5th International Conference on High Performance Scientific Computing (March 5-9, 2012, Hanoi, Vietnam), The Joint Congress of the French Mathematical Society (SMF) and the Vietnamese Mathematical Society (VMS) (University of Hue, August 20-24, 2012), The 8th Vietnamese Mathematical Conference (Nha Trang, August 10-14, 2013), and The 12th Workshop on Optimization and Scientific Computing (Ba Vi, April 23-25, 2014).
Chapter 1
Preliminaries
This chapter reviews some background material on DC algorithms. For more details, we refer to H. A. Le Thi and T. Pham Dinh's papers (1997, 1998), H. N. Tuan's Master dissertation ("DC Algorithms and Applications in Quadratic Programming", Hanoi, 2010), and the references therein.

1.1 Toland's Duality Theorem for DC Programs and Iteration Algorithms

Consider the space R^n equipped with the canonical inner product ⟨·, ·⟩. Then the dual space of R^n can be identified with R^n. A function θ : R^n → R̄ is said to be proper if it does not take the value −∞ and it is not identically equal to +∞, i.e., there is some x ∈ R^n with θ(x) ∈ R. The effective domain of θ is defined by
dom θ := {x ∈ R^n : θ(x) < +∞}.
Let Γ₀(R^n) be the set of all lower semicontinuous, proper, convex functions on R^n. The conjugate function g* of a function g ∈ Γ₀(R^n) is defined by
g*(y) = sup{⟨x, y⟩ − g(x) : x ∈ R^n}  ∀y ∈ R^n.
Note that g* : R^n → R̄ is also a lower semicontinuous, proper, convex function. In the sequel, we use the convention (+∞) − (+∞) = +∞.
Definition 1.1 The optimization problem
inf{f(x) := g(x) − h(x) : x ∈ R^n},   (P)
where g and h are functions belonging to Γ₀(R^n), is called a DC program. The functions g and h are called d.c. components of f.

Definition 1.2 For any g, h ∈ Γ₀(R^n), the DC program
inf{h*(y) − g*(y) : y ∈ R^n},   (D)
is called the dual problem of (P).

Proposition 1.1 (Toland's Duality Theorem) The DC programs (P) and (D) have the same optimal value.
Definition 1.3 A vector x* ∈ R^n is said to be a local solution of (P) if f(x*) = g(x*) − h(x*) is finite (i.e., x* ∈ dom g ∩ dom h) and there exists a neighborhood U of x* such that
g(x*) − h(x*) ≤ g(x) − h(x), ∀x ∈ U.
If we can choose U = R^n, then x* is called a (global) solution of (P).

Proposition 1.2 (First-order optimality condition) If x* is a local solution of (P), then ∂h(x*) ⊂ ∂g(x*).

Definition 1.4 A point x* ∈ R^n satisfying ∂h(x*) ⊂ ∂g(x*) is called a stationary point of (P).

Definition 1.5 A point x* ∈ R^n is said to be a critical point of (P) if ∂g(x*) ∩ ∂h(x*) ≠ ∅.

If ∂h(x*) ≠ ∅ and x* is a stationary point of (P), then x* is a critical point of (P). The reverse implication does not hold in general.
The idea of the theory of DC algorithms (DCA for brevity) is to construct two sequences {x^k} and {y^k} (approximate solution sequences of (P) and (D), respectively) in an appropriate way such that:
(i) The sequences {(g − h)(x^k)} and {(h* − g*)(y^k)} are decreasing;
(ii) Every cluster point x* of {x^k} (resp., y* of {y^k}) is a critical point of (P) (resp., of (D)).
The general DC algorithm by H. A. Le Thi and T. Pham Dinh (1997) is formulated as follows.
DCA
• Choose x^0 ∈ dom g.
• For each k ≥ 0, if x^k has been defined, select a vector
y^k ∈ ∂h(x^k).   (1.1)
• For each k ≥ 0, if y^k has been defined, select a vector
x^{k+1} ∈ ∂g*(y^k).   (1.2)
It can be shown that y^k satisfies (1.1) if and only if it is a solution of the convex program
min{h*(y) − ⟨x^k, y⟩ : y ∈ R^n}.   (D_k)
Similarly, x^{k+1} satisfies (1.2) if and only if it is a solution of the convex program
min{g(x) − ⟨x, y^k⟩ : x ∈ R^n}.   (P_k)
A simplified version of DCA, with a termination procedure added, can be stated as follows:
• Step 1. Choose x^0 ∈ dom g. Take ε ≥ 0. Put k = 0.
• Step 2. Solve (D_k) by some algorithm from convex programming to get a solution y^k. Solve (P_k) by some algorithm from convex programming to get a solution x^{k+1}. Check the condition ‖x^{k+1} − x^k‖ < ε. If it is satisfied, then terminate the computation; otherwise, go to Step 3.
• Step 3. Increase k by 1 and return to Step 2.
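When ∂h and ∂g* are available in closed form, the two subproblems (D_k) and (P_k) collapse to a two-line loop. A toy sketch (our own choice of d.c. components, not from the text): take g(x) = x⁴/4 and h(x) = x²/2 on R, so that f = g − h has stationary points 0 and ±1; step (1.1) gives y^k = h′(x^k) = x^k, and step (1.2) gives x^{k+1} = (y^k)^{1/3} from solving g′(x) = y^k:

```python
import numpy as np

# Toy DCA instance (illustrative d.c. components, not from the text):
#   f(x) = g(x) - h(x), g(x) = x**4/4, h(x) = x**2/2 (both convex on R).
# (1.1): y_k = h'(x_k) = x_k       (h is smooth, so its gradient is the
#                                   only subgradient)
# (1.2): x_{k+1} solves g'(x) = y_k, i.e. x_{k+1} = cbrt(y_k).
def dca_toy(x0, iters=200):
    x = x0
    for _ in range(iters):
        y = x                      # y^k in grad h(x^k)
        x = np.cbrt(y)             # x^{k+1} in grad g*(y^k)
    return x
```

Starting from x0 = 2.0 the iterates decrease f and approach the stationary point x* = 1 of f(x) = x⁴/4 − x²/2, matching the convergence behavior described above.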
1.2 General Convergence Theorem

Definition 1.6 Let ρ ≥ 0 and let C be a convex set in the space R^n. A function θ : C → R ∪ {+∞} is called ρ-convex if
θ(λx + (1 − λ)x′) ≤ λθ(x) + (1 − λ)θ(x′) − (λ(1 − λ)/2)ρ‖x − x′‖²
for all numbers λ ∈ (0, 1) and vectors x, x′ ∈ C. This amounts to saying that the function θ(·) − (ρ/2)‖·‖² is convex on C.

Definition 1.7 The modulus of convexity of θ on C is given by
ρ(θ, C) = sup{ρ ≥ 0 : θ − (ρ/2)‖·‖² is convex on C}.
If C = R^n then we write ρ(θ) instead of ρ(θ, C). The function θ is called strongly convex on C if ρ(θ, C) > 0.
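For a quadratic θ(x) = ½xᵀAx with A symmetric, the function θ − (ρ/2)‖·‖² has Hessian A − ρI, so the modulus of convexity on R^n is max(λ_min(A), 0). A small numeric sketch (illustrative data, not from the text):

```python
import numpy as np

# For theta(x) = 1/2 x'Ax (A symmetric), theta - (rho/2)||.||^2 has
# Hessian A - rho*I, which is positive semidefinite exactly when
# rho <= lambda_min(A). Hence rho(theta) = max(lambda_min(A), 0).
A = np.array([[3.0, 1.0], [1.0, 2.0]])           # illustrative data
lam_min = np.linalg.eigvalsh(A)[0]               # ~1.38 here
def is_convex_after_shift(rho):
    return np.linalg.eigvalsh(A - rho * np.eye(2))[0] >= -1e-12
assert is_convex_after_shift(lam_min)            # shift by lambda_min is still convex
assert not is_convex_after_shift(lam_min + 0.1)  # a larger shift destroys convexity
```

Since lam_min > 0 here, this θ is strongly convex on R^n in the sense of Definition 1.7.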
Consider the problem (P). If ρ(g) > 0 (resp., ρ(g*) > 0), let ρ₁ (resp., ρ₁*) be a real number such that 0 ≤ ρ₁ < ρ(g) (resp., 0 ≤ ρ₁* < ρ(g*)). If ρ(g) = 0 (resp., ρ(g*) = 0), let ρ₁ = 0 (resp., ρ₁* = 0). If ρ(h) > 0 (resp., ρ(h*) > 0), let ρ₂ (resp., ρ₂*) be a real number such that 0 ≤ ρ₂ < ρ(h) (resp., 0 ≤ ρ₂* < ρ(h*)). If ρ(h) = 0 (resp., ρ(h*) = 0), let ρ₂ = 0 (resp., ρ₂* = 0).
One adopts the abbreviations dx^k = x^{k+1} − x^k and dy^k = y^{k+1} − y^k. Let
α := inf{f(x) = g(x) − h(x) : x ∈ R^n}.
Theorem 1.1 Assume that {x^k} and {y^k} are generated by the DCA. We have
(i) (g − h)(x^{k+1}) ≤ (h* − g*)(y^k) − max{(ρ₂/2)‖dx^k‖², (ρ₂*/2)‖dy^k‖²}
≤ (g − h)(x^k) − max{((ρ₁ + ρ₂)/2)‖dx^k‖², (ρ₁*/2)‖dy^{k−1}‖² + (ρ₂/2)‖dx^k‖², (ρ₁*/2)‖dy^{k−1}‖² + (ρ₂*/2)‖dy^k‖²};
(ii) (h* − g*)(y^{k+1}) ≤ (g − h)(x^{k+1}) − max{(ρ₁/2)‖dx^{k+1}‖², (ρ₁*/2)‖dy^k‖²}
≤ (h* − g*)(y^k) − max{((ρ₁* + ρ₂*)/2)‖dy^k‖², (ρ₁/2)‖dx^{k+1}‖² + (ρ₂/2)‖dx^k‖², (ρ₁*/2)‖dy^k‖² + (ρ₂/2)‖dx^k‖²};
(iii) If α is finite, then {(g − h)(x^k)} and {(h* − g*)(y^k)} are decreasing sequences that converge to the same limit β ≥ α. Furthermore,
(a) If ρ(g) + ρ(h) > 0 (resp., ρ(g*) + ρ(h*) > 0), then lim_{k→∞}(x^{k+1} − x^k) = 0 (resp., lim_{k→∞}(y^{k+1} − y^k) = 0);
(b) lim_{k→∞}[g(x^k) + g*(y^k) − ⟨x^k, y^k⟩] = 0;
(c) lim_{k→∞}[h(x^{k+1}) + h*(y^k) − ⟨x^{k+1}, y^k⟩] = 0.
(iv) If α is finite, and {x^k} and {y^k} are bounded, then for every cluster point x* of {x^k} (resp., y* of {y^k}), there is a cluster point y* of {y^k} (resp., x* of {x^k}) such that:
(d) (x*, y*) ∈ [∂g*(y*) ∩ ∂h*(y*)] × [∂g(x*) ∩ ∂h(x*)],
(e) (g − h)(x*) = (h* − g*)(y*) = β,
(f) lim_{k→∞}[g(x^k) + g*(y^k)] = lim_{k→∞}⟨x^k, y^k⟩.
Chapter 2
Minimization of a Quadratic Function on an Euclidean Ball
In this chapter, we prove that any DCA sequence constructed by the Pham Dinh–Le Thi algorithm for the trust-region subproblem converges to a Karush-Kuhn-Tucker point. We also obtain sufficient conditions for the Q-linear convergence of DCA sequences. In addition, we give two examples to show that, if the sufficient conditions are not satisfied, then the sequences may not be Q-linearly convergent.
This chapter is written on the basis of the papers [2, 5]. A part of the
results from [4] is used in the final section of this chapter.
2.1 The Trust-Region Subproblem
Let A ∈ R^{n×n} be a symmetric matrix, b ∈ R^n be a given vector, and r > 0 be a real number. The trust-region subproblem corresponding to the triple {A, b, r} is the optimization problem
min{f(x) := (1/2)x^T Ax + b^T x : ‖x‖² ≤ r²}.   (2.1)
It is well known that if x ∈ E := {x ∈ R^n : ‖x‖ ≤ r} is a local minimum of (2.1), then there exists a unique Lagrange multiplier λ ≥ 0 such that
(A + λI)x = −b,  λ(‖x‖ − r) = 0,   (2.2)
where I denotes the n × n unit matrix. If x ∈ E and there exists λ ≥ 0 satisfying (2.2), then x is said to be a Karush-Kuhn-Tucker point (or a KKT point) of (2.1).
2.2 Solving the Trust-Region Subproblem by a DC Algorithm
2.2.1 The Pham Dinh–Le Thi Algorithm
Applying the DCA presented in Chapter 1 to (2.1), we have the following iterative algorithm:
1. Choose ρ > 0 so that ρI − A is a positive semidefinite matrix.
2. Fix an initial point x^0 ∈ R^n and a constant ε ≥ 0 (a tolerance). Set k = 0.
3. If
‖ρ⁻¹((ρI − A)x^k − b)‖ ≤ r,   (2.3)
then take
x^{k+1} = ρ⁻¹((ρI − A)x^k − b).   (2.4)
Otherwise, set
x^{k+1} = r‖(ρI − A)x^k − b‖⁻¹((ρI − A)x^k − b).   (2.5)
4. If ‖x^{k+1} − x^k‖ < ε, then terminate the computation. Otherwise, increase k by 1 and resume the test (2.3).
For ε = 0, the above algorithm generates an infinite sequence {x^k}_{k≥0}, called a DCA sequence.
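In NumPy terms, steps (2.3)–(2.5) can be sketched as follows (a minimal illustration; the function name and test data are ours):

```python
import numpy as np

# Minimal sketch of iteration (2.3)-(2.5); names and data are illustrative.
def dca_trust_region(A, b, r, x0, rho, eps=1e-10, max_iter=10000):
    """Minimize 1/2 x'Ax + b'x over ||x|| <= r; rho*I - A must be psd."""
    x = np.asarray(x0, dtype=float)
    n = len(x)
    for _ in range(max_iter):
        v = (rho * np.eye(n) - A) @ x - b        # (rho*I - A) x^k - b
        if np.linalg.norm(v) / rho <= r:         # test (2.3)
            x_new = v / rho                      # step (2.4)
        else:
            x_new = r * v / np.linalg.norm(v)    # step (2.5): pull back to the sphere
        if np.linalg.norm(x_new - x) < eps:      # termination test of Step 4
            return x_new
        x = x_new
    return x
```

For example, with A = diag(2, 1), b = (−2, 0), and r = 10, the unconstrained minimizer (1, 0) lies inside the ball and the DCA sequence converges to it; shrinking the radius to r = 0.5 moves the limit to the boundary point (0.5, 0).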
Basic properties of DCA sequences produced by the above algorithm can be stated as follows.
Theorem 2.1 (Pham Dinh and Le Thi, 1998) For any k ≥ 1,
f(x^{k+1}) ≤ f(x^k) − (1/2)(ρ + θ₁)‖x^{k+1} − x^k‖²,
where θ₁ is the smallest eigenvalue of the matrix ρI − A. If x^0 ∈ E then the inequality holds for all k ≥ 0. It holds that lim_{k→∞}‖x^{k+1} − x^k‖ = 0 and f(x^k) → β ≥ α as k → ∞, where α is the optimal value of (2.1) and β is a constant depending on the choice of x^0. In addition, every cluster point of the sequence {x^k} is a KKT point.
After proving that "if A is a nonsingular matrix that has no multiple negative eigenvalues, then any DCA sequence of (2.1) converges to a KKT point", H. A. Le Thi, T. Pham Dinh, and N. D. Yen (J. Global Optim., 2012) posed the following.
Question. Under what conditions is the DCA sequence {x^k} convergent?
The next sections are aimed at solving the above Question completely. It will be proved that any DCA sequence constructed by the Pham Dinh–Le Thi algorithm for the trust-region subproblem (2.1) converges to a KKT point.
2.2.2 Restart Procedure
DCA for finding global solutions of (2.1):
Start. Compute λ₁ (the smallest eigenvalue of A), λₙ (the largest eigenvalue of A), and an eigenvector u corresponding to λ₁ by a suitable algorithm. Take ρ > max{0, λₙ}, x ∈ dom f, stop := false.
While stop = false do
1. Implement DCA with the initial point x^0 := x to get a KKT point x*.
2. Take λ* = (−⟨x*, Ax*⟩ − ⟨x*, b⟩)/r².
If λ* ≥ −λ₁ then stop := true, and x* is a global solution of (2.1); else use the procedure in T. Pham Dinh and H. A. Le Thi, SIAM J. Optim. (1998) to find a new vector x such that f(x) < f(x*), and return to 1.
end while
This enlarged version of DCA is no longer a local optimization algorithm. In fact, it is a very efficient global optimization algorithm for solving (2.1).
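The global-optimality test in Step 2 is straightforward to code once λ₁ is available; a sketch with illustrative data (the helper name is ours):

```python
import numpy as np

# Sketch of the test "lambda* >= -lambda_1" from Step 2 of the restart
# procedure; the function name and the data below are illustrative.
def passes_global_test(A, b, r, x_star, tol=1e-8):
    lam1 = np.linalg.eigvalsh(A)[0]                        # smallest eigenvalue of A
    lam_star = (-(x_star @ (A @ x_star)) - x_star @ b) / r**2
    return lam_star >= -lam1 - tol
```

For instance, with A = diag(−1, 2), b = 0, and r = 1, the boundary KKT point x* = (1, 0) has λ* = 1 = −λ₁, so the test accepts it as a global solution, while the non-optimal direction (0, 1) yields λ* = −2 and is rejected.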
2.3 Auxiliary Results
This section presents four lemmas together with two corollaries, which are needed for establishing the convergence of DCA sequences.
2.4 Convergence Theorem
The main result of this section is the following.
Theorem 2.2 For every initial point x^0 ∈ R^n, the DCA sequence {x^k}_{k≥0} for the trust-region subproblem (2.1) constructed by the Pham Dinh–Le Thi algorithm (2.3)–(2.5) converges to a Karush-Kuhn-Tucker point of (2.1).
2.5 Illustrative Examples

This section gives two examples to illustrate the application of Theorem 2.2 to the trust-region subproblem.
2.6 Convergence Rate
A sequence {x^k} ⊂ R^n is said to be Q-linearly convergent to x* if there is a constant µ ∈ (0, 1) such that ‖x^{k+1} − x*‖ ≤ µ‖x^k − x*‖ for all sufficiently large k. If limsup_{k→∞}‖x^k − x*‖^{1/k} < 1, then one says that {x^k} is R-linearly convergent to x*. It is well known that Q-linear convergence implies R-linear convergence, but the reverse implication is not true.
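The gap between the two notions can be seen on a toy sequence (our own construction, not from the text): a_k = µᵏ(2 + (−1)ᵏ) with µ = 1/2 converges R-linearly to 0, yet the ratio a_{k+1}/a_k exceeds 1 infinitely often, so the convergence is not Q-linear:

```python
# Toy sequence (our construction): a_k = mu**k * (2 + (-1)**k), mu = 0.5.
mu = 0.5
a = [mu**k * (2 + (-1)**k) for k in range(1, 61)]
ratios = [a[i + 1] / a[i] for i in range(len(a) - 1)]
assert max(ratios) > 1.0          # Q-linearity fails: the ratio 3*mu = 1.5 recurs
# yet limsup a_k**(1/k) = mu < 1, so the convergence to 0 is R-linear:
roots = [a[k - 1] ** (1.0 / k) for k in range(40, 61)]
assert max(roots) < 1.0
```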
Theorem 2.3 Let a DCA sequence {x^k}_{k≥0} for the trust-region subproblem (2.1) be convergent to a KKT point x*, and let λ* be the Lagrange multiplier corresponding to x*. Then we have
‖x^{k+1} − x*‖ ≤ [(ρ − λ₁)/√(ρ(ρ + λ*))]‖x^k − x*‖,   (2.6)
for all sufficiently large k. Here λ₁ denotes the smallest eigenvalue of A. Hence, if (ρ − λ₁)/√(ρ(ρ + λ*)) ∈ (0, 1), then {x^k} converges Q-linearly to x*.
From Theorem 2.3 we can derive the following sufficient conditions for the linear convergence rate of DCA sequences for problem (2.1).
Theorem 2.4 Let {x^k} be a DCA sequence of problem (2.1) converging to a KKT point x*. Let λ* ≥ 0 be the Lagrange multiplier associated with x*. The following assertions are valid:
(a) If A is positive definite, then {x^k} converges Q-linearly to x*;
(b) If A is positive semidefinite, but not positive definite, and λ* is positive, then {x^k} converges Q-linearly to x*.
Definition 2.1 A KKT point of (2.1) is called proper if the Lagrange multiplier corresponding to it is a positive number.
By using the above properness concept, we can restate Claim (b) of Theorem 2.4 as follows: If A is positive semidefinite, but not positive definite, and x* is a proper KKT point, then {x^k} converges Q-linearly to x*.
2.7 Further Analysis
In this section, we give two examples to show that, in general, the convergence rate of {x^k} may not be Q-linear if either A is not positive semidefinite or x* is improper.
2.8 Minimization of a Quadratic Form on a Ball Revisited
The convergence rate of the Pham Dinh–Le Thi algorithm, which was described in Subsection 2.2.1, is analyzed further in this section. Namely, by an example we show that there is a DCA sequence which does not converge R-linearly.
Chapter 3
Minimization of a Quadratic Function on a Polyhedral Convex Set
In this chapter, we first prove that any iteration sequence generated by the Projection DC decomposition algorithm in quadratic programming is bounded, provided that the quadratic program in question is two-dimensional and solvable. Then, by using some error bounds for affine variational inequalities, we prove that any DCA sequence of that type is R-linearly convergent, provided that the original problem has solutions.
Our results solve in the affirmative the first part of the Conjecture stated
by H. A. Le Thi, T. Pham Dinh, and N. D. Yen (2011).
This chapter is written on the basis of the papers [3, 4].
3.1 Quadratic Problems with Linear Constraints
Let Q ∈ R^{n×n} and D ∈ R^{m×n} be given matrices, and let q ∈ R^n and d ∈ R^m be given vectors. Suppose that Q is symmetric. Consider the indefinite quadratic programming problem under linear constraints
min{f(x) := (1/2)x^T Qx + q^T x : Dx ≥ d}.   (3.1)
Let
C := {x ∈ R^n : Dx ≥ d}.
Since Q is not required to be positive semidefinite, (3.1) is a nonconvex optimization problem in general.
This is a well-known polynomial optimization problem. Qualitative properties of the problem, as well as numerical methods for solving it, have been discussed in many books and research papers.
3.2 Solving QP Problems by a DC Algorithm
The general theory on DCA of Pham Dinh and Le Thi can be applied with success to indefinite QPs in a similar way as it has been used for problem (2.1). In fact, among the many solution methods for (3.1), the following one, due to T. Pham Dinh, H. A. Le Thi, and F. Akoa (2008), deserves special attention because of its simplicity and effectiveness in finding the stationary point set of the problem.
Projection DC Decomposition Algorithm. For a given initial point x^0 ∈ R^n, the DCA sequence {x^k} corresponding to x^0 is defined by the iteration formula
x^{k+1} := P_C(x^k − (1/ρ)(Qx^k + q)),  k = 0, 1, 2, . . . .   (3.2)
Here the constant ρ > 0 should be larger than the largest eigenvalue of Q, and P_C(u) denotes the metric projection of u ∈ R^n onto C; that is, P_C(u) ∈ C and ‖u − P_C(u)‖ ≤ ‖u − x‖ for every x ∈ C.
Since Q is symmetric, the procedure of finding the largest eigenvalue of Q is rather simple and requires very little computation time.
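When C is the nonnegative orthant (take D = I, d = 0 in (3.1)), the projection P_C is coordinatewise clipping and (3.2) becomes a one-line update; a sketch with illustrative data (for a general polyhedron C, computing P_C would itself require a convex QP solver):

```python
import numpy as np

# Sketch of iteration (3.2) for C = {x : x >= 0} (i.e., D = I, d = 0),
# where P_C is coordinatewise clipping; the data below are illustrative.
def projection_dca(Q, q, x0, rho, iters=2000):
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        # x^{k+1} = P_C(x^k - (1/rho)(Q x^k + q))
        x = np.maximum(x - (Q @ x + q) / rho, 0.0)
    return x
```

For example, with the indefinite matrix Q = [[0, 1], [1, 0]] and q = (1, 1), the objective ½xᵀQx + qᵀx = x₁x₂ + x₁ + x₂ is nonnegative on the orthant and is minimized at the KKT point x* = (0, 0); starting from (3, 2) with ρ = 2, the DCA sequence reaches x* after a few steps, in line with the convergence results discussed below.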
Definition 3.1 For any x ∈ R^n, if there exists a multiplier λ ∈ R^m such that
Qx + q − D^T λ = 0,  Dx ≥ d,  λ ≥ 0,  λ^T(Dx − d) = 0,
then x is said to be a Karush-Kuhn-Tucker point (or a KKT point, for brevity) of (3.1).
Theorem 3.1 (T. Pham Dinh, H. A. Le Thi, and F. Akoa, 2008) Every DCA sequence {x^k} generated by the DC Algorithm and an initial point x^0 ∈ R^n possesses the following properties:
(i) f(x^{k+1}) ≤ f(x^k) − (ρ/2)‖x^{k+1} − x^k‖² for every k ≥ 1;
(ii) The sequence {f(x^k)} converges to an upper bound f* for the optimal value of (3.1);
(iii) Every cluster point of {x^k} is a KKT point of (3.1);
(iv) If inf_{x∈C} f(x) > −∞, then lim_{k→∞}‖x^{k+1} − x^k‖ = 0.
Our aim is to solve the next conjecture, which is the first part of the Conjecture stated by H. A. Le Thi, T. Pham Dinh, and N. D. Yen (J. Global Optim., 2011).
Conjecture 1 If (3.1) has solutions, then every DCA sequence generated by the Projection DC decomposition algorithm in (3.2) must be bounded.
Conjecture 1 was solved by the authors under certain additional assumptions.
In accordance with the progress of our long efforts in finding a solution of the conjecture, we first solve Conjecture 1 in the case of two-dimensional quadratic programs, without using any additional assumption on the data sets. Then, by a different approach, we give a complete solution to Conjecture 1.
3.3 DCA Sequences in Two-Dimensional Spaces

Our result on the boundedness of the DCA sequences in two-dimensional
spaces can be stated as follows.
Theorem 3.2 If (3.1) has a global solution and if n = 2, then every DCA sequence generated by the Projection DC decomposition algorithm (3.2) is bounded.
The arguments used for proving Theorem 3.2 cannot be applied to indefinite QPs of the form (3.1) with n ≥ 3.
Next, by employing some advanced tools (namely, the error bounds for affine variational inequalities due to Z. Q. Luo and P. Tseng, SIAM J. Optim., Vol. 2, 1992, pp. 43–54), we are able not only to solve the above Conjecture 1 completely but also to establish the R-linear convergence rate of the DCA sequences defined by (3.2). Our result shows clearly that the Pham Dinh–Le Thi algorithm can solve problem (3.1) effectively.
3.4 Complete Solution of the Conjecture
Denote by C* the KKT point set of (3.1). The convergence and the rate of convergence of the Projection DC decomposition algorithm can be described as follows.
Theorem 3.3 If the problem (3.1) has a nonempty solution set, then for each x^0 ∈ R^n, the DCA sequence {x^k} constructed by the Projection DC decomposition algorithm (3.2) converges R-linearly to a KKT point of (3.1); that is, there exists x* ∈ C* such that
limsup_{k→∞}‖x^k − x*‖^{1/k} < 1.
General Conclusions
This dissertation investigates in detail the convergence and convergence rate of iteration sequences generated by Pham Dinh–Le Thi's projection algorithms for solving two classes of nonconvex quadratic programs:
(a) the trust-region subproblems;
(b) quadratic programs on polyhedral convex sets.
Our main results include:
- A convergence theorem for a DC algorithm applied to the trust-region subproblem, and sufficient conditions for the Q-linear convergence of the DCA iteration sequences in question.
- A direct proof of the boundedness of DCA sequences in two-dimensional quadratic programming, and a theorem on the convergence and the R-linear convergence rate of DCA sequences for solving indefinite quadratic programs under linear constraints.
Our results complement and develop the corresponding published results of T. Pham Dinh and H. A. Le Thi (SIAM J. Optim., Vol. 8, 1998), T. Pham Dinh, H. A. Le Thi, and F. Akoa (Optim. Methods Softw., Vol. 23, 2008), and H. A. Le Thi, T. Pham Dinh, and N. D. Yen (J. Global Optim., Vol. 49, 2011; Vol. 53, 2012).
Investigations on the following issues would be of interest:
1. Sufficient conditions for R-linear convergence of the DCA sequences studied in Chapter 2;
2. Q-linear convergence of the DCA sequences studied in Chapter 3;
3. Solving (3.1) globally by DCA;
4. Convergence and convergence rate of DCA sequences generated by
Proximal DC decomposition algorithms;
5. Applicability of the general DC algorithm discussed in Chapter 1 to
optimization problems not belonging to the above classes (a) and (b).