Stoica, P.; Viberg, M.; Wong, M. & Wu, Q.
Digital Signal Processing Handbook
Ed. Vijay K. Madisetti and Douglas B. Williams
Boca Raton: CRC Press LLC, 1999
© 1999 by CRC Press LLC
“A Unified Instrumental Variable Approach to Direction Finding in Colored Noise Fields”
64
A Unified Instrumental Variable Approach to Direction Finding in Colored Noise Fields¹
P. Stoica
Uppsala University
M. Viberg
Chalmers University of Technology
M. Wong
McMaster University
Q. Wu
CELWAVE
64.1 Introduction
64.2 Problem Formulation
64.3 The IV-SSF Approach
64.4 The Optimal IV-SSF Method
64.5 Algorithm Summary
64.6 Numerical Examples
64.7 Concluding Remarks
References
Appendix A: Introduction to IV Methods
The main goal herein is to describe and analyze, in a unifying manner, the spatial and
temporal IV-SSF approaches recently proposed for array signal processing in colored
noise fields. (The acronym IV-SSF stands for “Instrumental Variable - Signal Subspace
Fitting”.) Despite the generality of the approach taken herein, our analysis technique is
simpler than those used in previous, more specialized publications. We derive a general,
optimally weighted (optimal, for short) IV-SSF direction estimator and show that this
estimator encompasses the UNCLE estimator of Wong and Wu, which is a spatial IV-SSF
method, and the temporal IV-SSF estimator of Viberg, Stoica and Ottersten. The latter
two estimators have seemingly different forms (among other differences, the first of them makes
use of four weights, whereas the second one uses three weights “only”), and hence their
asymptotic equivalence shown in this paper comes as a surprising unifying result. We
hope that the present paper, along with the aforementioned original works, will stimulate
interest in the IV-SSF approach to array signal processing, which is sufficiently flexible
to handle colored noise fields, coherent signals and indeed also situations where only some
of the sensors in the array are calibrated.
¹ This work was supported in part by the Swedish Research Council for Engineering Sciences (TFR).
64.1 Introduction
Most parametric methods for Direction-Of-Arrival (DOA) estimation require knowledge of the
spatial (sensor-to-sensor) color of the background noise. If this information is unavailable, a serious
degradation of the quality of the estimates can result, particularly at low Signal-to-Noise Ratio
(SNR) [1, 2, 3]. A number of methods have been proposed in recent years to alleviate the
sensitivity to the noise color. If a parametric model of the covariance matrix of the noise is available, the
parameters of the noise model can be estimated along with those of the signals of interest [4, 5, 6, 7].
Such an approach is expected to perform well in situations where the noise can be accurately modeled
with relatively few parameters. An alternative approach, which does not require a precise model of
the noise, is based on the principle of Instrumental Variables (IV). See [8, 9] for thorough treatments
of IV methods (IVM) in the context of identification of linear time-invariant dynamical systems. A
brief introduction is given in the appendix of this chapter. Computationally simple IVMs for array
signal processing appeared in [10, 11]. These methods perform poorly in difficult scenarios involving
closely spaced DOAs and correlated signals.
More recently, the combined Instrumental Variable Signal Subspace Fitting (IV-SSF) technique
has been proposed as a promising alternative to array signal processing in spatially colored noise
fields [12, 13, 14, 15]. The IV-SSF approach has a number of appealing advantages over other DOA
estimation methods. These advantages include:
• IV-SSF can handle noises with arbitrary spatial correlation, under minor restrictions on
the signals or the array. In addition, estimation of a noise model is avoided, which leads
to statistical robustness and computational simplicity.
• The IV-SSF approach is applicable to both non-coherent and coherent signal scenarios.
• The spatial IV-SSF technique can make use of the information contained in the output of
a completely uncalibrated subarray under certain weak conditions, which other methods
cannot.
Depending on the type of “instrumental variables” used, two classes of IV methods have appeared
in the literature:
1. Spatial IVM, for which the instrumental variables are derived from the output of a (pos-
sibly uncalibrated) subarray the noise of which is uncorrelated with the noise in the main
calibrated subarray under consideration (see [12, 13]).
2. Temporal IVM, which obtains instrumental variables from the delayed versions of the
array output, under the assumption that the temporal-correlation length of the noise
field is shorter than that of the signals (see [11, 14]).
The previous literature on IV-SSF has treated and analyzed the above two classes of spatial and
temporal methods separately, ignoring their common basis. In this contribution, we reveal the
common roots of these two classes of DOA estimation methods and study them under the same
umbrella. Additionally, we establish the statistical properties of a general (either spatial or temporal)
weighted IV-SSF method and present the optimal weights that minimize the variance of the DOA
estimation errors. In particular, we point out that the optimal four-weight spatial IV-SSF of [12, 13]
(called UNCLE there, and arrived at by using canonical correlation decomposition ideas) and the
optimal three-weight temporal IV-SSF of [14] are asymptotically equivalent when used under the
same conditions. This asymptotic equivalence property, which is a main result of the present section,
is believed to be important as it shows the close ties that exist between two seemingly different DOA
estimators.
This section is organized as follows. In Section 64.2 the data model and technical assumptions
are introduced. Next, in Section 64.3 the IV-SSF method is presented in a fairly general setting. In
Section 64.4, the statistical performance of the method is presented along with the optimal choices of
certain user-specified quantities. The data requirements and the optimal IV-SSF (UNCLE) algorithm
are summarized in Section 64.5. The anxious reader may wish to jump directly to this point to
investigate the usefulness of the algorithm in a specific application. In Section 64.6, some numerical
examples and computer simulations are presented to illustrate the performance. The conclusions
are given in Section 64.7. In the appendix we give a brief introduction to IV methods. The reader
who is not familiar with IV might be helped by reading the appendix before the rest of the paper.
Background material on the subspace-based approach to DOA estimation can be found in Chapter 62
of this Handbook.
64.2 Problem Formulation
Consider a scenario in which n narrowband plane waves, generated by point sources, impinge on an
array comprising m calibrated sensors. Assume, for simplicity, that the n sources and the array are
situated in the same plane. Let a(θ) denote the complex array response to a unit-amplitude signal
with DOA parameter equal to θ. Under these assumptions, the output of the array, $y(t) \in \mathbb{C}^{m \times 1}$,
can be described by the following well-known equation [16, 17]:
$$ y(t) = A x(t) + e(t) \tag{64.1} $$
where $x(t) \in \mathbb{C}^{n \times 1}$ denotes the signal vector, $e(t) \in \mathbb{C}^{m \times 1}$ is a noise term, and
$$ A = [\, a(\theta_1) \;\; \cdots \;\; a(\theta_n) \,] \tag{64.2} $$
Hereafter, $\theta_k$ denotes the kth DOA parameter.
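The data model (64.1)-(64.2) can be simulated in a few lines. The sketch below is only an illustration under assumptions that are not part of the chapter: a uniform linear array with half-wavelength spacing supplies a(θ), the signals are unit-variance complex Gaussian, and the noise is spatially white for simplicity. The names `ula_steering` and `simulate_snapshots` are hypothetical.

```python
import numpy as np

def ula_steering(theta_deg, m, spacing=0.5):
    """Response a(theta) of a hypothetical m-sensor uniform linear array.

    spacing is in wavelengths; theta is measured from broadside, in degrees.
    Returns an (m, n) matrix whose columns are a(theta_1), ..., a(theta_n).
    """
    theta = np.deg2rad(np.atleast_1d(theta_deg))
    k = np.arange(m)[:, None]                     # sensor index, shape (m, 1)
    return np.exp(2j * np.pi * spacing * k * np.sin(theta)[None, :])

def simulate_snapshots(theta_deg, m, N, noise_std=0.1, seed=0):
    """Draw N snapshots of y(t) = A x(t) + e(t), one snapshot per column."""
    rng = np.random.default_rng(seed)
    A = ula_steering(theta_deg, m)
    n = A.shape[1]
    # Unit-variance circular complex Gaussian signals (an assumption).
    x = (rng.standard_normal((n, N)) + 1j * rng.standard_normal((n, N))) / np.sqrt(2)
    # Spatially white noise here; the chapter allows arbitrary spatial color.
    e = noise_std * (rng.standard_normal((m, N))
                     + 1j * rng.standard_normal((m, N))) / np.sqrt(2)
    return A, x, A @ x + e

A, x, y = simulate_snapshots([-10.0, 20.0], m=6, N=200)
```

Storing snapshots as columns keeps the code close to the matrix notation: `A @ x` is the noise-free array output for all N time instants at once.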
The following assumptions on the quantities in the array equation, (64.1), are considered to hold
throughout this section:
A1. The signal vector x(t) is a normally distributed random variable with zero mean and a possibly
singular covariance. The signals may be temporally correlated; in fact the temporal IV-SSF approach
relies on the assumption that the signals exhibit some form of temporal correlation (see below for
details).
A2. The noise e(t) is a random vector that is temporally white, uncorrelated with the signals, and
circularly symmetric normally distributed with zero mean and unknown covariance matrix² $Q > O$,
$$ E[e(t) e^*(s)] = Q\,\delta_{t,s}; \qquad E[e(t) e^T(s)] = O \tag{64.3} $$
A3. The manifold vectors {a(θ)}, corresponding to any set of m different values of θ, are linearly
independent.
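Assumption A2 can be made concrete with a small numerical experiment. The following is a sketch under stated assumptions, not part of the chapter: the covariance Q is a hypothetical exponentially decaying correlation pattern, and the helper name `colored_noise` is invented for illustration. It draws circularly symmetric noise with covariance Q and forms sample estimates of the two moments in (64.3).

```python
import numpy as np

def colored_noise(Q, N, rng):
    """N draws (columns) of circular complex Gaussian noise with covariance Q."""
    m = Q.shape[0]
    L = np.linalg.cholesky(Q)                     # Q = L L^*
    # Unit-variance circular white noise: E[w w^*] = I and E[w w^T] = O.
    w = (rng.standard_normal((m, N)) + 1j * rng.standard_normal((m, N))) / np.sqrt(2)
    return L @ w

rng = np.random.default_rng(1)
m, N = 4, 100_000
idx = np.arange(m)
lag = idx[:, None] - idx[None, :]
# Hypothetical Hermitian positive definite spatial covariance (an assumption).
Q = 0.7 ** np.abs(lag) * np.exp(1j * 0.3 * lag)

e = colored_noise(Q, N, rng)
Q_hat = e @ e.conj().T / N      # sample estimate of E[e e^*], close to Q
P_hat = e @ e.T / N             # sample estimate of E[e e^T], close to O
```

The second moment `P_hat` being near zero is what "circularly symmetric" means in practice; a real-valued noise vector, for example, would violate it.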
Note that assumption A1 above allows for coherent signals, and that in A2 the noise field is allowed
to be arbitrarily spatially correlated with an unknown covariance matrix. Assumption A3 is a well-known
condition that, under a weak restriction on m, guarantees DOA parameter identifiability in
the case Q is known (to within a multiplicative constant) [18]. When Q is completely unknown,
DOA identifiability can only be achieved if further assumptions are made on the scenario under
consideration. The following assumption is typical of the IV-SSF approach:
² Henceforth, the superscript “∗” denotes the conjugate transpose, whereas the transpose is designated by a superscript
“T”. The notation A ≥ B, for two Hermitian matrices A and B, is used to mean that (A − B) is a nonnegative definite
matrix. Also, O denotes a zero matrix of suitable dimension.
A4. There exists a vector $z(t) \in \mathbb{C}^{\bar{m} \times 1}$, which is normally distributed and satisfies
$$ E[z(t) e^*(s)] = O \quad \text{for } t \le s \tag{64.4} $$
$$ E[z(t) e^T(s)] = O \quad \text{for all } t, s \tag{64.5} $$
Furthermore, denote
$$ \Gamma = E[z(t) x^*(t)] \qquad (\bar{m} \times n) \tag{64.6} $$
$$ \bar{n} = \operatorname{rank}(\Gamma) \le \bar{m}. \tag{64.7} $$
It is assumed that no row of $\Gamma$ is identically zero and that the inequality
$$ \bar{n} > 2n - m \tag{64.8} $$
holds (note that a rank-one $\Gamma$ matrix can satisfy the condition (64.8) if m is large enough, and hence
the condition in question is rather weak). Owing to its (partial) uncorrelatedness with {e(t)}, the
vector {z(t)} can be used to eliminate the noise from the array output equation (64.1), and for this
reason {z(t)} is called an IV vector. Below, we briefly describe three possible ways to derive an IV
vector from the available data measured with an array of sensors (for more details on this aspect, the
reader should consult [12, 13, 14]).
EXAMPLE 64.1:
Spatial IV
Assume that the n signals, which impinge on the main (sub)array under consideration, are also
received by another (sub)array that is sufficiently distant from the main one so that the noise vectors
in the two subarrays are uncorrelated with one another. Then z(t) can be formed from the outputs of
the sensors in the second subarray (note that those sensors need not be calibrated) [12, 13, 15].
EXAMPLE 64.2:
Temporal IV
When a second subarray, as described above, is not available but the signals are temporally correlated,
one can obtain an IV vector by delaying the output vector: $z(t) = [\, y^T(t-1) \;\; y^T(t-2) \;\; \cdots \,]^T$.
Clearly, such a vector z(t) satisfies (64.4) and (64.5), and it also satisfies (64.8) under weak conditions
on the signal temporal correlation. This construction of an IV vector can be readily extended to cases
where e(t) is temporally correlated, provided that the signal temporal correlation length is longer
than that corresponding to the noise [11, 14].
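The delayed-output construction of Example 64.2 amounts to stacking shifted copies of the snapshot matrix. The sketch below assumes snapshots are stored one per column; the function name `temporal_iv` and the parameter `num_lags` are hypothetical, and how many lags to use is left to the user.

```python
import numpy as np

def temporal_iv(Y, num_lags):
    """Stack delayed snapshots: z(t) = [y^T(t-1) ... y^T(t-num_lags)]^T.

    Y has one snapshot per column (shape m x N). Returns (Z, Y_cur), where
    column j of Z is the IV vector paired with column j of Y_cur. The first
    num_lags snapshots are dropped because their delayed copies are missing.
    """
    m, N = Y.shape
    blocks = [Y[:, num_lags - k: N - k] for k in range(1, num_lags + 1)]
    return np.vstack(blocks), Y[:, num_lags:]

# Toy data: column t holds the value t in every sensor, so delays are easy to read.
Y = np.tile(np.arange(6.0), (2, 1))
Z, Y_cur = temporal_iv(Y, num_lags=2)
```

With this toy input, the first block of rows in `Z` equals `Y_cur - 1` (a delay of one sample) and the second block equals `Y_cur - 2`, which makes the alignment easy to verify by eye.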
In a sense, the above examples are both special cases of the following more general situation:
EXAMPLE 64.3:
Reference Signal
In many systems a reference or pilot signal [19, 20] z(t ) (scalar or vector) is available. If the
reference signal is sufficiently correlated with all signals of interest (in the sense of (64.8)) and
uncorrelated with the noise, it can be used as an IV. Note that all signals that are not correlated with
the reference will be treated as noise. Reference signals are commonly available in communication
applications, for example a PN-code in spread spectrum communication [20] or a training signal
used for synchronization and/or equalizer training [21]. A closely related possibility is utilization of
cyclo-stationarity (or self-coherence), a property that is exhibited by many man-made signals. The
reference signal(s) can then consist, for example, of sinusoids of different frequencies [22, 23]. In
these techniques, the data is usually pre-processed by computing the auto-covariance function (or a
higher-order statistic) before correlating with the reference signal.
The problem considered in this section concerns the estimation of the DOA vector
$$ \theta = [\theta_1, \cdots, \theta_n]^T \tag{64.9} $$
given N snapshots of the array output and of the IV vector, $\{y(t), z(t)\}_{t=1}^{N}$. The number of signals,
n, and the rank of the covariance matrix $\Gamma$, $\bar{n}$, are assumed to be given (for the estimation of these
integer-valued parameters by means of IV/SSF-based methods, we refer to [24, 25]).
64.3 The IV-SSF Approach
Let
$$ \hat{R} = \hat{W}_L \left[ \frac{1}{N} \sum_{t=1}^{N} z(t) y^*(t) \right] \hat{W}_R \qquad (\bar{m} \times m) \tag{64.10} $$
where $\hat{W}_L$ and $\hat{W}_R$ are two nonsingular Hermitian weighting matrices which are possibly data-dependent
(as indicated by the fact that they are roofed). Under the assumptions made, as $N \to \infty$,
$\hat{R}$ converges to the matrix:
$$ R = W_L \, E[z(t) y^*(t)] \, W_R = W_L \Gamma A^* W_R \tag{64.11} $$
where $W_L$ and $W_R$ are the limiting weighting matrices (assumed to be bounded and nonsingular).
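The second equality in (64.11) follows in one line from the data model (64.1) and the IV condition (64.4) taken at s = t. The step is spelled out here as a short worked derivation, writing $\Gamma$ for the cross-covariance $E[z(t)x^*(t)]$ of (64.6) and using the fact that A is deterministic:

```latex
\begin{align*}
E[z(t)\,y^*(t)]
  &= E\bigl[z(t)\,\bigl(A x(t) + e(t)\bigr)^*\bigr] \\
  &= E[z(t)\,x^*(t)]\,A^* + E[z(t)\,e^*(t)] \\
  &= \Gamma A^* + O .
\end{align*}
```

Pre- and post-multiplying by $W_L$ and $W_R$ gives $R = W_L \Gamma A^* W_R$. Note that the unknown noise covariance Q does not appear in R, which is the sense in which the IV vector eliminates the noise.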
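Owing to assumptions A2 and A3,
$$ \operatorname{rank}(R) = \bar{n} \tag{64.12} $$
Hence, the Singular Value Decomposition (SVD) [26] of R can be written as
$$ R = [\, U \;\; ? \,] \begin{bmatrix} \Sigma & O \\ O & O \end{bmatrix} \begin{bmatrix} S^* \\ ? \end{bmatrix} = U \Sigma S^* \tag{64.13} $$
where $U^* U = S^* S = I$, $\Sigma \in \mathbb{R}^{\bar{n} \times \bar{n}}$ is diagonal and nonsingular, and where the question marks
stand for blocks that are of no importance for the present discussion.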
The following key equality is obtained by comparing the two expressions for R in Eqs. (64.11)
and (64.13) above:
$$ S = W_R A C \tag{64.14} $$
where $C = \Gamma^* W_L U \Sigma^{-1} \in \mathbb{C}^{n \times \bar{n}}$ has full column rank. For a given S, the true DOA vector can be
obtained as the unique solution to Eq. (64.14) under the parameter identifiability condition (64.8)
(see, e.g., [18]). In the more realistic case when S is unknown, one can make use of Eq. (64.14) to
estimate the DOA vector in the following steps.
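The IV step: Compute the pre- and post-weighted sample covariance matrix $\hat{R}$ in
Eq. (64.10), along with its SVD:
$$ \hat{R} = [\, \hat{U} \;\; ? \,] \begin{bmatrix} \hat{\Sigma} & O \\ O & ? \end{bmatrix} \begin{bmatrix} \hat{S}^* \\ ? \end{bmatrix} \tag{64.15} $$
where $\hat{\Sigma}$ contains the $\bar{n}$ largest singular values. Note that $\hat{U}$, $\hat{\Sigma}$, and $\hat{S}$ are consistent estimates of
U, Σ, and S in the SVD of R.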
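The IV step maps directly onto a sample cross-covariance followed by a truncated SVD. The sketch below is a minimal illustration, assuming snapshots are stored one per column; the function name `iv_step` is hypothetical, and the identity defaults for the weights are placeholders rather than the optimal weights discussed later in the chapter. The random inputs only exercise the shapes.

```python
import numpy as np

def iv_step(Z, Y, n_bar, W_L=None, W_R=None):
    """Weighted sample cross-covariance (64.10) and its rank-n_bar SVD factors.

    Z: (m_bar, N) IV snapshots, Y: (m, N) array snapshots, one per column.
    W_L, W_R default to identity matrices (placeholders, not optimal weights).
    """
    m_bar, N = Z.shape
    m = Y.shape[0]
    W_L = np.eye(m_bar) if W_L is None else W_L
    W_R = np.eye(m) if W_R is None else W_R
    R_hat = W_L @ (Z @ Y.conj().T / N) @ W_R       # shape (m_bar, m)
    U, s, Vh = np.linalg.svd(R_hat, full_matrices=False)
    # NumPy returns singular values in descending order: keep the first n_bar.
    U_hat = U[:, :n_bar]
    Sigma_hat = np.diag(s[:n_bar])
    S_hat = Vh[:n_bar, :].conj().T                 # shape (m, n_bar)
    return R_hat, U_hat, Sigma_hat, S_hat

rng = np.random.default_rng(0)
Z = rng.standard_normal((5, 50)) + 1j * rng.standard_normal((5, 50))
Y = rng.standard_normal((4, 50)) + 1j * rng.standard_normal((4, 50))
R_hat, U_hat, Sigma_hat, S_hat = iv_step(Z, Y, n_bar=2)
```

Because `numpy.linalg.svd` returns the factorization as `U @ diag(s) @ Vh`, the right singular vectors in the chapter's notation are the conjugate transpose of the leading rows of `Vh`.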
The SSF step: Compute the DOA estimate as the minimizing argument of the following
signal subspace fitting criterion:
$$ \min_{\theta} \left\{ \min_{C} \, [\operatorname{vec}(\hat{S} - \hat{W}_R A C)]^* \, \hat{V} \, [\operatorname{vec}(\hat{S} - \hat{W}_R A C)] \right\} \tag{64.16} $$
where $\hat{V}$ is a positive definite weighting matrix, and “vec” is the vectorization operator³. Alternatively,
one can estimate the DOA by minimizing the following criterion:
$$ \min_{\theta} \left\{ [\operatorname{vec}(B^* \hat{W}_R^{-1} \hat{S})]^* \, \hat{W} \, [\operatorname{vec}(B^* \hat{W}_R^{-1} \hat{S})] \right\} \tag{64.17} $$
where $\hat{W}$ is a positive definite weight, and $B \in \mathbb{C}^{m \times (m-n)}$ is a matrix whose columns form a basis
of the null-space of $A^*$ (hence, $B^* A = 0$ and $\operatorname{rank}(B) = m - n$). The alternative fitting criterion
above is obtained from the simple observation that Eq. (64.14) along with the definition of B imply
that
$$ B^* W_R^{-1} S = 0 \tag{64.18} $$
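The orthogonality relation (64.18) is easy to confirm numerically. The sketch below is an illustration only: A, C, and W_R are random stand-ins rather than a real array manifold or real weights, and the helper `null_space_basis` (a hypothetical name) obtains B from the full SVD of A, which is one of several valid ways to build a basis of the null space of A*.

```python
import numpy as np

def null_space_basis(A):
    """Orthonormal basis B of the null space of A^*, so that B^* A = 0.

    Uses the last m - n left singular vectors of A.
    Assumes A (m x n, with m > n) has full column rank.
    """
    m, n = A.shape
    U, _, _ = np.linalg.svd(A, full_matrices=True)
    return U[:, n:]

rng = np.random.default_rng(0)
m, n, n_bar = 6, 2, 2
# Random stand-ins (assumptions) for A(theta), C, and the weight W_R.
A = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))
C = rng.standard_normal((n, n_bar)) + 1j * rng.standard_normal((n, n_bar))
M = rng.standard_normal((m, m)) + 1j * rng.standard_normal((m, m))
W_R = M @ M.conj().T + np.eye(m)                  # Hermitian and nonsingular

S = W_R @ A @ C                                   # Eq. (64.14)
B = null_space_basis(A)
residual = B.conj().T @ np.linalg.solve(W_R, S)   # B^* W_R^{-1} S, Eq. (64.18)
```

Here `residual` vanishes up to rounding error because `W_R^{-1} S = A C` lies in the column space of A, which B annihilates. With estimated quantities the residual is no longer zero, and its size is what criterion (64.17) penalizes.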
It can be shown [27] that the classes of DOA estimates derived from Eqs. (64.16) and (64.17),
respectively, are asymptotically equivalent. More exactly, for any $\hat{V}$ in Eq. (64.16) one can choose
$\hat{W}$ in Eq. (64.17) so that the DOA estimates obtained by minimizing Eq. (64.16) and, respectively,
Eq. (64.17) have the same asymptotic distribution, and vice versa.
In view of the previous result, in an asymptotic analysis it suffices to consider only one of the
two criteria above. In the following, we focus on Eq. (64.17). Compared with Eq. (64.16), the
criterion (64.17) has the advantage that it depends on the DOA only. On the other hand, for a
general array there is no known closed-form parameterization of B in terms of θ. However, as shown
in the following, this is no drawback because the optimally weighted criterion (which is the one to
be used in applications) is an explicit function of θ.
64.4 The Optimal IV-SSF Method
In what follows, we deal with the essential problem of choosing the weights $\hat{W}$, $\hat{W}_R$, and $\hat{W}_L$ in
the IV-SSF criterion (64.17) so as to maximize the DOA estimation accuracy. First, we optimize the
accuracy with respect to $\hat{W}$, and then with respect to $\hat{W}_R$ and $\hat{W}_L$.
Optimal Selection of $\hat{W}$
Define
$$ g(\theta) = \operatorname{vec}(B^* \hat{W}_R^{-1} \hat{S}) \tag{64.19} $$
and observe that the criterion function in Eq. (64.17) can be written as
$$ g^*(\theta) \, \hat{W} \, g(\theta) \tag{64.20} $$
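The quadratic form (64.20) can be evaluated for a candidate B with a few lines of code. The sketch below is an illustration with hypothetical names (`vec`, `ssf_criterion`); the identity default for the weight is a placeholder and not the optimal choice derived in this section.

```python
import numpy as np

def vec(X):
    """Stack the columns of X into one long vector (column-major order)."""
    return X.reshape(-1, order="F")

def ssf_criterion(B, W_R, S_hat, W=None):
    """Value of g^* W g with g = vec(B^* W_R^{-1} S_hat), as in (64.19)-(64.20).

    W defaults to the identity (a placeholder, not the optimal weight).
    """
    g = vec(B.conj().T @ np.linalg.solve(W_R, S_hat))
    W = np.eye(g.size) if W is None else W
    # For Hermitian positive definite W the quadratic form is real.
    return float(np.real(g.conj() @ W @ g))

X = np.array([[1, 3], [2, 4]])
stacked = vec(X)                                  # columns stacked: 1, 2, 3, 4

# Tiny example: B selects the first row, W_R = I, so g = [3] and g^* g = 9.
B = np.array([[1.0], [0.0]])
S_hat = np.array([[3.0], [4.0]])
value = ssf_criterion(B, np.eye(2), S_hat)
```

Note that NumPy flattens row by row by default, so the `order="F"` argument is what makes `vec` agree with the column-stacking definition used in the chapter.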
In [27] it is shown that g(θ) (evaluated at the true DOA vector) has, asymptotically in N, a circularly
symmetric normal distribution with zero mean and the following covariance:
$$ G(\theta) = \frac{1}{N} \left[ (W_L U \Sigma^{-1})^* R_z (W_L U \Sigma^{-1}) \right]^T \otimes [B^* R_y B] \tag{64.21} $$
³ If $x_k$ is the kth column of a matrix X, then $\operatorname{vec}(X) = [\, x_1^T \;\; x_2^T \;\; \cdots \,]^T$.