
Williams, D.B. “Detection: Determining the Number of Sources”
Digital Signal Processing Handbook
Ed. Vijay K. Madisetti and Douglas B. Williams
Boca Raton: CRC Press LLC, 1999
67 Detection: Determining the Number of Sources

Douglas B. Williams
Georgia Institute of Technology
67.1 Formulation of the Problem
67.2 Information Theoretic Approaches
     AIC and MDL • EDC
67.3 Decision Theoretic Approaches
     The Sphericity Test • Multiple Hypothesis Testing
67.4 For More Information
References
The processing of signals received by sensor arrays generally can be separated into two problems:
(1) detecting the number of sources and (2) isolating and analyzing the signal produced by each
source. We make this distinction because many of the algorithms for separating and processing array
signals make the assumption that the number of sources is known a priori and may give misleading
results if the wrong number of sources is used [3]. A good example is the errors produced by many
high resolution bearing estimation algorithms (e.g., MUSIC) when the wrong number of sources
is assumed. Because, in general, it is easier to determine how many signals are present than to
estimate the bearings of those signals, signal detection algorithms typically can correctly determine
the number of signals present even when bearing estimation algorithms cannot resolve them. In fact,
the capability of an array to resolve two closely spaced sources could be said to be limited by its ability
to detect that there are actually two sources present. If we have a reliable method of determining
the number of sources, not only can we correctly use high resolution bearing estimation algorithms,
but we can also make more effective use of the information obtained from the
bearing estimation algorithms. If the bearing estimation algorithm gives fewer source directions
than we know there are sources, then we know that there is more than one source in at least one of
those directions and have thus essentially increased the resolution of the algorithm. If analysis of the
information provided by the bearing estimation algorithm indicates more source directions than we
know there are sources, then we can safely assume that some of the directions are the results of false
alarms and may be ignored, thus decreasing the probability of false alarm for the bearing estimation
algorithms. In this section we will present and discuss the more common approaches to determining
the number of sources.
67.1 Formulation of the Problem
The basic problem is that of determining how many signal producing sources are being observed by
an array of sensors. Although this problem addresses issues in several areas including sonar, radar,
communications, and geophysics, one basic formulation can be applied to all these applications.
We will give only a basic, brief description of the assumed signal structure, but more detail can be
found in references such as the book by Johnson and Dudgeon [3]. We will assume that an array
of M sensors observes signals produced by $N_s$ sources. The array is allowed to have an arbitrary
geometry. For our discussion here, we will assume that the sensors are omnidirectional. However,
this assumption is only for notational convenience as the algorithms to be discussed will work for
more general sensor responses.
The output of the mth sensor can be expressed as a linear combination of signals and noise

$$ y_m(t) = \sum_{i=1}^{N_s} s_i\big(t - \Delta_i(m)\big) + n_m(t) . $$

The noise observed at the mth sensor is denoted by $n_m(t)$. The propagation delays, $\Delta_i(m)$, are
measured with respect to an origin chosen to be at the geometric center of the array. Thus, $s_i(t)$
indicates the ith propagating signal observed at the origin, and $s_i(t - \Delta_i(m))$ is the same signal
measured by the mth sensor. For a plane wave in a homogeneous medium, these delays can be found
from the dot product between a unit vector in the signal's direction of propagation, $\vec{\zeta}^{\,o}_i$, and the
sensor's location, $\vec{x}_m$,

$$ \Delta_i(m) = \frac{\vec{\zeta}^{\,o}_i \cdot \vec{x}_m}{c} , $$

where c is the plane wave's speed of propagation.
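As a concrete illustration (not part of the original text), the following numpy sketch evaluates $\Delta_i(m) = \vec{\zeta}^{\,o}_i \cdot \vec{x}_m / c$ for a hypothetical uniform linear array and two assumed plane-wave arrival directions; the geometry, propagation speed, and bearings are placeholder values chosen only for the example.

```python
import numpy as np

# Hypothetical example: an M = 8 element uniform linear array along the x-axis
# with half-wavelength spacing for a 300 Hz plane wave in water (c = 1500 m/s).
c = 1500.0                                  # assumed propagation speed (m/s)
f_o = 300.0                                 # assumed center frequency (Hz)
M = 8
d = (c / f_o) / 2.0                         # inter-sensor spacing (m)

# Sensor locations x_m, measured from the geometric center of the array.
x = np.zeros((M, 3))
x[:, 0] = (np.arange(M) - (M - 1) / 2.0) * d

# Assumed bearings (from broadside, in the x-y plane) for N_s = 2 sources.
bearings = np.deg2rad([10.0, 25.0])

# Unit vectors zeta_i in each signal's direction of propagation.
zeta = np.array([[np.sin(b), np.cos(b), 0.0] for b in bearings])

# Propagation delays Delta_i(m) = (zeta_i . x_m) / c, one row per source.
Delta = (zeta @ x.T) / c                    # shape (N_s, M), in seconds
print(Delta)
```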
Most algorithms used to detect the number of sources incident on the array are frequency domain
techniques that assume the propagating signals are narrowband about a common center frequency,
$\omega_o$. Consequently, after Fourier transforming the measured signals, only one frequency is of interest
and the propagation delays become phase shifts

$$ Y_m(\omega_o) = \sum_{i=1}^{N_s} S_i(\omega_o)\, e^{-j\omega_o \Delta_i(m)} + N_m(\omega_o) . $$
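As a quick illustration of this narrowband model (again a sketch with assumed parameters, not material from the original chapter), one can simulate time-domain records of a single delayed narrowband signal at a few sensors, Fourier transform each record, and keep only the bin at $\omega_o$; the retained complex values then differ from sensor to sensor only by the phase factors $e^{-j\omega_o \Delta_i(m)}$.

```python
import numpy as np

# Assumed narrowband scenario: one source, four sensors, per-sensor delays in seconds.
f_o = 300.0                                   # assumed center frequency (Hz)
omega_o = 2 * np.pi * f_o
fs = 4000.0                                   # assumed sampling rate (Hz)
T = 1.0                                       # record length (s)
t = np.arange(0, T, 1 / fs)
Delta = np.array([-3e-4, -1e-4, 1e-4, 3e-4])  # assumed delays Delta_1(m)

def s(tau):
    # Narrowband signal (slow envelope times a carrier at omega_o), delayed by tau.
    return (1.0 + 0.1 * np.cos(2 * np.pi * 2.0 * (t - tau))) * np.cos(omega_o * (t - tau))

# Each sensor sees the delayed signal plus a little noise; keep only the FFT bin at f_o.
rng = np.random.default_rng(0)
bin_o = int(round(f_o * T))                   # FFT bin index corresponding to f_o
Y = np.array([np.fft.fft(s(d) + 0.01 * rng.standard_normal(t.size))[bin_o]
              for d in Delta])

# Up to noise, Y[m] = S(omega_o) * exp(-1j * omega_o * Delta[m]); removing the
# phase shifts leaves (nearly) the same complex value at every sensor.
print(np.angle(Y * np.exp(1j * omega_o * Delta)))
```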
The detection algorithms then exploit the form of the spatial correlation matrix, R, for the array.
The spatial correlation matrix is the M × M matrix formed by correlating the vector of the Fourier
transforms of the sensor outputs at the particular frequency of interest

$$ Y = \big[\, Y_0(\omega_o) \;\; Y_1(\omega_o) \;\; \cdots \;\; Y_{M-1}(\omega_o) \,\big]^T . $$

If the sources are assumed to be uncorrelated with the noise, then the form of R is

$$ R = E\{YY'\} = K_n + SCS' , $$

where $K_n$ is the correlation matrix of the noise, S is the matrix whose columns correspond to the
vector representations of the signals, $S'$ is the conjugate transpose of S, and C is the matrix of the
correlations between the signals. Thus, the matrix S has the form

$$ S = \begin{bmatrix} e^{-j\omega_o \Delta_1(0)} & \cdots & e^{-j\omega_o \Delta_{N_s}(0)} \\ \vdots & \ddots & \vdots \\ e^{-j\omega_o \Delta_1(M-1)} & \cdots & e^{-j\omega_o \Delta_{N_s}(M-1)} \end{bmatrix} . $$
If we assume that the noise is additive, white Gaussian noise with power $\sigma_n^2$ and that none of the
signals are perfectly coherent with any of the other signals, then $K_n = \sigma_n^2 I_M$, C has full rank, and the
form of R is

$$ R = \sigma_n^2 I_M + SCS' . \tag{67.1} $$
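The eigenvalue structure implied by (67.1) is easy to check numerically. The sketch below (illustrative only; the delays, signal correlation matrix, and noise power are assumed values) forms $R = \sigma_n^2 I_M + SCS'$ and confirms that $N_s$ eigenvalues exceed $\sigma_n^2$ while the remaining $M - N_s$ equal $\sigma_n^2$.

```python
import numpy as np

# Illustrative parameters (assumptions, not values from the text).
M, N_s = 8, 2
omega_o = 2 * np.pi * 300.0                  # assumed center frequency (rad/s)
sigma2_n = 0.1                               # assumed noise power
rng = np.random.default_rng(0)

# Assumed propagation delays Delta_i(m), e.g., from the plane-wave formula above.
Delta = rng.uniform(-1e-3, 1e-3, size=(N_s, M))

# Steering matrix S with entries S[m, i] = exp(-j * omega_o * Delta_i(m)).
S = np.exp(-1j * omega_o * Delta).T          # shape (M, N_s)

# A full-rank (Hermitian, positive definite) signal correlation matrix C.
A = rng.standard_normal((N_s, N_s)) + 1j * rng.standard_normal((N_s, N_s))
C = A @ A.conj().T + N_s * np.eye(N_s)

# Spatial correlation matrix of Eq. (67.1).
R = sigma2_n * np.eye(M) + S @ C @ S.conj().T

# Eigenvalues in descending order: the first N_s are sigma2_n + delta_i; the
# remaining M - N_s all equal sigma2_n (up to round-off).
eigvals = np.linalg.eigvalsh(R)[::-1]
print(np.round(eigvals, 6))
```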
We will assume that the columns of S are linearly independent when there are fewer sources than
sensors, which is the case for most common array geometries and expected source locations. As C
is of full rank, if there are fewer sources than sensors, then the rank of $SCS'$ is equal to the number
of signals incident on the array or, equivalently, the number of sources. If there are $N_s$ sources, then
$SCS'$ is of rank $N_s$ and its $N_s$ eigenvalues in descending order are $\delta_1, \delta_2, \cdots, \delta_{N_s}$. The M eigenvalues
of $\sigma_n^2 I_M$ are all equal to $\sigma_n^2$, and the eigenvectors are any orthonormal set of length M vectors. So the
eigenvectors of R are the $N_s$ eigenvectors of $SCS'$ plus any $M - N_s$ eigenvectors which complete the
orthonormal set, and the eigenvalues in descending order are $\sigma_n^2 + \delta_1, \cdots, \sigma_n^2 + \delta_{N_s}, \sigma_n^2, \cdots, \sigma_n^2$. The
correlation matrix is generally divided into two parts: the signal-plus-noise subspace formed by the
largest eigenvalues $(\sigma_n^2 + \delta_1, \cdots, \sigma_n^2 + \delta_{N_s})$ and their eigenvectors, and the noise subspace formed
by the smallest, equal eigenvalues and their eigenvectors. The reason for these labels is obvious as
the space spanned by the signal-plus-noise subspace eigenvectors contains the signals and a portion
of the noise while the noise subspace contains only that part of the noise that is orthogonal to the
signals [3]. If there are fewer sources than sensors, the smallest $M - N_s$ eigenvalues of R are all equal
and to determine exactly how many sources there are, we must simply determine how many of the
smallest eigenvalues are equal. If there are not fewer sources than sensors ($N_s \geq M$), then none
of the smallest eigenvalues are equal. The detection algorithms then assume that only the smallest
eigenvalue is in the noise subspace as it is not equal to any of the other eigenvalues. Thus, these
algorithms can detect up to $M - 1$ sources and for $N_s \geq M$ will say that there are $M - 1$ sources
as this is the greatest detectable number. Unfortunately, all that is usually known is $\hat{R}$, the sample
correlation matrix, which is formed by averaging N samples of the correlation matrix taken from
the outputs of the array sensors. As $\hat{R}$ is formed from only a finite number of samples of R, the
smallest $M - N_s$ eigenvalues of $\hat{R}$ are subject to statistical variations and are unequal with probability
one [4]. Thus, solutions to the detection problem have concentrated on statistical tests to determine
how many of the eigenvalues of R are equal when only the sample eigenvalues of $\hat{R}$ are available.
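The following sketch illustrates that statistical spread (a toy simulation under the Gaussian snapshot model; all scenario parameters are assumptions). It draws N independent snapshots with covariance R, averages their outer products to form $\hat{R}$, and compares the smallest $M - N_s$ eigenvalues of R, which are exactly equal, with those of $\hat{R}$, which are not.

```python
import numpy as np

# Illustrative scenario (assumed values; mirrors the sketch following Eq. (67.1)).
M, N_s, N = 8, 2, 100                        # sensors, sources, snapshots
sigma2_n = 0.1                               # assumed noise power
omega_o = 2 * np.pi * 300.0
rng = np.random.default_rng(1)

# True correlation matrix R of the form (67.1).
Delta = rng.uniform(-1e-3, 1e-3, size=(N_s, M))
S = np.exp(-1j * omega_o * Delta).T
A = rng.standard_normal((N_s, N_s)) + 1j * rng.standard_normal((N_s, N_s))
C = A @ A.conj().T + N_s * np.eye(N_s)
R = sigma2_n * np.eye(M) + S @ C @ S.conj().T

# Draw N independent zero-mean circular complex Gaussian snapshots with covariance R.
L = np.linalg.cholesky(R)
W = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2)
Y = L @ W

# Sample correlation matrix: the average of the N outer products Y_n Y_n'.
R_hat = (Y @ Y.conj().T) / N

# Smallest M - N_s eigenvalues: exactly equal for R, merely clustered for R_hat.
print(np.sort(np.linalg.eigvalsh(R))[: M - N_s])
print(np.sort(np.linalg.eigvalsh(R_hat))[: M - N_s])
```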
When performing statistical tests on the eigenvalues of the sample correlation matrix to determine
the number of sources, certain assumptions must be made about the nature of the signals. In array
processing, both deterministic and stochastic signal models are used depending on the application.
However, for the purpose of testing the sample eigenvalues, the Fourier transforms of the signals at
frequency $\omega_o$; $S_i(\omega_o)$, $i = 1, \ldots, N_s$; are assumed to be zero mean Gaussian random processes
that are statistically independent of the noise and have a positive definite correlation matrix C. We
also assume that the N samples taken when forming $\hat{R}$ are statistically independent of each other.
With these assumptions, the spatial correlation matrix is still of the same form as in (67.1), except
that now we can more easily derive statistical tests on the eigenvalues of $\hat{R}$.
67.2 Information Theoretic Approaches
We will see that the source detection methods to be described all share common characteristics.
However, we will classify them into two groups—information theoretic and decision theoretic
approaches—determined by the statistical theories used to derive them. Although the decision
theoretic techniques are quite a bit older, we will first present the information theoretic algorithms
as they are currently much more commonly used.
67.2.1 AIC and MDL
AIC and MDL are both information theoretic model order determination techniques that can be
used to test the eigenvalues of a sample correlation matrix to determine how many of the smallest
eigenvalues of the correlation matrix are equal. The AIC and MDL algorithms both consist of
minimizing a criterion over the number of signals that are detectable, i.e., $N_s = 0, \ldots, M - 1$.
To construct these criteria, a family of probability densities, $f(Y\,|\,\theta(N_s))$, $N_s = 0, \ldots, M - 1$,
is needed, where $\theta$, which is a function of the number of sources, $N_s$, is the vector of parameters
needed for the model that generated the data Y. The criteria are composed of the negative of the
log-likelihood function of the density $f(Y\,|\,\hat{\theta}(N_s))$, where $\hat{\theta}(N_s)$ is the maximum likelihood estimate
of $\theta$ for $N_s$ signals, plus an adjusting term for the model dimension. The adjusting term is needed
because the negative log-likelihood function always achieves a minimum for the highest dimension
model possible, which in this case is the largest possible number of sources. Therefore, the adjusting
term will be a monotonically increasing function of $N_s$ and should be chosen so that the algorithm
is able to determine the correct model order.
AIC was introduced by Akaike [1]. Originally, the “IC” stood for information criterion and the
“A” designated it as the first such test, but it is now more commonly considered an acronym for the
“Akaike Information Criterion.” If we have N independent observations of a random variable with
probability density $g(Y)$ and a family of models in the form of probability densities $f(Y\,|\,\theta)$, where $\theta$
is the vector of parameters for the models, then Akaike chose his criterion to minimize

$$ I(g; f(\cdot\,|\,\theta)) = \int g(Y) \ln g(Y)\, dY - \int g(Y) \ln f(Y\,|\,\theta)\, dY , \tag{67.2} $$

which is known as the Kullback-Leibler mean information distance. $\frac{1}{N} AIC(\theta)$ is an estimate of
$-E\big\{ \int g(Y) \ln f(Y\,|\,\theta)\, dY \big\}$ and minimizing $AIC(\theta)$ over the allowable values of $\theta$ should minimize
(67.2). The expression for $AIC(\theta)$ is
$$ AIC(\theta) = -2 \ln f\big(Y \,|\, \hat{\theta}(N_s)\big) + 2\eta , $$

where $\eta$ is the number of independent parameters in $\theta$.
Following AIC, MDL was developed by Schwarz [6] using Bayesian techniques. He assumed that
the a priori density of the observations comes from a suitable family of densities that possess efficient
estimates [7]; they are of the form

$$ f(Y\,|\,\theta) = \exp\big(\theta \cdot p(Y) - b(\theta)\big) . $$

The MDL criterion was then found by choosing the model that is most probable a posteriori. This
choice is equivalent to selecting the model for which

$$ MDL(\theta) = -\ln f\big(Y \,|\, \hat{\theta}(N_s)\big) + \tfrac{1}{2}\, \eta \ln N $$
is minimized. This criterion was independently derived by Rissanen [5] using information theoretic
techniques. Rissanen noted that each model can be perceived as encoding the observed data and that
the optimum model is the one that yields the minimum code length. Hence, the name MDL comes
from “Minimum Description Length”.
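For the array problem, substituting the maximum likelihood estimates into these criteria leads to widely used closed forms in terms of the arithmetic and geometric means of the smallest sample eigenvalues (the forms popularized by Wax and Kailath). The sketch below is a minimal illustration of those eigenvalue-based criteria; the penalty term $\eta = N_s(2M - N_s)$ and the exact constants follow the commonly quoted expressions and should be checked against [8] before use.

```python
import numpy as np

def estimate_num_sources(R_hat, N):
    """Estimate the number of sources from the eigenvalues of an M x M sample
    correlation matrix R_hat formed from N snapshots, using the eigenvalue-based
    AIC and MDL criteria (in the form popularized by Wax and Kailath)."""
    M = R_hat.shape[0]
    # Sample eigenvalues in descending order (real, since R_hat is Hermitian).
    l = np.sort(np.linalg.eigvalsh(R_hat))[::-1]
    aic = np.zeros(M)
    mdl = np.zeros(M)
    for k in range(M):                        # candidate number of sources, 0, ..., M-1
        tail = l[k:]                          # the M - k smallest sample eigenvalues
        geo = np.exp(np.mean(np.log(tail)))   # geometric mean
        ari = np.mean(tail)                   # arithmetic mean
        log_lik = N * (M - k) * np.log(geo / ari)
        eta = k * (2 * M - k)                 # number of free parameters in the model
        aic[k] = -2.0 * log_lik + 2.0 * eta
        mdl[k] = -log_lik + 0.5 * eta * np.log(N)
    return int(np.argmin(aic)), int(np.argmin(mdl))

# Example use with the sample correlation matrix R_hat from the earlier sketch:
# n_aic, n_mdl = estimate_num_sources(R_hat, N)
```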
For the purpose of using AIC and MDL to determine the number of sources, the forms of the log-
likelihood function and the adjusting terms have been given by Wax [8]. For $N_s$ signals the parameters
that completely parameterize the correlation matrix R are $\{\sigma_n^2, \lambda_1, \cdots, \lambda_{N_s}, v_1, \cdots, v_{N_s}\}$, where
$\lambda_i$ and $v_i$, $i = 1, \ldots, N_s$, are the eigenvalues and their respective eigenvectors of the signal-plus-noise
subspace of the correlation matrix. As the vector of sensor outputs is a Gaussian random vector with
correlation matrix R and all the samples of the sensor outputs are independent, the log-likelihood
function of $f(Y\,|\,\theta)$ is

$$ \ln f\big(Y \,|\, \sigma_n^2, \lambda_1, \cdots, \lambda_{N_s}, v_1, \cdots, v_{N_s}\big) = \pi^{-MN} \left(\det R\right)^{-N} \exp\Big(-N\, \mathrm{tr}\big(R^{-1} \hat{R}\big)\Big) \times \cdots $$