Applications of Optimal Subspace Estimation

Optimal Subspace Estimation (OSE) is a technique for estimating the signal subspace of a noisy data matrix or covariance matrix by exploiting a specific structure known as shift-invariance. For the applications in this thesis, the datasets are generated using a line array of sensors and multiple snapshots, producing data that should possess this structure. The OSE algorithm exploits the structure of the data matrix to generate an estimate of the underlying noise-free signal subspace. Two applications have been chosen in which the covariance matrices of the signal data should exhibit the desired structure; both are common signal processing subjects, known as Adaptive Beamforming and Space-Time Adaptive Processing (STAP). OSE is an ideal estimation method for Adaptive Beamforming: it achieves performance better than a commonly used algorithm known as Dominant Mode Rejection (DMR) while using orders of magnitude fewer snapshots. The OSE algorithm stands out in its ability to approach the Cramer-Rao bound with far fewer snapshots of data, especially when the line array of sensors can be carefully calibrated. Results are obtained from multiple simulations that characterize overall performance on theoretical data and test the robustness of the chosen algorithms. Space-Time Adaptive Processing is chosen because it is essentially a two-dimensional extension of the Adaptive Beamforming example. The data matrix now contains both angle and timing information, and generating a good estimate becomes increasingly difficult as real-world physics are applied to the model. Processing this data quickly becomes computationally inefficient, so a modified OSE algorithm known as Subspace Averaging (SSA) was needed.
With theoretical data a lower bound is quickly approached, but, as with the algorithms used for comparison, practical results are not easily achieved with simple processing.

The application of Optimal Subspace Estimation (OSE) seeks to prove the merit of capitalizing upon the underlying subspace structure of certain matrices.
When regular arrays (uniform linear spacing in one or two dimensions) are employed, a property called shift-invariance exists. OSE produces a subspace estimate that is constrained to have this shift-invariant structure. When this structure is present in the data, the resulting subspace estimates can be orders of magnitude more accurate than those obtained with standard covariance estimation techniques.
The first application considered in this thesis is adaptive beamforming. In this application, a single output signal is formed by taking a linear combination of the outputs of each sensor in an array. The weights used to form this linear combination are calculated so that a source signal from a specified direction is passed to the output signal, while source signals from all other directions are greatly attenuated. To calculate the weights, one must first estimate the covariance matrix from noisy data and then extract the "signal" and "noise" subspaces. Many current methods use the eigenvectors of the sample covariance matrix to estimate the subspaces, which gives less than optimal results, especially with low sample support (few snapshots). OSE uses the shift-invariant structure of the true (noise-free) subspace to construct a new subspace estimate that is much more accurate than the eigenvector estimate in the low-sample-support regime. If the array is carefully constructed and calibrated, the OSE algorithm quickly approaches the lower bound for the best achievable subspace accuracy [1].

The second application considered in this thesis is Space-Time Adaptive Processing (STAP). STAP is a method commonly used in radar applications where clutter and jamming are of concern, such as airborne moving target indication (MTI) radar. The space-time clutter is most often colored noise, which is very difficult to model because the ground clutter is constantly changing. The motion of the airborne array causes a Doppler spreading effect that is not present in stationary radar systems, further complicating the problem [2]. Despite the colored noise, applying OSE seemed possible because recent papers mention enforcing Toeplitz structure on the radar data [3,4].
It is easiest to demonstrate the shift-invariant property used by the OSE algorithm with an example appropriate for the beamforming application. Consider a matrix Y that is the product of a tall matrix A(θ) of rank r and a wide iid random matrix S, which represents the signals impinging on the array [5]:

Y = A(θ) S,  where Y is m×n, A(θ) is m×r, and S is r×n.
In this case the matrix dimensions may be interpreted as follows: m is the number of array sensors, n is the number of data snapshots, and r is the rank, or number of interferers. Most importantly, the matrix A(θ) is shift-invariant, a property defined in the following discussion. For a one-dimensional array, a steering vector a(ω) is constructed for N = m array elements, where ω = 2π(d/λ) cos θ is the spatial frequency, d is the element spacing, λ is the wavelength, and θ is the direction of arrival of a source signal:

a(ω) = [1, e^{jω}, e^{j2ω}, ..., e^{j(N−1)ω}]^T.

The vector a(ω) is said to have Vandermonde structure because each element of the vector equals the element above it times a fixed number, e^{jω}. This structure is present because the source signal is narrow-band with a certain wavelength and the uniform spacing between the sensors in the line array equals half of that wavelength.
For this example, we choose the array to have five sensors, which gives the steering vector the following form:

a(ω) = [1, e^{jω}, e^{j2ω}, e^{j3ω}, e^{j4ω}]^T.

We choose the phase reference of the array to be its center, to align with the beamforming example given in Chapter 2. This is accomplished by multiplying the steering vector by e^{−j(N−1)ω/2}, where N is the number of sensors:

a(ω) = [e^{−j2ω}, e^{−jω}, 1, e^{jω}, e^{j2ω}]^T.

To define the concept of shift invariance, partition a(ω) into its upper portion a̲(ω) (the first N−1 elements) and its lower portion ā(ω) (the last N−1 elements), and notice that

a̲(ω) e^{jω} = ā(ω).
That is, the upper and lower subvectors are related by a scale factor. This example may be extended to two sources. We now have

Y = A(θ) S,  A(θ) = [a(ω_1)  a(ω_2)],

where A(θ) has two columns, one for each source. In the same fashion as the single-source case we split A into its upper portion A̲ (all but the last row) and its lower portion Ā (all but the first row). Then

A̲ Φ = Ā,  Φ = diag(e^{jω_1}, e^{jω_2}).

Notice that because A̲ is multiplied by a square, nonsingular matrix, Ā is simply a change of basis for the columns of A̲, and therefore A̲ and Ā share the same column space [6]. A matrix A satisfying this relation is said to be shift invariant.
In a similar fashion it is possible to show that the principal left singular vectors of Y are also shift-invariant. Taking the singular value decomposition of the noise-free case, we have

Y = U Σ V^H = [U_1  U_2] Σ [V_1  V_2]^H,

where U_1 is m×r and contains the singular vectors associated with the r nonzero singular values. An underlying property of the SVD is that col(A) = col(U_1), because the columns of U_1 are a basis for the column space of Y. Using this property we can also show that

U̲_1 Ψ = Ū_1

for some nonsingular r×r matrix Ψ. This is an important result because it shows that the left singular vectors in U_1 are also shift-invariant and are a basis for col(A). Unfortunately, in the more realistic case where noise is added to Y, the singular vectors of the noisy matrix no longer retain the shift-invariant property. With this in mind, a noisy matrix is constructed as the sum of a low-rank shift-invariant noise-free data matrix Y of rank r and additive independent and identically distributed (iid) white noise N:

Ỹ = Y + N.
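The shift-invariance of the noise-free signal subspace can be checked numerically. The thesis simulations use MATLAB; the following NumPy sketch (with arbitrary illustrative array sizes and spatial frequencies) builds a Vandermonde steering matrix, forms a noise-free data matrix, and verifies that the upper and lower blocks of U_1 span the same subspace.

```python
import numpy as np

# Illustrative sketch: the left singular vectors of a noise-free data
# matrix Y = A(theta) S inherit the shift-invariant structure of A.
m, r, n = 8, 2, 100
rng = np.random.default_rng(0)

omegas = np.array([0.7, 1.9])                      # assumed spatial frequencies
A = np.exp(1j * np.outer(np.arange(m), omegas))    # Vandermonde steering matrix
S = rng.standard_normal((r, n)) + 1j * rng.standard_normal((r, n))
Y = A @ S                                          # noise-free snapshots, rank r

U, s, Vh = np.linalg.svd(Y, full_matrices=False)
U1 = U[:, :r]                                      # basis for col(Y) = col(A)

# Shift-invariance: the upper (rows 0..m-2) and lower (rows 1..m-1)
# blocks of U1 are related by a nonsingular change of basis.
U1_up, U1_lo = U1[:-1, :], U1[1:, :]
T = np.linalg.lstsq(U1_up, U1_lo, rcond=None)[0]   # solve U1_up @ T = U1_lo
residual = np.linalg.norm(U1_up @ T - U1_lo)
print(residual < 1e-10)                            # True: U1 is shift-invariant
```

Adding noise to Y before the SVD makes the residual clearly nonzero, which is exactly the loss of structure the OSE correction is designed to repair.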
The SVD of Ỹ is

Ỹ = Ũ Σ̃ Ṽ^H,

where col(Ũ_1) is the perturbed signal subspace. Note that the perturbed signal subspace is not shift-invariant, and the diagonal of Σ̃ is no longer zero beyond the rank of the underlying matrix because of the additive noise. A basis Û_1 for the unperturbed signal subspace can be written in terms of the perturbed singular vectors as

Û_1 = Ũ_1 + Ũ_2 C

for some coefficient matrix C. The OSE algorithm, which is based on a first-order matrix perturbation expansion, finds a matrix C of the form C = −Ẑ Σ̃_1^{−1} that makes the matrix Û_1 shift-invariant [7,8], where Ẑ is a matrix computed from the SVD of Ỹ (see Appendix A). We now have an estimate of the noise-free left singular vectors, and this estimate provides an asymptotically efficient estimator of the signal subspace [1].

Adaptive Beamformer
The purpose of an adaptive beamformer is to preserve signals arriving from a desired direction while attenuating loud interferers in the sensor array response. The loud interferers must be attenuated because the desired signal is often very small by comparison. This is accomplished by modifying the received array beampattern: the direction of the interferer is estimated and a null is created in the response in that direction. For a single interferer, the problem is characterized by the ratio of the power of the interferer to the power of the background noise, defined as the interference-to-noise ratio (INR). As the INR decreases, the number of snapshots needed to obtain a good estimate of the interferer direction becomes unreasonable [1].
The minimum variance distortionless response (MVDR) beamformer, otherwise known as a Capon beamformer, is used for the following simulations. The beamformer weight-vector calculation uses the inverse of the interference-plus-noise covariance matrix. In practice this matrix is unknown and must be estimated. A common choice for this estimate is the sample covariance matrix (SCM), which gives very poor results in the realm of low sample support. For this reason, a better estimate is needed to give favorable beamformer output. For the beamforming example the SCM is calculated as follows [2]:

Ŝ = (1/L) Σ_{l=1}^{L} y_l y_l^H,

where L is the number of data snapshots and y_l is the lth data snapshot, or signal received by the array.
The response of the MVDR beamformer for a planar wave impinging on the sensor array is given using the form found in [1,3]. The N×1 steering vector, or array response, given element by element with the phase reference at the array center, is

[v(θ)]_n = e^{j(n − (N−1)/2) π cos θ},  n = 0, 1, ..., N−1,

where the desired listen direction is θ = θ_m. The beamformer response in the listen direction should be unity, with deep nulls placed in the directions of interferers. The vector representing the direction of interest will be referred to as v_m. The beamformer output is computed with a weight vector that achieves this result,

y = w^H x,

where x is an array snapshot. The theoretical MVDR weight vector is computed using the ensemble covariance matrix S_ENS, which is made up of the known interference plus noise:

w = S_ENS^{−1} v_m / (v_m^H S_ENS^{−1} v_m).

In practice the weight vector must be computed using some estimate of the ensemble covariance matrix. As previously mentioned, the ensemble covariance matrix cannot simply be replaced with the sample covariance matrix when only a small number of data snapshots is available. The first improved estimate, used for comparison with OSE, is the dominant mode rejection (DMR) beamformer.
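The MVDR weight computation above can be sketched directly. This NumPy illustration (parameters are arbitrary stand-ins, not the thesis simulation values) builds the known interference-plus-noise covariance for one interferer, computes the MVDR weights, and checks the distortionless constraint and the notch toward the interferer.

```python
import numpy as np

# Sketch of MVDR weights w = S^{-1} v_m / (v_m^H S^{-1} v_m) using a known
# (ensemble) interference-plus-noise covariance. Parameters are illustrative.
N = 20
n_idx = np.arange(N) - (N - 1) / 2            # phase-centered element index

def steer(theta):
    # array response for direction theta, half-wavelength element spacing
    return np.exp(1j * np.pi * np.cos(theta) * n_idx)

theta_m, theta_1 = np.pi / 2, np.arccos(3 / N)   # look and interferer directions
v_m, v_1 = steer(theta_m), steer(theta_1)

INR = 10 ** (30 / 10)                          # 30 dB interferer
S_ens = INR * np.outer(v_1, v_1.conj()) + np.eye(N)

Si_v = np.linalg.solve(S_ens, v_m)
w = Si_v / (v_m.conj() @ Si_v)                 # MVDR weight vector

print(abs(w.conj() @ v_m))                     # ~1: distortionless look direction
print(20 * np.log10(abs(w.conj() @ v_1)))      # deep notch (dB) toward interferer
```

With the true covariance the notch is very deep; the whole difficulty addressed in this chapter is reproducing that depth when S_ens must be estimated from few snapshots.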

Dominant Mode Rejection Beamformer
The dominant mode rejection beamformer generates a better estimate of the ensemble covariance matrix by replacing the small eigenvalues of the SCM with their average. Following the examples for DMR outlined in [1,3], the first step of the DMR algorithm is to take the eigendecomposition of the sample covariance matrix,

Ŝ = E Λ E^H,

where E contains the eigenvectors of the SCM and Λ = diag(λ_1, λ_2, ..., λ_N) is a diagonal matrix of eigenvalues sorted from largest to smallest. The D large eigenvalues of Λ, which are related to the interferers, are retained, with D being the number of interferers present. The remaining eigenvalues correspond to the additive noise and are replaced by the estimated noise variance

σ̂² = (1/(N−D)) Σ_{i=D+1}^{N} λ_i,

giving the new covariance matrix estimate

Ŝ_DMR = Σ_{i=1}^{D} λ_i e_i e_i^H + σ̂² Σ_{i=D+1}^{N} e_i e_i^H.

Substituting Ŝ_DMR for S_ENS in the MVDR weight-vector formula generates the DMR weight vector. This weight vector is used in the beamformer simulations to quantify the performance of the DMR algorithm. The performance is measured by using the DMR weight vector to calculate the depth of the notch created in the beampattern steered to the angle of interest. The weight vector calculated for the angle of interest θ_m is denoted w(θ_m). For the case of a single interferer arriving from the angle θ_1, the array response v(θ_1) is generated. The notch depth in decibels toward the interferer is then

ND = 20 log_10 |w(θ_m)^H v(θ_1)|.
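The DMR eigenvalue replacement described above can be written compactly. The following sketch (a NumPy stand-in for the thesis's MATLAB; D is assumed known, as in the text) keeps the D dominant eigenpairs of the SCM and floors the rest at their average.

```python
import numpy as np

# Sketch of the DMR covariance estimate: retain the D dominant eigenpairs
# and replace the remaining eigenvalues with their average (noise estimate).
def dmr_covariance(scm, D):
    lam, E = np.linalg.eigh(scm)            # eigh returns ascending eigenvalues
    lam, E = lam[::-1].copy(), E[:, ::-1]   # sort largest to smallest
    sigma2 = lam[D:].mean()                 # average of the noise eigenvalues
    lam_dmr = np.concatenate([lam[:D], np.full(len(lam) - D, sigma2)])
    return (E * lam_dmr) @ E.conj().T       # reassemble E diag(lam) E^H

# toy check: one strong interferer in white noise (illustrative values)
rng = np.random.default_rng(1)
N, L, INR = 10, 40, 100.0
v1 = np.exp(1j * 0.8 * np.arange(N))
X = (np.sqrt(INR) * np.outer(v1, rng.standard_normal(L))
     + (rng.standard_normal((N, L)) + 1j * rng.standard_normal((N, L))) / np.sqrt(2))
scm = X @ X.conj().T / L
S_dmr = dmr_covariance(scm, D=1)
print(np.linalg.cond(S_dmr) < np.linalg.cond(scm))  # DMR flattens the noise floor
```

Flattening the noise eigenvalues removes the spurious structure the SCM invents at low sample support, which is what stabilizes the resulting weight vector.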

Optimal Subspace Estimation Beamformer
The OSE algorithm may be used to obtain an estimate of the "signal subspace" of the ensemble covariance matrix. The procedure is outlined as follows [4].
The OSE calculations begin with an eigendecomposition of the sample covariance matrix, with the eigenvalues and their respective eigenvectors sorted from largest to smallest:

Ŝ = U Λ U^H,  U = [U_1  U_2],  Λ = diag(Λ_1, Λ_2),

where U_1 contains the D principal eigenvectors. The OSE subspace estimate is

X_1 = U_1 − U_2 Ẑ Λ_1^{−1/2},

where Ẑ is computed following the method outlined in Appendix A. Let P and P^⊥ be the orthogonal projection matrices onto col(X_1) and its orthogonal complement, respectively:

P = X_1 (X_1^H X_1)^{−1} X_1^H,  P^⊥ = I − P.

The OSE covariance estimate is then

Ŝ_OSE = P Ŝ P + σ̂² P^⊥,

where σ̂² is estimated in the same manner as in DMR. The notch depth is calculated by substituting the weight vector

w_OSE = Ŝ_OSE^{−1} v_m / (v_m^H Ŝ_OSE^{−1} v_m)

into the notch-depth calculation.
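The final assembly step can be sketched as follows. This NumPy illustration assumes X_1 has already been obtained (a random stand-in is used here in place of the Appendix A computation of Ẑ) and shows how the projectors combine the SCM with the noise estimate.

```python
import numpy as np

# Sketch of the final OSE covariance assembly: given a subspace estimate X1,
# form P and P_perp and build S_ose = P @ SCM @ P + sigma2 * P_perp.
def ose_covariance(scm, X1, sigma2):
    Q, _ = np.linalg.qr(X1)                 # orthonormal basis for col(X1)
    P = Q @ Q.conj().T                      # projector onto col(X1)
    P_perp = np.eye(scm.shape[0]) - P       # projector onto the complement
    return P @ scm @ P + sigma2 * P_perp

# toy usage with a hypothetical 2-dimensional signal subspace
rng = np.random.default_rng(2)
N, r = 8, 2
scm = rng.standard_normal((N, N))
scm = scm @ scm.T + np.eye(N)               # symmetric positive definite stand-in
X1 = rng.standard_normal((N, r))            # stand-in for the Appendix A estimate
S_ose = ose_covariance(scm, X1, sigma2=1.0)

# the estimate acts as sigma2 * I on the orthogonal complement of col(X1)
Q, _ = np.linalg.qr(X1)
e1 = np.eye(N)[:, 0]
u = e1 - Q @ (Q.T @ e1)                     # a vector in the complement
print(np.allclose(S_ose @ u, 1.0 * u))
```

The check at the end confirms the design intent: interference structure is confined to the estimated signal subspace, and everything orthogonal to it is treated as white noise.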

Simulation Results
Simulations for the DMR and OSE algorithms were completed in MATLAB.
Ideal arrays were first simulated to determine the maximum notch depths achievable by each algorithm. To allow comparison with the simulation results in [1,3], the same parameters are used: a uniform linear array of N = 50 sensors at half-wavelength spacing, the sensor array response defined above, and look direction θ_m = π/2. A single interferer is introduced from the direction θ_1 = acos(3/N), which places the interferer in the first sidelobe of the array response, using three interference-to-noise ratios of 10, 20, and 30 dB. Each INR level is computed using 1000 Monte Carlo simulations, and the results of each INR level are plotted for both algorithms.
As can be observed in the following plots, the OSE algorithm produces a given notch depth with far fewer snapshots than the DMR algorithm; for each INR value, OSE requires up to two orders of magnitude fewer snapshots. In all of the perturbation simulations the DMR algorithm is nearly immune to the disturbances, which makes DMR well suited to applications where the array is not well calibrated. However, for applications that require deep notches to be generated with a small number of snapshots (fast-moving objects), OSE is far more desirable. For this exceptional performance to be achieved, a well-calibrated array and sensor circuitry must be used.

CHAPTER 3
Space-Time Adaptive Processing

STAP Overview
Space-time adaptive processing (STAP) is a method commonly used with airborne sensor arrays to reject radar clutter and jamming. The radar clutter has a much stronger return than the objects of interest and is most commonly generated by stationary objects on the ground such as buildings and vegetation.
With a moving array it becomes increasingly difficult to filter out zero-Doppler objects (clutter) because of the Doppler spreading induced by the array motion.
The main goal of reducing this clutter is to gain the ability to detect slow-moving ground-based targets. The motivation for using STAP in this thesis is the continued work performed on ground clutter returns without prior knowledge of the true clutter covariance matrix [1,2].
There are many methods for simulating the radar clutter signal; this thesis looks at two simulation scenarios that have recently been used for testing covariance estimation techniques. The simpler problem of nulling jammers (similar to the prior adaptive beamforming application) involves a simulation technique that represents the true covariance matrix with jammers arriving from various angles [3,1]. The second simulation model, known as KASSPER, involves the much more difficult issue of rejecting ground clutter to form a good estimate of the true covariance matrix [2,4].
It was found early in the work with the KASSPER data that the OSE algorithm is computationally inefficient for use with STAP, and that, because of real-world effects, the data does not have the necessary structure to provide a good estimate in the region of interest. This led Dr. Vaccaro to develop a new variation of the algorithm, known as Subspace Averaging (SSA), that provides a sub-optimal estimate.

Introduction to Subspace Averaging
Consider two n × r matrices X and Y, each having rank r. A necessary and sufficient condition for col(X) = col(Y) is

X (X^H X)^{−1} X^H = Y (Y^H Y)^{−1} Y^H,

that is, equality of the orthogonal projection matrices onto the two column spaces. This leads to the formulation of the Subspace Averaging (SSA) algorithm.
Given a set of N matrices X_1, X_2, ..., X_N, each n × r and of rank r, let W_1 = X_1 and

W_k = P_{W_1} X_k,  k = 2, ..., N,

where P_{W_1} is the orthogonal projection onto col(W_1). A basis for the average subspace is given by

B = (1/N) Σ_{k=1}^{N} W_k.

For a uniform line array, partition the U_1 matrix (taken from the SVD of the SCM in the same manner as in the OSE algorithm) into submatrices X_1, X_2, ..., X_N, where N = shifts + 1. The "shifts" value is the number of rows of U_1 left out of each submatrix. For shifts = 1, we have the same partitioning of U_1 that is employed in the calculation of the OSE algorithm.
For shifts = 3, each submatrix contains all but three rows of U_1 (Figure 14 shows an example of the subspace selection for shifts = 3). The basis matrix B will be (M − shifts) × r. To construct the full subspace estimate, first form the projector P = B(B^T B)^{−1} B^T. The shifted subspaces are then recombined by projecting each submatrix X_k onto col(B) and adding the result back into the rows of an M × r matrix V_w at the row offset from which X_k was taken. In this example, each row of V_w is computed using one, two, three, or four estimates, so each row of V_w must be divided by the number of nonzero estimates contributing to it.
Thus, the final subspace estimate is

V = D^{−1} V_w,

where D is the diagonal matrix whose entries count the number of estimates contributing to each row. Multiple covariance estimation techniques found in [2,1] are used for comparison in the following simulations.
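The shifting-and-averaging bookkeeping can be sketched as follows. This NumPy illustration follows my reading of the steps above (the projection and recombination conventions are assumptions, and the dimensions are arbitrary).

```python
import numpy as np

# Simplified sketch of Subspace Averaging: project each shifted block of U1
# onto col(X_1), average to get B, then redistribute the rows and normalize.
def ssa_basis(U1, shifts):
    M, r = U1.shape
    K = shifts + 1
    blocks = [U1[k:M - shifts + k, :] for k in range(K)]   # shifted submatrices
    Q, _ = np.linalg.qr(blocks[0])
    W = [blocks[0]] + [Q @ (Q.conj().T @ Xk) for Xk in blocks[1:]]
    B = sum(W) / K                                         # average-subspace basis
    return blocks, B

def ssa_recombine(U1, shifts):
    M, r = U1.shape
    blocks, B = ssa_basis(U1, shifts)
    Q, _ = np.linalg.qr(B)
    P = Q @ Q.conj().T                                     # projector onto col(B)
    Vw = np.zeros((M, r), dtype=complex)
    counts = np.zeros((M, 1))
    for k, Xk in enumerate(blocks):
        Vw[k:M - shifts + k, :] += P @ Xk                  # shift each estimate back
        counts[k:M - shifts + k] += 1
    return Vw / counts                                     # divide by estimate count

U1 = np.linalg.qr(np.random.default_rng(3).standard_normal((10, 2)))[0]
V = ssa_recombine(U1, shifts=3)
print(V.shape)
```

For shifts = 3 and M = 10 the row counts run 1, 2, 3, 4, ..., 4, 3, 2, 1, matching the "one, two, three, or four estimates" accounting in the text.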

Structured Covariance Matrix Simulation
The first covariance simulation method constructs a covariance matrix with a form similar to the KASSPER data. The array snapshots come from an N-element array with a structure similar to the beamforming application of Chapter 2. This covariance matrix may be generated with any number of jammers and has the following form [3]:

[S]_{nm} = Σ_i σ_i² sinc(0.5 β (n−m) φ_i) e^{j(n−m)φ_i} + δ_{nm},

where σ_i² is the normalized power of the ith jammer and β is the bandwidth of the desired signal, assumed equal to the jammer signal's bandwidth. The ith jammer signal is constructed using φ_i = 2πd sin(θ_i)/λ_0, where d is the array interelement spacing, θ_i is the angle off boresight, and λ_0 is the wavelength at the jammer center frequency. The term δ_{nm} represents the internal noise power of the array: δ_{nm} = 1 when n = m, and 0 otherwise.
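The construction can be sketched element-wise. In this NumPy illustration the sinc argument follows my reading of the form in [3] (an assumption), and the jammer powers, phases, and dimensions are arbitrary illustrative values.

```python
import numpy as np

# Sketch of the jammer-plus-noise covariance construction. The exact sinc
# argument (0.5 * beta * phi * (n - m)) is an assumption about the model.
def jammer_covariance(N, powers, phis, beta):
    n = np.arange(N)
    diff = n[:, None] - n[None, :]                 # (n - m) lag matrix
    R = np.eye(N, dtype=complex)                   # delta_nm internal noise
    for sig2, phi in zip(powers, phis):
        # np.sinc is sin(pi x)/(pi x), so divide by pi for unnormalized sinc
        R += sig2 * np.sinc(0.5 * beta * phi * diff / np.pi) \
                  * np.exp(1j * phi * diff)
    return R

R = jammer_covariance(N=11, powers=[1000.0], phis=[0.6], beta=0.03)
print(np.allclose(R, R.conj().T))   # Hermitian, as a covariance must be
```

Setting β = 0 collapses the sinc factor to 1, so each jammer contributes a rank-one term, which matches the narrow-band case discussed next.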
Two values of β are simulated to show the effect the jammer bandwidth has on the SINR. β = 0 represents a narrow-band jammer signal, while β = 0.03 gives the jammer a 3% bandwidth, making it more wideband.
As the band becomes wider it becomes increasingly difficult to resolve the true covariance matrix.
It is of note that this simulation represents a one-dimensional example and does not suffer the same performance issues found with the ground clutter example. The clutter covariance matrices shown in [2] are full-rank or slightly rank-deficient. Using large sample support with the KASSPER data is not ideal because there are targets of interest in each range bin. This means the assumption that the range bins adjacent to the range bin of interest are target-free does not hold, and corrupted data is therefore used in the average. In the KASSPER data the clutter samples for each range bin are simulated as follows [4],

Structured Covariance Matrix Results
x_l = Σ_{p=1}^{P_cc} α_p v(θ_p, f_p),

where v(θ_p, f_p) is the space-time steering vector for the desired angle of arrival θ_p and Doppler frequency f_p, and P_cc is the number of scatterers in each range bin l. The coefficient α_p is a random complex Gaussian number whose variance corresponds directly to the predicted powers in SCATS. (SCATS is the Information Systems Laboratories Splatter, Clutter, and Target Signal model, a phenomenology modeling tool used to characterize the ground clutter in the KASSPER simulation.)
The space-time steering vector is the Kronecker product of two vectors,

v(θ_p, f_p) = b(f_p) ⊗ a(θ_p),

where b(f_p) is the temporal steering vector for Doppler frequency f_p and a(θ_p) is the array response at angle of arrival θ_p. These vectors have the form

b(f_p) = [1, e^{j2πf_p T_r}, ..., e^{j2π(M−1)f_p T_r}]^T,
a(θ_p) = [1, e^{jφ(θ_p)}, ..., e^{j(N−1)φ(θ_p)}]^T,

where a uniform linear array is assumed, φ(θ_p) is the phase shift relative to the first element of the array for a signal arriving at angle θ_p, and T_r is the pulse repetition interval (PRI), equal to the reciprocal of the pulse repetition frequency. The true clutter covariance matrix is formed as

R = Σ_{p=1}^{P_cc} E[|α_p|²] v(θ_p, f_p) v(θ_p, f_p)^H,

and the sample covariance matrix is computed as

Ŝ = (1/K) X X^H,

where K is the number of range bins used in calculating X = [x_1 x_2 ⋯ x_K].
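The Kronecker construction above is easy to sketch. This NumPy illustration uses the KASSPER-style dimensions N = 11 elements and M = 32 pulses (the PRI value is an arbitrary assumption).

```python
import numpy as np

# Sketch of the space-time steering vector v(theta, f) = b(f) kron a(theta).
N, M = 11, 32                                    # elements, pulses (KASSPER sizes)
Tr = 1e-3                                        # pulse repetition interval (assumed)

def spatial_steer(phi):
    # phi: element-to-element phase shift for the arrival angle
    return np.exp(1j * phi * np.arange(N))

def temporal_steer(f):
    # Doppler progression across the M pulses
    return np.exp(1j * 2 * np.pi * f * Tr * np.arange(M))

def space_time_steer(phi, f):
    return np.kron(temporal_steer(f), spatial_steer(phi))

v = space_time_steer(phi=0.4, f=100.0)
print(v.shape)                                   # NM-length space-time vector
```

The resulting vector has length NM = 352, which is why the STAP covariance matrices are so much larger, and the processing so much more expensive, than in the beamforming chapter.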
The Sample Matrix Inverse (SMI) algorithm directly uses the SCM in its calculations [5]. This algorithm gives good results with large sample support, but the SCM becomes non-invertible when the estimate is less than full rank. This issue is resolved by incorporating diagonal loading, which adds a small constant value c to the diagonal entries of the covariance estimate. The only constraint on this value is that it must be large enough to make the matrix invertible.
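Diagonal loading is a one-line fix, sketched below with arbitrary illustrative dimensions and loading constant: with fewer snapshots than space-time channels the SCM is singular, and adding c·I restores invertibility.

```python
import numpy as np

# Sketch of diagonally loaded SMI: add c*I so a rank-deficient SCM becomes
# invertible. N, K, and c are arbitrary illustrative values.
rng = np.random.default_rng(4)
N, K = 8, 3                                   # fewer snapshots than elements
X = rng.standard_normal((N, K)) + 1j * rng.standard_normal((N, K))
scm = X @ X.conj().T / K                      # rank K < N: singular

c = 1e-3
scm_loaded = scm + c * np.eye(N)

print(np.linalg.matrix_rank(scm) < N)         # True: SCM alone is singular
print(np.linalg.cond(scm_loaded) < 1e12)      # True: loaded matrix is invertible
```

Every eigenvalue of the loaded matrix is at least c, so the inverse needed for the weight vector always exists.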
The FML algorithm is easily implemented following the procedure outlined in [3]. The first step in the FML algorithm is to take the eigendecomposition of the SCM such that

Ŝ = E Λ E^H,

where Λ is ordered from largest to smallest eigenvalue.
The eigenvalues of the SCM are then manipulated to replace the less significant values with the noise floor. In the case of the KASSPER data the noise variance of 1 is known, so the noise floor does not need to be estimated. For radar applications, prior knowledge of the noise floor is not uncommon, as it can be measured by placing the radar in receive-only mode when it is first powered up [2]. Let Λ_FML be the matrix Λ from the eigendecomposition above, but with all diagonal entries less than the noise variance σ² replaced by σ². The FML covariance matrix is then computed such that

Ŝ_FML = E Λ_FML E^H.

The Rank Constrained Maximum Likelihood (RCML) covariance estimation algorithm [2] uses a very similar approach to FML. Rather than replacing eigenvalues that fall below the noise-floor estimate, it incorporates knowledge of the covariance matrix rank: all entries of the estimated eigenvalue matrix beyond the determined rank are replaced with the noise variance. The rank in this case is estimated using Brennan's rule, r = N + M − 1 = 11 + 32 − 1 = 42 [6]. The new eigenvalue matrix is easily represented as

Λ_RCML = diag(λ_1, ..., λ_r, σ², ..., σ²),

and

Ŝ_RCML = E Λ_RCML E^H.

Finally, the SSA algorithm is applied in the same way as in the adaptive beamforming case, where the estimate is obtained by shifting and averaging. The singular value decomposition of the SCM is computed in MATLAB such that

Ŝ = U Σ V^H.
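The FML and RCML eigenvalue replacement rules differ only in which entries are floored, so both can be sketched with one helper. This NumPy illustration uses a tiny diagonal toy matrix; the known noise variance σ² = 1 follows the KASSPER setup.

```python
import numpy as np

# Sketch of the FML and RCML eigenvalue replacement rules (sigma2 known,
# rank from Brennan's rule in the RCML case).
def clipped_covariance(scm, sigma2, rank=None):
    lam, E = np.linalg.eigh(scm)
    lam, E = lam[::-1].copy(), E[:, ::-1]     # sort largest to smallest
    if rank is None:
        lam[lam < sigma2] = sigma2            # FML: floor everything below sigma2
    else:
        lam[rank:] = sigma2                   # RCML: floor everything past the rank
    return (E * lam) @ E.conj().T

# Brennan's rule for the KASSPER geometry
r = 11 + 32 - 1
print(r)                                      # 42

# toy check on a small matrix with noise variance 1
scm = np.diag([5.0, 2.0, 0.3, 0.1])
S_fml = clipped_covariance(scm, sigma2=1.0)
print(np.linalg.eigvalsh(S_fml).min())        # noise floor is enforced
```

RCML with a correctly chosen rank keeps any interference eigenvalue that happens to dip below the noise floor, which is the behavior FML cannot reproduce.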

KASSPER Simulation Results
The KASSPER dataset aims to realistically model the clutter effects of an actual side-looking airborne radar (SLAR) system within a specific region of the United States. This region is modeled as a mountainous area of California that produces large changes in the clutter power returned to the radar. The model includes real-world effects such as heterogeneous terrain, subspace leakage, array errors, and a multitude of ground targets [4]. The real-world effects incorporated into the model make it difficult to estimate the covariance matrices, providing an accurate picture of the effectiveness of the different techniques employed.
The majority of the algorithms tested for the STAP processing method are computationally inefficient and/or require a large number of samples to generate a good estimate of the clutter. None of the methods tested perform well in the real-world scenario simulated by the KASSPER dataset.
When computing the sample covariance matrix for STAP applications it is necessary, at a minimum, to leave out the range cell being tested; leaving this range cell out ensures the target of interest is not included in the estimate. It is also common to leave out the cells adjacent to the range bin under test for the case where the target may extend across multiple range bins. These left-out range bins are referred to as guard cells.
The sample covariance matrix is modified to reflect this change:

Ŝ = (1/K) Σ_{k ∉ {r_i − g/2, ..., r_i + g/2}} x_k x_k^H,

where r_i is the range bin of interest, g is the number of guard cells (which must be an even number), and K is the number of snapshots, or range bins, averaged (excluding the target and guard cells).
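The guard-cell exclusion can be sketched directly. In this NumPy illustration the snapshot dimensions and guard count are arbitrary stand-ins for the KASSPER values.

```python
import numpy as np

# Sketch of the guard-cell sample covariance: average the outer products of
# all range bins except the cell under test and g/2 guard cells on each side.
def guarded_scm(X, r_i, g):
    # X: (channels, bins) snapshots; r_i: bin under test; g: even guard count
    bins = X.shape[1]
    excluded = set(range(r_i - g // 2, r_i + g // 2 + 1))  # target + guard cells
    keep = [k for k in range(bins) if k not in excluded]
    return sum(np.outer(X[:, k], X[:, k].conj()) for k in keep) / len(keep)

rng = np.random.default_rng(5)
X = rng.standard_normal((6, 20)) + 1j * rng.standard_normal((6, 20))
S = guarded_scm(X, r_i=10, g=4)
print(S.shape)          # bins 8..12 were excluded from the average
```

With r_i = 10 and g = 4, bins 8 through 12 are dropped, so a target spread across a few adjacent bins cannot leak into its own covariance estimate.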
The normalized SINR versus angle and Doppler is commonly used in the radar literature to evaluate the performance of estimation techniques and is calculated as follows [2]:

SINR_norm(θ, f) = |w^H v(θ, f)|² / [(w^H R w)(v(θ, f)^H R^{−1} v(θ, f))],

where R is the true interference-plus-noise covariance matrix and w is the weight vector computed from the covariance estimate under test.

The applications for this thesis were chosen because it was known that the signal data is processed from the array in a manner that allows the OSE algorithm to use the shift-invariant property. Both of these applications apply directly to array processing, but the OSE and SSA algorithms may prove useful for any application where a subspace estimate is to be generated for an unperturbed signal subspace. For both applications, another avenue of research may focus on ensuring that the data received by the array is shift-invariant. Array perturbations have an adverse effect on the performance of the OSE algorithm, so it may be possible to minimize their impact, perhaps by applying some form of prefiltering to the received data to restore the shift-invariance that should have been present from the uniform line array.

APPENDIX A
Derivation of X_1 for OSE Beamforming

The singular value decomposition of the N × L snapshot matrix, from which the sample covariance matrix is generated (N being the number of array elements and L the number of snapshots), is taken such that

Y = U S V^H,

where U is N×N, S is N×L, and V is L×L. U is then partitioned such that U = [U_1 U_2]. In the derivation of the coefficient vector z, W denotes an orthonormal basis of P^⊥_1 and ⊗ denotes the Kronecker product. The final MATLAB steps are:

Zhat = reshape(z, N-r, r);
% X1 must be calculated from the SVD of the SCM, not the signal
X1 = U1 - U2 * Zhat * inv(S1);   % inv(S1) = Lam.^(-1/2)
% single-interferer optimal subspace estimate
X1 = orth(X1);
end