Robust Statistics and Regularization for Feature Extraction and UXO Discrimination
Dr. Laurens Beran | Black Tusk Geophysics
Current methods for unexploded ordnance (UXO) discrimination using magnetic and electromagnetic induction (EMI) data generally rely on feature vectors extracted from physics-based dipole models. These feature vectors are obtained by solving an inverse problem that provides a “best fit” to the observed data. Typically, the best fit is defined as the model that minimizes the sum of squares of the residuals between observed and predicted data, with each residual normalized by its estimated standard deviation (the so-called L2 norm). This choice implicitly assumes that the residuals are normally distributed (Gaussian) and that the maximum likelihood solution is the most appropriate model to extract from the data. The Gaussian assumption may not be appropriate if the residuals contain outliers (due to sensor or positional glitches) or significant structure (because the model is not adequate to represent the data). In those cases, the recovered feature vectors may be significantly in error and should not be relied upon for discrimination. In addition, the maximum likelihood solution does not account for any uncertainty in the recovered feature vectors and may not be the most appropriate criterion for assessing UXO likelihood.
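For reference, the weighted least-squares misfit described above can be written out as follows. The symbols are assumptions introduced here for illustration and do not come from the project report: m is the vector of dipole model parameters, d_i^obs and d_i^pred(m) are the i-th observed and predicted data, sigma_i is the estimated standard deviation of the i-th datum, and N is the number of data.

```latex
\phi(\mathbf{m}) = \sum_{i=1}^{N}
  \left( \frac{d_i^{\mathrm{obs}} - d_i^{\mathrm{pred}}(\mathbf{m})}{\sigma_i} \right)^{2},
\qquad
p(\mathbf{d}^{\mathrm{obs}} \mid \mathbf{m}) \propto \exp\!\left( -\tfrac{1}{2}\,\phi(\mathbf{m}) \right)
```

Under independent Gaussian residuals the likelihood is proportional to exp(-phi/2), so minimizing phi is equivalent to maximizing the likelihood. Because each residual enters squared, a single large outlier can dominate the sum.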
The objective of this project was to research the statistical structure of the underlying inversion process and develop methods for more accurate extraction of feature vectors from multi-time, multi-frequency, and multi-component EMI data.
This project explored four approaches, the first three of which involve different treatments of Bayes' equation for combining a-priori knowledge with the constraints imposed by the observed data: (1) robust-statistical methods; (2) regularization methods; (3) incorporating uncertainty into the classification problem; and (4) determining when to stop digging.
The first approach was to use robust-statistical norms that down-weight the influence of outliers and result in recovered model parameters that are less sensitive to a few abnormal data-points. Robust-statistical methods effectively use a likelihood function that has fatter tails than the Gaussian distribution corresponding to the L2 norm.
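The down-weighting idea can be illustrated with a minimal sketch. It is not the project's algorithm: it assumes a toy one-parameter model (data equal to an unknown amplitude times a known decay shape), uses the Huber norm as one common example of a robust norm, and solves it by iteratively reweighted least squares. The function names, the synthetic data, and the tuning constant k = 1.345 are all illustrative choices.

```python
def huber_weight(r, k=1.345):
    """Reweighting factor for the Huber norm: 1 for |r| <= k, k/|r| beyond."""
    a = abs(r)
    return 1.0 if a <= k else k / a


def fit_amplitude(g, d, sigma, robust=False, n_iter=50):
    """Fit d ~ a * g for the scalar amplitude a.

    robust=False gives the ordinary weighted least-squares (L2) estimate.
    robust=True repeats the fit, down-weighting data whose normalized
    residuals are large (iteratively reweighted least squares).
    """
    w = [1.0] * len(d)
    a = 0.0
    for _ in range(n_iter):
        num = sum(wi * gi * di / si**2 for wi, gi, di, si in zip(w, g, d, sigma))
        den = sum(wi * gi * gi / si**2 for wi, gi, si in zip(w, g, sigma))
        a = num / den
        if not robust:
            break
        # Recompute weights from the normalized residuals of the current fit.
        w = [huber_weight((di - a * gi) / si) for gi, di, si in zip(g, d, sigma)]
    return a


# Synthetic example (made-up numbers): true amplitude 2.0, one glitched datum.
g = [1.0, 0.8, 0.6, 0.4, 0.2]
d = [2.0 * gi for gi in g]
d[2] += 5.0                      # simulated sensor glitch
sigma = [0.1] * len(g)

a_l2 = fit_amplitude(g, d, sigma)
a_huber = fit_amplitude(g, d, sigma, robust=True)
print(f"L2 estimate:    {a_l2:.3f}")
print(f"Huber estimate: {a_huber:.3f}")
```

In this toy case the single glitch pulls the L2 estimate well away from the true amplitude, while the Huber estimate stays close to it. A real EMI inversion is nonlinear in the dipole parameters, so the reweighting would wrap a nonlinear solver instead of the closed-form update used here.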
The next approach was to incorporate prior information into the model parameter estimation problem. UXO are typically ferrous and axially symmetric, which results in one large primary polarizability and two smaller, equal secondary polarizabilities. A parameter extraction routine that is biased towards recovering models with minimal difference between secondary polarizabilities will minimize the chance that a target of interest (TOI) is mistaken for harmless scrap. Incorporation of a-priori information results in a regularization problem that involves a trade-off between fitting the data and satisfying the model parameter penalty term. A regularized inversion algorithm was developed that penalizes the deviation between secondary polarizabilities. Rather than selecting a single model from this inversion process, this study input all models into a support vector machine classifier.
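The trade-off between data fit and the penalty term can be sketched with a deliberately simplified example. It assumes, hypothetically, that the data constrain the two secondary polarizabilities independently, with estimates a and b and standard deviations s2 and s3, and that the penalty is beta times the squared difference between them. This stands in for the full EMI forward model, which is not reproduced here; all names and numbers are illustrative.

```python
def regularized_secondaries(a, b, s2, s3, beta):
    """Minimize ((p2-a)/s2)^2 + ((p3-b)/s3)^2 + beta*(p2-p3)^2 over (p2, p3).

    The objective is quadratic, so the minimizer solves a 2x2 linear system:
        (w2 + beta) p2 -        beta p3 = w2 * a
             -beta p2 + (w3 + beta) p3 = w3 * b
    with w2 = 1/s2^2 and w3 = 1/s3^2. Solved here by Cramer's rule.
    """
    w2, w3 = 1.0 / s2**2, 1.0 / s3**2
    a11, a12 = w2 + beta, -beta
    a21, a22 = -beta, w3 + beta
    b1, b2 = w2 * a, w3 * b
    det = a11 * a22 - a12 * a21
    p2 = (b1 * a22 - a12 * b2) / det
    p3 = (a11 * b2 - a21 * b1) / det
    return p2, p3


# Made-up example: p2 is well constrained, p3 is poorly constrained.
a, b, s2, s3 = 1.0, 0.4, 0.1, 0.3

# Sweep the trade-off parameter to produce a family of candidate models.
models = [(beta, regularized_secondaries(a, b, s2, s3, beta))
          for beta in (0.0, 1.0, 10.0, 100.0, 1e6)]
for beta, (p2, p3) in models:
    print(f"beta={beta:>9.1f}  p2={p2:.4f}  p3={p3:.4f}  |p2-p3|={abs(p2 - p3):.4f}")
```

With beta = 0 the unregularized estimates are recovered, and as beta grows the two secondary polarizabilities are pulled together toward a value dominated by the better-constrained one. Keeping the whole family of models from such a sweep, instead of picking one beta, mirrors the study's choice to pass all models to the classifier.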
Using the single model that maximizes the a-posteriori probability does not account for any uncertainty in the recovered model parameters when developing a UXO classification strategy. This study explored methods for explicitly incorporating model parameter uncertainty in the classification process. Effectively, this involved using the a-posteriori probability to appraise the ensemble of potential models that could have generated the observed data.
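One simple way to picture ensemble-based appraisal is sketched below. It is an illustrative assumption, not the method used in the project: each candidate model is given a posterior weight proportional to exp(-misfit/2) (Gaussian likelihood with a flat prior), and a hypothetical per-model classifier score is averaged under those weights. The misfit values and scores are made up.

```python
import math


def posterior_weights(misfits):
    """Normalized weights proportional to exp(-misfit / 2).

    Assumes a Gaussian likelihood and a flat prior over the candidate models.
    The minimum misfit is subtracted first for numerical stability.
    """
    m0 = min(misfits)
    w = [math.exp(-0.5 * (m - m0)) for m in misfits]
    total = sum(w)
    return [wi / total for wi in w]


def expected_score(misfits, scores):
    """Posterior-weighted average of per-model classifier scores."""
    return sum(w * s for w, s in zip(posterior_weights(misfits), scores))


# Made-up example: three candidate models fit one anomaly almost equally well,
# but a hypothetical classifier scores them very differently.
misfits = [10.0, 10.5, 14.0]     # weighted sum-of-squares misfit per model
scores = [0.2, 0.9, 0.5]         # TOI-likeness of each model (0 to 1)

best_only = scores[misfits.index(min(misfits))]
averaged = expected_score(misfits, scores)
print(f"score of best-fit model only: {best_only:.3f}")
print(f"posterior-weighted score:     {averaged:.3f}")
```

Here the best-fitting model alone looks like clutter, but a nearly equally probable model looks like a TOI, so the weighted score is noticeably higher. This is the kind of case in which relying on a single maximum a-posteriori model could misrank a hazardous item.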
The final issue addressed in this project was the determination of a stop-digging point. It is clear that regulators do not want to leave hazardous items in the ground, so any strategy for determining an optimal operating point must attempt to recover all TOI. Ideally the stop-dig point lies just after the last TOI has been excavated to prevent excessive numbers of clutter items from being removed.
Robust statistical methods were able to improve the false-alarm rates encountered at the former Camp Butner, North Carolina, when using both the EM61 production mode data and the MetalMapper cued-interrogation data. In both cases, the primary contribution of the robust-statistical method was the prevention of outliers in the TOI class.
The researchers found that the regularized method can improve initial performance on high signal-to-noise targets with well-constrained secondary polarizabilities, while preventing the outlying TOI that arise when unregularized parameters are relied on throughout the diglist. The greatest benefit in discrimination performance is obtained with sensor data that interrogate all polarizabilities with orthogonal (horizontal and vertical) primary fields (i.e., MetalMapper).
EM61 and EM63 data sets acquired at the former Camp Sibert, Alabama, were used to test the parameter uncertainty methods. For both data sets, the area under the receiver operating characteristic (ROC) curve and the false alarm rate at probability of detection (Pd) = 1 (i.e., the proportion of false positives required to identify all true positives) improved. While the improvement for the EM63 data appears negligible, with the probability of false positive (Pfp) reduced from 0.03 to 0, the identification of one outlying UXO (a 4.2-inch mortar) is a significant result from the perspective of a regulator charged with site remediation. Similarly, the significant reduction in false alarm rate at Pd = 1 for the EM61 data (Pfp reduced from 0.35 to 0.08) improves the likelihood that all ordnance will be identified with this sensor.
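The two performance metrics quoted above can be computed from a ranked dig list as in the following sketch. The scores and labels are invented for illustration and are not the Camp Sibert results. The function names are hypothetical, higher scores are assumed to mean "more TOI-like", and both classes are assumed to be present.

```python
def false_alarm_rate_at_full_detection(scores, is_toi):
    """Fraction of clutter items dug before the last TOI is recovered (Pd = 1).

    Items are dug in order of decreasing score. Ties are broken by input order.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    n_toi = sum(1 for t in is_toi if t)
    n_clutter = len(is_toi) - n_toi
    found = 0
    false_alarms = 0
    for i in order:
        if is_toi[i]:
            found += 1
            if found == n_toi:
                break            # last TOI recovered: this is the Pd = 1 point
        else:
            false_alarms += 1
    return false_alarms / n_clutter


def roc_auc(scores, is_toi):
    """Area under the ROC curve as the probability that a randomly chosen TOI
    outscores a randomly chosen clutter item (ties count one half)."""
    toi = [s for s, t in zip(scores, is_toi) if t]
    clutter = [s for s, t in zip(scores, is_toi) if not t]
    wins = sum(1.0 if s > c else 0.5 if s == c else 0.0
               for s in toi for c in clutter)
    return wins / (len(toi) * len(clutter))


# Made-up dig list: three TOI and four clutter items.
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2, 0.1]
is_toi = [True, True, False, True, False, False, False]

far = false_alarm_rate_at_full_detection(scores, is_toi)
auc = roc_auc(scores, is_toi)
print(f"false alarm rate at Pd = 1: {far:.2f}")
print(f"area under ROC curve:       {auc:.3f}")
```

The false alarm rate at Pd = 1 is controlled entirely by the lowest-ranked TOI, which is why a single outlying item can dominate this metric even when the area under the curve is high.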
In simulations and applications of a stop-dig point technique to real data, researchers found that this technique has an improved probability of finding all ordnance in a test data set, relative to previously published methods. The researchers have limited investigations to samples on the order of N = 10^3, which is representative of the number of detected targets at many sites. Tests on larger data sets should still be carried out.
The principal contribution of this project was to develop algorithms and strategies that minimize or eliminate the discrimination outliers encountered during live-site tests. That is, the methods were particularly efficacious when applied to the “hard” anomalies encountered at a site. By minimizing or eliminating outliers in a UXO discrimination strategy, the greatest concern of the regulatory community, that hazardous UXO are left in the ground at the end of the remediation process, can be alleviated.