Aerospace and Electronic Systems Magazine April 2017 - 35
Nassar, Hussein, and Medhat
respectively, where (i) is the observation number). Then,
the model is tested on a subset (Set-T) of 270 different
observations (239 normal, 31 faulty). The PLS-DA model
is built on a training data set by cross-validation.
The PLS-DA models are fitted to the X variables
only. All results of the fit of a PC model, such as the sum
of squares (SS) explained, are labeled with X (for example, R2X).
PLS-DA extracts as many components as are considered significant by cross-validation. For a PC model, a component is significant if it satisfies the cross-validation
rule Q2 > Limit, where the limit depends on the number of
components. For a PC model, the limit increases with
subsequent components to account for the loss in degrees
of freedom.
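The cross-validation rule above can be illustrated with a minimal sketch. It assumes Q2 = 1 - PRESS/SS, where PRESS is the sum of squared prediction errors on held-out observations and SS is the total sum of squares of the centered response. The one-component PLS1 fit, the seven interleaved folds, the synthetic data, and the threshold `LIMIT = 0.05` are illustrative assumptions, not values or procedures taken from the article or from SIMCA-P.

```python
import math
import random


def fit_one_component(X, y):
    """Fit a one-component PLS1 model; returns (x_means, y_mean, w, c)."""
    n, p = len(X), len(X[0])
    x_means = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[v - m for v, m in zip(row, x_means)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight vector proportional to X^T y, scaled to unit length.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    # Scores and the regression coefficient of y on the scores.
    t = [sum(a * b for a, b in zip(row, w)) for row in Xc]
    c = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)
    return x_means, y_mean, w, c


def predict(model, row):
    """Predict the response for one raw (uncentered) observation."""
    x_means, y_mean, w, c = model
    score = sum((v - m) * wj for v, m, wj in zip(row, x_means, w))
    return y_mean + c * score


def q2_cross_validated(X, y, n_folds=7):
    """Q2 = 1 - PRESS / SS using interleaved folds (an assumed scheme)."""
    n = len(X)
    press = 0.0
    for fold in range(n_folds):
        held_out = [i for i in range(n) if i % n_folds == fold]
        train = [i for i in range(n) if i % n_folds != fold]
        model = fit_one_component([X[i] for i in train], [y[i] for i in train])
        press += sum((y[i] - predict(model, X[i])) ** 2 for i in held_out)
    y_mean = sum(y) / n
    ss = sum((v - y_mean) ** 2 for v in y)
    return 1.0 - press / ss


def component_is_significant(q2, limit):
    """Apply the rule Q2 > Limit."""
    return q2 > limit


LIMIT = 0.05  # hypothetical threshold, not taken from the article

rng = random.Random(0)
labels = [0.0] * 40 + [1.0] * 20  # synthetic: 0 = normal, 1 = faulty
signal_rows = [
    [rng.gauss(3.0 * lab, 1.0), rng.gauss(-2.0 * lab, 1.0), rng.gauss(0.0, 1.0)]
    for lab in labels
]
noise_rows = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in labels]

q2_signal = q2_cross_validated(signal_rows, labels)
q2_noise = q2_cross_validated(noise_rows, labels)
print("separable data:", q2_signal, component_is_significant(q2_signal, LIMIT))
print("pure noise:    ", q2_noise, component_is_significant(q2_noise, LIMIT))
```

On data where the classes are well separated, the held-out predictions stay close to the labels and Q2 is high; on pure noise the component predicts no better than the mean and Q2 stays near or below zero, so the rule would typically reject it.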
Figure 9 shows the scatter plot of the two score
vectors [t1] and [t2], which indicates the capability of the
model to create unique data clusters, compare the characteristics of each one, and correlate the clusters with
low cardinalities to their corresponding trigger events.
The analysis shows that the discrimination between the
two clusters, normal and faulty, is significant. The clusters
are outlined manually with black lines in the figure
for visual clarity. The figure shows a clear
separation in the data along the first component. This
separation is due to the discriminating power of the model, which arises
from the fact that the PLS-DA method maximizes the covariance between the predictor (independent) matrix-X
and the labeled (dependent) matrix-Y for each component of the reduced space. Moreover, the model is fitted
by cross-validation and yields a two-component model with
R2X (cum): 0.863, R2Y (cum): 0.979, Q2X (cum): 0.972.
R2X is known as the goodness of fit, which is an estimate
of the explanatory ability of the model. Q2X is known as
the goodness of prediction, which is an estimate of the predictive ability of the model. One can observe that the model
differentiates the faulty telemetry (F1) from the
other faulty ones, and the reasons for that are explained
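The covariance-maximizing property and the goodness-of-fit measures described above can be sketched for a single component. The example assumes a single 0/1 class label (PLS1 via one NIPALS step) and synthetic three-variable data; the function names and the data are hypothetical and are not the authors' implementation. R2X and R2Y are computed here as the fractions of the centered sums of squares of X and y explained by the first component.

```python
import math
import random


def center(rows):
    """Mean-center each column of a list of rows."""
    n = len(rows)
    means = [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]
    return [[v - m for v, m in zip(r, means)] for r in rows]


def first_pls_component(X, y):
    """One PLS1 (NIPALS) component for centered X (n x p) and centered y."""
    n, p = len(X), len(X[0])
    # w is proportional to X^T y; among unit vectors it maximizes the
    # covariance between the score t = X w and y.
    w = [sum(X[i][j] * y[i] for i in range(n)) for j in range(p)]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    t = [sum(a * b for a, b in zip(row, w)) for row in X]
    tt = sum(v * v for v in t)
    loadings = [sum(X[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
    c = sum(ti * yi for ti, yi in zip(t, y)) / tt
    return w, t, loadings, c


def explained_fractions(X, y, t, loadings, c):
    """Return (R2X, R2Y): fractions of SS explained by one component."""
    ss_x = sum(v * v for row in X for v in row)
    ss_y = sum(v * v for v in y)
    res_x = sum(
        (X[i][j] - t[i] * loadings[j]) ** 2
        for i in range(len(X))
        for j in range(len(loadings))
    )
    res_y = sum((yi - c * ti) ** 2 for yi, ti in zip(y, t))
    return 1.0 - res_x / ss_x, 1.0 - res_y / ss_y


def score_covariance(X, y, direction):
    """Sample covariance between the score X @ direction and centered y."""
    t = [sum(a * b for a, b in zip(row, direction)) for row in X]
    return sum(ti * yi for ti, yi in zip(t, y)) / (len(y) - 1)


rng = random.Random(1)
labels = [0.0] * 20 + [1.0] * 10  # synthetic: 0 = normal, 1 = faulty
raw = [
    [rng.gauss(3.0 * lab, 1.0), rng.gauss(-2.0 * lab, 1.0), rng.gauss(0.0, 1.0)]
    for lab in labels
]
X = center(raw)
y_mean = sum(labels) / len(labels)
y = [v - y_mean for v in labels]

w, t, loadings, c = first_pls_component(X, y)
r2x, r2y = explained_fractions(X, y, t, loadings, c)
pls_cov = score_covariance(X, y, w)

# Compare against random unit directions: none should exceed the PLS weight.
random_covs = []
for _ in range(200):
    d = [rng.gauss(0.0, 1.0) for _ in range(3)]
    d_norm = math.sqrt(sum(v * v for v in d))
    random_covs.append(score_covariance(X, y, [v / d_norm for v in d]))

print("R2X:", r2x, "R2Y:", r2y)
print("PLS covariance:", pls_cov, "best random:", max(random_covs))
```

Because the weight vector is the normalized X^T y, no other unit direction can give a larger covariance with the label, which is why the class separation tends to appear along the first component of the score plot.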
The second model, developed with the SIMCA-P software,
uses the same data. Its results are also significant
with regard to accuracy and computational complexity.
For better categorization, another attempt was made
by expanding matrix-X with square, cross-product, and cubic terms of the important variables, that is,
those with the strongest influence on the fault.
This model is also fitted by cross-validation and
yields a two-component model with R2X (cum): 0.791, R2Y
(cum): 0.964, Q2X (cum): 0.958. Further analysis examines the loading plots of both the PLS-DA algorithm and the SIMCA-P software in order to investigate the
relationships between the variables. Figure 10 shows
the loading plots, which clarify these relationships.
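The expansion of matrix-X mentioned above can be sketched as follows. The article does not list which variables were expanded or the exact set of terms, so this example assumes squares, pairwise cross products, and pure cubes of a chosen subset of columns; the function name, the column names, and the sample values are hypothetical.

```python
from itertools import combinations


def expand_features(rows, names, important):
    """Append square, cross-product, and cubic terms of selected columns."""
    idx = [names.index(name) for name in important]
    new_names = list(names)
    new_names += [f"{names[i]}^2" for i in idx]
    new_names += [f"{names[i]}*{names[j]}" for i, j in combinations(idx, 2)]
    new_names += [f"{names[i]}^3" for i in idx]
    expanded = []
    for row in rows:
        squares = [row[i] ** 2 for i in idx]
        crosses = [row[i] * row[j] for i, j in combinations(idx, 2)]
        cubes = [row[i] ** 3 for i in idx]
        expanded.append(list(row) + squares + crosses + cubes)
    return expanded, new_names


# Hypothetical telemetry columns and values, for illustration only.
names = ["wx", "wy", "wz", "q2"]
rows = [
    [1.0, 2.0, 3.0, 0.5],
    [-1.0, 0.5, 2.0, 0.1],
]
expanded, new_names = expand_features(rows, names, important=["wx", "wy"])
print(new_names)
print(expanded[0])
```

The expanded matrix keeps the original columns first, so a model fitted to it can still be compared variable by variable with the model fitted to the unexpanded data.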
Figures 9 and 10 together show that the angular velocities (ωx),
(ωy), (ωz) and the quaternion (q2) in the left corner of the
loading plot contribute to the left swarm (faulty states) of
data in the score plot.
Nonlinear Gaussian SVMs contour plots with initialization.
Nonlinear Gaussian SVMs contour plot without initialization.