The implementation of generalized matrix learning vector quantization, a prototype-based, supervised learning procedure.
The conventional LVQ is enriched by a linear mapping rule provided by a matrix (GMLVQ). This matrix has the dimension dataDimension x omegaDimension. The omega dimension can be set to 2...dataDimension.
Depending on the chosen omega dimension, each data point and prototype is mapped (i.e. linearly transformed) into an embedded data space. Within this space, distances between data points and prototypes are computed, and this information is used to compose the update of each learning epoch. Setting the omega dimension to values significantly smaller than the data dimension drastically speeds up the learning process. As mapping data points into the embedded space is still computationally expensive, these mappings are 'cached'. By invoking DataSpaceVector.getEmbeddedSpaceVector(OmegaMatrix) one can retrieve the EmbeddedSpaceVector for a data point according to the specified mapping rule (provided by the OmegaMatrix). Because computing the embedding can be quite expensive, results are linked directly to the data points, so they are calculated only once and can be recalled afterwards.
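As an illustration, the caching idea can be sketched as follows; the class and method below are simplified stand-ins for the library's actual types, not its real implementation:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: each data vector lazily computes and stores its
// embedding once per mapping matrix, so repeated lookups cost nothing.
class DataVector {
    private final double[] values;
    // one cached embedded vector per omega matrix (identity-based keys)
    private final Map<double[][], double[]> embeddings = new HashMap<>();

    DataVector(double[] values) {
        this.values = values;
    }

    double[] getEmbeddedSpaceVector(double[][] omega) {
        return embeddings.computeIfAbsent(omega, o -> map(values, o));
    }

    // x (1 x dataDimension) times omega (dataDimension x omegaDimension)
    // yields the embedded vector (1 x omegaDimension)
    private static double[] map(double[] x, double[][] omega) {
        int omegaDimension = omega[0].length;
        double[] result = new double[omegaDimension];
        for (int j = 0; j < omegaDimension; j++) {
            for (int i = 0; i < x.length; i++) {
                result[j] += x[i] * omega[i][j];
            }
        }
        return result;
    }
}
```

Repeated calls with the same matrix instance return the identical cached array instead of recomputing the product.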
Subsequently, by calling EmbeddedSpaceVector.getWinningInformation(List) one can access the WinningInformation linked to each embedded space vector. This information includes the distance to the closest prototype of the same class as the queried data point as well as the distance to the closest prototype of a different class. It is crucial for composing the update of each epoch as well as for the computation of CostFunctions.
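The winning information can be sketched roughly as below; the names are illustrative placeholders, not the library's actual signatures:

```java
// Hypothetical sketch of the "winning information" for one data point:
// the squared Euclidean distances (in the embedded space) to the closest
// prototype of the same class and to the closest prototype of another class.
class Winning {
    double sameClassDistance = Double.MAX_VALUE;
    double otherClassDistance = Double.MAX_VALUE;
}

class WinningInfoExample {
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            sum += (a[i] - b[i]) * (a[i] - b[i]);
        }
        return sum;
    }

    static Winning winningInformation(double[] point, int label,
                                      double[][] prototypes, int[] prototypeLabels) {
        Winning w = new Winning();
        for (int p = 0; p < prototypes.length; p++) {
            double d = squaredDistance(point, prototypes[p]);
            if (prototypeLabels[p] == label) {
                w.sameClassDistance = Math.min(w.sameClassDistance, d);
            } else {
                w.otherClassDistance = Math.min(w.otherClassDistance, d);
            }
        }
        return w;
    }
}
```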
GMLVQ is also capable of generalization, meaning various CostFunctions can be optimized. Most notably, it is possible to evaluate the success of each epoch by consulting the F-measure or precision-recall values.
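For reference, the F-measure (F1) is the harmonic mean of precision and recall; a minimal computation, independent of the library's CostFunction types, looks like this:

```java
// F1 score from confusion-matrix counts: the harmonic mean of
// precision (tp / (tp + fp)) and recall (tp / (tp + fn)).
// An epoch could then be scored e.g. as cost = 1 - F1.
class FMeasureExample {
    static double f1(int truePositives, int falsePositives, int falseNegatives) {
        double precision = truePositives / (double) (truePositives + falsePositives);
        double recall = truePositives / (double) (truePositives + falseNegatives);
        return 2 * precision * recall / (precision + recall);
    }
}
```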
Another key feature is the possibility of tracking which individual features of the input data contribute the most to the training process. This is realized by a lambda matrix (defined as lambda = omega * omega'). This matrix can be visualized: its principal diagonal contains the influence of each feature on the classification, while the other elements describe the correlation between the corresponding features.
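The definition above amounts to multiplying omega by its own transpose, which can be sketched as:

```java
// lambda = omega * omega' (omega times its transpose), a symmetric
// dataDimension x dataDimension matrix. Diagonal entries reflect per-feature
// influence; off-diagonal entries reflect pairwise feature correlations.
class LambdaExample {
    static double[][] lambda(double[][] omega) {
        int dataDimension = omega.length;
        int omegaDimension = omega[0].length;
        double[][] lambda = new double[dataDimension][dataDimension];
        for (int i = 0; i < dataDimension; i++) {
            for (int j = 0; j < dataDimension; j++) {
                for (int k = 0; k < omegaDimension; k++) {
                    lambda[i][j] += omega[i][k] * omega[j][k];
                }
            }
        }
        return lambda;
    }
}
```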
This class takes care of the correct initialization of all parameters, and thus of one GMLVQ run. To track input parameters, an internal builder is employed. Most of the internal tasks are then delegated to the UpdateManager, which directs the learning process.
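The builder pattern mentioned above could look roughly like the following; this is a purely illustrative sketch with made-up parameter names, not the class's actual builder API:

```java
// Hypothetical builder sketch: input parameters are collected step by step,
// defaults apply where nothing is set, and validation happens once in build().
class GmlvqParams {
    final int omegaDimension;
    final int epochs;

    private GmlvqParams(Builder b) {
        this.omegaDimension = b.omegaDimension;
        this.epochs = b.epochs;
    }

    static class Builder {
        private int omegaDimension = 2;  // illustrative default
        private int epochs = 100;        // illustrative default

        Builder omegaDimension(int value) {
            this.omegaDimension = value;
            return this;
        }

        Builder epochs(int value) {
            this.epochs = value;
            return this;
        }

        GmlvqParams build() {
            if (omegaDimension < 2) {
                throw new IllegalArgumentException("omegaDimension must be >= 2");
            }
            return new GmlvqParams(this);
        }
    }
}
```

Callers would chain setters and finish with build(), e.g. `new GmlvqParams.Builder().omegaDimension(3).build()`.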