Clustering

Density-based clustering using the EM algorithm

Each cluster is assumed to follow a probability density with certain parameters (e.g. a multivariate Gaussian). The goal of density-based clustering is to determine the number of such model components (i.e. clusters) in a data set and the parameters of the probability density of each component. The number of clusters is determined using cross-validation. Once the components of the whole data set are determined, the fitted model gives each variable a probability distribution indicating the probability of the variable belonging to each of the clusters.
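The EM procedure described above can be sketched in a few lines. This is a minimal illustration, not the implementation used by this module: it assumes one-dimensional data, a fixed number of components k (no cross-validation), and a simple quantile-based initialisation. The names fit_gmm_1d and posterior, and the demo data, are hypothetical.

```python
import math
import random


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def posterior(x, weights, means, variances):
    """Probability of x belonging to each component (sums to 1)."""
    dens = [w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
    total = sum(dens)
    return [d / total for d in dens]


def fit_gmm_1d(data, k, iterations=50):
    """Fit a k-component 1-D Gaussian mixture with EM (k is fixed here)."""
    n = len(data)
    ordered = sorted(data)
    # Assumption: start the means at evenly spaced quantiles of the data.
    means = [ordered[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    overall_mean = sum(data) / n
    overall_var = sum((x - overall_mean) ** 2 for x in data) / n
    variances = [overall_var] * k
    weights = [1.0 / k] * k

    for _ in range(iterations):
        # E-step: membership probabilities of every point for every component.
        resp = [posterior(x, weights, means, variances) for x in data]
        # M-step: re-estimate weight, mean and variance of each component.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, 1e-6)  # floor avoids a collapsed component
    return weights, means, variances


# Demo data: two well-separated groups drawn around 0 and 10.
rng = random.Random(42)
data = [rng.gauss(0.0, 1.0) for _ in range(100)] + [rng.gauss(10.0, 1.0) for _ in range(100)]

weights, means, variances = fit_gmm_1d(data, k=2)
print("weights:", weights)
print("means:", means)
print("membership of x=0:", posterior(0.0, weights, means, variances))
```

The list returned by posterior is the per-item probability distribution over clusters mentioned above. Choosing k by cross-validation would mean fitting the model for several values of k and comparing the likelihood on held-out data, which this sketch leaves out.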

Method parameters

Data files
Raw data files corresponding to the samples to be included in the projection plot.
Colouring style
The dot representing each sample can be coloured according to the sample's parameter state or according to its file.
Peak measuring approach
It can take two values: height or area. The projections will be calculated using one of these two values.
Peaks
Peaks that will be taken into account to create the projection plot.
Visualization
The results of non-hierarchical clustering algorithms can be visualized using PCA or Sammon's projection.
Type of data
It can take two values: samples or variables. The clustering will be applied to one of these types of data.
Algorithm
Algorithm that will be used to cluster the data.
Link type
This parameter is only enabled when hierarchical clustering has been chosen. The distance between clusters is determined by the chosen linkage.
Distance function
This parameter is only enabled when hierarchical clustering has been chosen. The distance between points is determined by the chosen distance function.
Number of groups
For some clustering algorithms, the number of clusters has to be defined by the user in advance. This parameter is available only when the K-means or Farthest First algorithm is chosen.
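To make the roles of the Link type and Distance function parameters more concrete, the sketch below shows how the two typically interact in hierarchical clustering: the distance function compares two points, and the linkage turns those point-to-point distances into one cluster-to-cluster distance. The specific options shown (Euclidean and Manhattan distance; single, complete and average linkage) are common choices used here as assumptions, not a list of what this module offers, and the function names are hypothetical.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Sum of absolute coordinate differences between two points."""
    return sum(abs(x - y) for x, y in zip(a, b))


def cluster_distance(c1, c2, distance=euclidean, linkage="single"):
    """Distance between two clusters under the chosen linkage."""
    pairwise = [distance(p, q) for p in c1 for q in c2]
    if linkage == "single":      # closest pair of points
        return min(pairwise)
    if linkage == "complete":    # farthest pair of points
        return max(pairwise)
    if linkage == "average":     # mean over all pairs
        return sum(pairwise) / len(pairwise)
    raise ValueError("unknown linkage: " + linkage)


cluster_a = [(0, 0), (1, 0)]
cluster_b = [(4, 0), (4, 3)]

for link in ("single", "complete", "average"):
    print(link, cluster_distance(cluster_a, cluster_b, euclidean, link))
```

At each step, hierarchical clustering merges the two clusters with the smallest such distance, so changing either parameter can change the order of the merges and the resulting tree.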