Clustering
Density Based Clustering using EM algorithm
Each cluster is assumed to follow a probability density with certain parameters (e.g. a multivariate Gaussian). The goal of density-based clustering is to determine the number of such model components (i.e. clusters) in a data set, and the parameters of the probability density of each component. Once the components of the whole data set are determined, the model gives, for each data point (a sample or a variable, depending on the type of data selected), a probability distribution indicating how likely it is that the point belongs to each of the clusters. The number of clusters is determined using cross-validation.
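The idea described above can be sketched in a few lines of Python. This is only a minimal illustration under simplifying assumptions, not the implementation used by the software: it works on one-dimensional data instead of multivariate data, and the function names (`fit_em`, `memberships`, `choose_k`), the quantile-based initialisation and the 5-fold split are all hypothetical choices made for the example.

```python
import math
import random

def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)

def memberships(x, weights, means, variances):
    """Probability that point x belongs to each component (sums to 1)."""
    dens = [w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
    total = sum(dens)
    if total == 0.0:  # point far from every component: fall back to uniform
        return [1.0 / len(dens)] * len(dens)
    return [d / total for d in dens]

def fit_em(data, k, iterations=100, min_var=1e-6):
    """Fit a 1-D Gaussian mixture with k components using EM."""
    ordered = sorted(data)
    n = len(ordered)
    # Assumed initialisation: means at evenly spaced quantiles,
    # a shared overall variance and equal component weights.
    means = [ordered[int((i + 0.5) * n / k)] for i in range(k)]
    centre = sum(data) / n
    overall_var = max(sum((x - centre) ** 2 for x in data) / n, min_var)
    variances = [overall_var] * k
    weights = [1.0 / k] * k
    for _ in range(iterations):
        # E-step: membership probabilities of every point.
        resp = [memberships(x, weights, means, variances) for x in data]
        # M-step: re-estimate weight, mean and variance of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:  # empty component: keep its previous parameters
                continue
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, min_var)
    return weights, means, variances

def log_likelihood(data, weights, means, variances):
    """Log-likelihood of the data under the fitted mixture."""
    total = 0.0
    for x in data:
        p = sum(w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, means, variances))
        total += math.log(max(p, 1e-300))  # guard against log(0)
    return total

def choose_k(data, candidates=(1, 2, 3), folds=5):
    """Pick the candidate k with the highest held-out log-likelihood."""
    best_k, best_score = None, -math.inf
    for k in candidates:
        score = 0.0
        for f in range(folds):
            train = [x for i, x in enumerate(data) if i % folds != f]
            test = [x for i, x in enumerate(data) if i % folds == f]
            score += log_likelihood(test, *fit_em(train, k))
        if score > best_score:
            best_k, best_score = k, score
    return best_k

# Synthetic example data: two groups centred near 0 and near 10.
rng = random.Random(0)
data = [rng.gauss(0, 1) for _ in range(60)] + [rng.gauss(10, 1) for _ in range(60)]

weights, means, variances = fit_em(data, 2)
best_k = choose_k(data)
print("estimated means:", sorted(means))
print("memberships of x = 0:", memberships(0.0, weights, means, variances))
print("number of clusters chosen by cross-validation:", best_k)
```

The list returned by `memberships` plays the role of the per-point probability distribution mentioned above: one probability for each cluster, summing to 1.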
Method parameters
- Data files
- Raw data files corresponding to the samples selected to be in the projection plot.
- Colouring style
- The dot corresponding to each sample can be coloured according to the sample's parameter state or according to its file.
- Peak measuring approach
- It can take two values: height or area. The projections will be calculated using one of these two values.
- Peaks
- Peaks that will be taken into account to create the projection plot.
- Visualization
- The result of non-hierarchical clustering algorithms can be visualized using PCA or Sammon's projection.
- Type of data
- It can take two values: samples or variables. The clustering will be applied to one of these types of data.
- Algorithm
- Algorithm that will be used to cluster the data.
- Link type
- This parameter is only enabled when hierarchical clustering has been chosen. The distance between clusters is determined by the chosen linkage.
- Distance function
- This parameter is only enabled when hierarchical clustering has been chosen. The distance between points is determined by the chosen distance function.
- Number of groups
- For some clustering algorithms the number of clusters has to be defined by the user in advance. This parameter is available only when the K-means or Farthest First algorithm is chosen.
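To illustrate how the link type and the distance function listed above work together in hierarchical clustering, here is a minimal Python sketch. It is only an assumption-laden example, not the software's own code: the linkage names ("single", "complete", "average") and the two distance functions (Euclidean, Manhattan) are common choices picked for illustration and are not taken from the list of options offered by the program.

```python
import math

def euclidean(p, q):
    """Straight-line distance between two points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def manhattan(p, q):
    """Sum of absolute coordinate differences between two points."""
    return sum(abs(a - b) for a, b in zip(p, q))

def cluster_distance(cluster_a, cluster_b, distance=euclidean, linkage="single"):
    """Distance between two clusters under the chosen linkage.

    The distance function measures point-to-point distances; the linkage
    decides how those pairwise distances are combined into one value.
    """
    pairwise = [distance(p, q) for p in cluster_a for q in cluster_b]
    if linkage == "single":    # closest pair of points
        return min(pairwise)
    if linkage == "complete":  # farthest pair of points
        return max(pairwise)
    if linkage == "average":   # mean over all pairs
        return sum(pairwise) / len(pairwise)
    raise ValueError("unknown linkage: " + linkage)

# Two small hypothetical clusters of 2-D points.
a = [(0.0, 0.0), (0.0, 1.0)]
b = [(3.0, 0.0), (4.0, 0.0)]

for linkage in ("single", "complete", "average"):
    print(linkage, cluster_distance(a, b, euclidean, linkage))
```

At each step a hierarchical algorithm merges the two clusters with the smallest such distance, so changing either parameter can change the order in which clusters are joined.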