Clustering
Simple K-Means
The goal of K-means clustering is to determine k clusters such that intra-cluster distances are small and inter-cluster distances are large; in other words, every point is assigned to the cluster whose centre is nearest. The algorithm starts by choosing k centroids at random. It then assigns each data point to the cluster with the nearest centroid and recomputes each centroid as the mean of the data points in its cluster. These two steps are repeated until the cluster means converge.
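The procedure described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation used by the module: the function names `kmeans` and `sq_dist`, the `max_iter` and `seed` parameters, the use of squared Euclidean distance, and the choice of initial centroids from among the data points are all assumptions made for the example.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, max_iter=100, seed=0):
    """Cluster points (a list of equal-length tuples) into k groups.

    Returns (labels, centroids), where labels[i] is the cluster index
    assigned to points[i].
    """
    rng = random.Random(seed)
    # Step 1: pick k distinct data points at random as initial centroids
    # (an assumption; other initialisation schemes exist).
    centroids = [tuple(p) for p in rng.sample(points, k)]
    labels = []
    for _ in range(max_iter):
        # Step 2: assign every point to the cluster with the nearest centroid.
        labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        # Step 3: recompute each centroid as the mean of its member points.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                mean = tuple(sum(x) / len(members) for x in zip(*members))
                new_centroids.append(mean)
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[c])
        # Step 4: stop once the centroids no longer change.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids
```

For example, `kmeans([(0, 0), (1, 1), (2, 2), (10, 10), (11, 11), (12, 12)], k=2)` places the first three points in one cluster and the last three in the other.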
Method parameters
- Data files
- Raw data files corresponding to the samples to be included in the projection plot.
- Colouring style
- The dot representing each sample can be coloured according to the sample's parameter state or according to its file.
- Peak measuring approach
- It can take two values: height or area. The projections will be calculated using one of these two values.
- Peaks
- Peaks that will be taken into account to create the projection plot.
- Visualization
- The result of non-hierarchical clustering algorithms can be visualized using PCA or Sammon's projection.
- Type of data
- It can take two values: samples or variables. The clustering will be applied to one of these types of data.
- Algorithm
- Algorithm that will be used to cluster the data.
- Link type
- This parameter is only enabled when hierarchical clustering has been chosen. The distance between clusters is determined by the chosen linkage.
- Distance function
- This parameter is only enabled when hierarchical clustering has been chosen. The distance between points is determined by the chosen distance function.
- Number of groups
- For some clustering algorithms, the number of clusters has to be defined by the user in advance. This parameter is available only when the K-means or Farthest First algorithm is chosen.
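To illustrate how the link type and the distance function interact in hierarchical clustering, the following Python sketch computes the distance between two clusters from the pairwise distances between their points. It is a hedged example only: the function name `cluster_distance`, the three linkage names (single, complete, average) and the use of Euclidean distance are assumptions chosen because they are common, and the options actually offered by the module may differ.

```python
import math


def cluster_distance(a, b, linkage="average"):
    """Distance between clusters a and b (non-empty lists of points).

    The point-to-point distance function is Euclidean here (an assumption);
    the linkage decides how pairwise distances are combined.
    """
    pairwise = [math.dist(p, q) for p in a for q in b]
    if linkage == "single":
        # Distance between the two closest points of the clusters.
        return min(pairwise)
    if linkage == "complete":
        # Distance between the two farthest points of the clusters.
        return max(pairwise)
    if linkage == "average":
        # Mean of all pairwise distances.
        return sum(pairwise) / len(pairwise)
    raise ValueError(f"unknown linkage: {linkage}")
```

With the clusters `[(0, 0), (1, 0)]` and `[(4, 0), (6, 0)]`, the pairwise distances are 4, 6, 3 and 5, so the same pair of clusters is 3 apart under single linkage, 6 under complete linkage and 4.5 under average linkage.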