Published September 1, 2013 | Version 17113
Journal article Open

Clustering of Variables Based On a Probabilistic Approach Defined on the Hypersphere

Description

We consider p standardized variables measured on n individuals; each variable is represented by a point on the surface of the unit hypersphere S^{n-1}. Given a fixed sample of n individuals, we assume that the set of observable variables comes from a mixture of bipolar Watson distributions defined on the hypersphere. The EM and Dynamic Clusters algorithms are used to identify this mixture. We obtain parameter estimates for each Watson component and, from them, a partition of the set of variables into homogeneous groups. Additionally, we present a factor analysis model in which the unobservable factors are the maximum likelihood estimators of the Watson directional parameters, which coincide with the first principal component of the data matrix associated with each previously identified group. This alternative model yields directly interpretable solutions (simple structure) and avoids factor rotation.
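The grouping step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it is a hard-assignment (Dynamic Clusters style) variant that assumes every Watson component has the same positive concentration, so that a variable x is assigned to the group whose axis u maximizes (u'x)^2 and each axis is re-estimated as the leading eigenvector of its group's scatter matrix, that is, the group's first principal direction. The function names (`to_unit_sphere`, `init_directions`, `leading_direction`, `cluster_variables`) and the farthest-first initialization are assumptions introduced for this illustration.

```python
import numpy as np


def to_unit_sphere(X):
    """Center each column of X (n individuals x p variables) and scale it to unit norm."""
    Xc = X - X.mean(axis=0)
    return Xc / np.linalg.norm(Xc, axis=0)


def init_directions(V, k):
    """Pick k seed variables: start with the first, then repeatedly add the
    variable whose best squared correlation with the current seeds is smallest
    (an illustrative choice, not taken from the paper)."""
    seeds = [0]
    while len(seeds) < k:
        closest = ((V[:, seeds].T @ V) ** 2).max(axis=0)
        seeds.append(int(np.argmin(closest)))
    return V[:, seeds].copy()


def leading_direction(members):
    """Unit eigenvector of members @ members.T for the largest eigenvalue,
    obtained as the first left singular vector."""
    U, _, _ = np.linalg.svd(members, full_matrices=False)
    return U[:, 0]


def cluster_variables(X, k, max_iter=100):
    """Partition the p columns of X into k groups around k axes."""
    V = to_unit_sphere(X)
    dirs = init_directions(V, k)
    labels = np.full(V.shape[1], -1)
    for _ in range(max_iter):
        # Assignment step: the sign of u'x is irrelevant for an axial distribution.
        new_labels = np.argmax((dirs.T @ V) ** 2, axis=0)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
        # Update step: re-estimate each axis from the variables in its group.
        for j in range(k):
            members = V[:, labels == j]
            if members.shape[1] > 0:
                dirs[:, j] = leading_direction(members)
    return labels, dirs
```

Calling `cluster_variables(X, k)` on an n x p data matrix returns one group label per variable and an n x k matrix whose columns are the estimated axes. Because the criterion uses squared projections, a variable and its negative fall in the same group. A full EM treatment would instead use soft membership probabilities and estimate a separate concentration parameter for each component.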
