Published February 4, 2022 | Version v1
Conference paper Open

A Co-occurrence Based Approach for Mining Overlapped Co-clusters in Binary Data

Description

Co-clustering is a type of clustering that simultaneously clusters the objects and the attributes of a data matrix. Although general clustering techniques find non-overlapping co-clusters, allowing overlaps between co-clusters can reveal embedded patterns in the data that disjoint clusters cannot capture. The overlapping co-clustering approaches proposed in the literature focus on finding global overlapped co-clusters and may overlook interesting local patterns that are not necessarily identified as global co-clusters. Discovering such local co-clusters increases the granularity of the analysis, so more specific patterns can be captured. This is the objective of the present paper, which proposes a new method, Overlapped Co-Clustering (OCoClus), for finding overlapped co-clusters in binary data, including both global and local patterns. It is a non-exhaustive method based on the co-occurrence of attributes and objects in the data. Another novelty of the method is that it is driven by an objective cost function that automatically determines the number of co-clusters. We evaluate the proposed approach on publicly available real and synthetic datasets and compare the results with a number of baselines. Our approach outperforms the baseline methods on synthetic data and demonstrates its efficacy on real data.
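
To make the terminology concrete, the following is a minimal sketch of what attribute co-occurrence and overlapping co-clusters look like in a binary matrix. It is not the OCoClus algorithm and does not reproduce its cost function; the toy matrix, the hand-picked co-clusters, and the helper names (`objects_with`, `co_occurrence`, `is_co_cluster`) are all assumptions made for illustration.

```python
# Toy binary matrix: rows are objects, columns are attributes.
# The values are made up for illustration only.
DATA = [
    [1, 1, 1, 0],  # object 0
    [1, 1, 1, 0],  # object 1
    [0, 1, 1, 1],  # object 2
    [0, 1, 1, 1],  # object 3
    [0, 0, 0, 1],  # object 4
]


def objects_with(data, attrs):
    """Return the row indices whose entries are 1 for every attribute in attrs."""
    return {i for i, row in enumerate(data) if all(row[j] == 1 for j in attrs)}


def co_occurrence(data, a, b):
    """Count the objects in which attributes a and b are both 1."""
    return len(objects_with(data, {a, b}))


def is_co_cluster(data, objs, attrs):
    """True if the (objs x attrs) submatrix contains only ones."""
    return all(data[i][j] == 1 for i in objs for j in attrs)


# Two hand-picked co-clusters, each an (object set, attribute set) pair.
# They overlap because both contain attributes 1 and 2.
cluster_a = ({0, 1}, {0, 1, 2})
cluster_b = ({2, 3}, {1, 2, 3})
shared_attrs = cluster_a[1] & cluster_b[1]
```

In this toy example, attributes 1 and 2 co-occur in four objects, so a disjoint partition of the attributes would have to assign them to only one of the two groups; allowing overlap keeps both patterns intact.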

Files

BRACIS21_CRC.pdf.pdf (586.6 kB)
md5:0e826a83515ed2ce8ffdffd80ae05d0f

Additional details

Funding

European Commission
MASTER – Multiple ASpects TrajEctoRy management and analysis 777695