A multi-view relational fuzzy c-medoid vectors clustering algorithm

Francisco de A.T. de Carvalho a,*, Filipe M. de Melo a, Yves Lechevallier b

a Centro de Informática, Universidade Federal de Pernambuco, Av. Jornalista Anibal Fernandes, s/n - Cidade Universitária, CEP 50740-560, Recife (PE), Brazil
b INRIA Paris-Rocquencourt, Domaine de Voluceau, BP 105, 78153 Le Chesnay, France

Article info

Article history:

Received 15 January 2014

Received in revised form 7 August 2014

Accepted 1 November 2014

Available online 14 April 2015

Keywords:

Fuzzy c-medoid vectors

Multi-view clustering

Collaborative clustering

Relational clustering

Relevance weights

Abstract

This paper presents a multi-view relational fuzzy c-medoid vectors clustering algorithm that is able to partition objects taking into account simultaneously several dissimilarity matrices. The aim is to obtain a collaborative role for the different dissimilarity matrices so as to reach a final consensus fuzzy partition. These matrices could have been obtained using different sets of variables and dissimilarity functions. The algorithm is designed to give a fuzzy partition and a vector of medoids for each fuzzy cluster, as well as to learn a relevance weight for each dissimilarity matrix, by optimizing an objective function. These relevance weights change at each iteration of the algorithm and differ from one cluster to another. Moreover, various tools for interpreting the fuzzy partition and fuzzy clusters provided by this algorithm are also presented. Several examples illustrate the performance and usefulness of the proposed algorithm.

© 2015 Elsevier B.V. All rights reserved.

1. Introduction

Clustering methods seek to organize a set of items into clusters such that items within a given cluster have a high degree of similarity, whereas items belonging to different clusters have a high degree of dissimilarity. These methods have been widely applied in fields such as taxonomy, image processing, information retrieval, and data mining. Clustering techniques may be divided into hierarchical and partitioning methods [1,2].

Hierarchical methods yield a complete hierarchy, i.e., a nested sequence of partitions of the input data, and can be either agglomerative or divisive. Agglomerative methods start with the trivial partition in which each item forms its own cluster and successively merge clusters, ending with the trivial partition in which all items lie in a single cluster. Divisive methods start with all items in a single cluster and perform a splitting procedure until a stopping criterion is met (usually upon obtaining a partition of singleton clusters).
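As a hedged sketch of the agglomerative scheme just described (the data, function name, and single-linkage choice are illustrative, not from this paper), merging the two closest clusters of a dissimilarity matrix until a target number of clusters remains can look like:

```python
# Naive single-linkage agglomerative clustering on a precomputed
# dissimilarity matrix: start with one item per cluster, then repeatedly
# merge the two closest clusters until k clusters remain.

def agglomerate(diss, k):
    clusters = [[i] for i in range(len(diss))]   # trivial starting partition
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # single linkage: distance between the closest members
                d = min(diss[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]  # merge cluster b into a
        del clusters[b]
    return clusters

# toy dissimilarities with two tight groups: {0, 1} and {2, 3}
diss = [[0, 1, 9, 9],
        [1, 0, 9, 9],
        [9, 9, 0, 1],
        [9, 9, 1, 0]]
print(agglomerate(diss, 2))  # → [[0, 1], [2, 3]]
```

Stopping at k = 1 instead would reproduce the full nested sequence of merges down to the single all-items cluster.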

Partitioning methods seek a single partition of the input data into a fixed number of clusters. These methods usually look for a partition that optimizes (typically locally) an objective function. To improve cluster quality, the algorithm is run multiple times with different starting points, and the best configuration obtained over all runs is used as the output clustering.
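The multi-start strategy above can be sketched with a simple relational k-medoids objective (all names, the greedy swap heuristic, and the toy data are illustrative assumptions, not this paper's algorithm):

```python
import random

def cost(diss, medoids):
    # objective: total distance of each object to its nearest medoid
    return sum(min(diss[k][m] for m in medoids) for k in range(len(diss)))

def kmedoids(diss, c, rng):
    # one run: random starting medoids, then greedy swaps to a local optimum
    n = len(diss)
    medoids = rng.sample(range(n), c)
    improved = True
    while improved:
        improved = False
        for i in range(c):
            for cand in range(n):
                if cand in medoids:
                    continue
                trial = medoids[:i] + [cand] + medoids[i + 1:]
                if cost(diss, trial) < cost(diss, medoids):
                    medoids, improved = trial, True
    return medoids

def best_of_restarts(diss, c, n_starts=10, seed=0):
    # run from several starting points; keep the best local optimum found
    rng = random.Random(seed)
    runs = [kmedoids(diss, c, rng) for _ in range(n_starts)]
    return min(runs, key=lambda m: cost(diss, m))

# two tight groups: {0, 1} and {2, 3}
diss = [[0, 1, 8, 8],
        [1, 0, 8, 8],
        [8, 8, 0, 1],
        [8, 8, 1, 0]]
medoids = best_of_restarts(diss, 2)
print(cost(diss, medoids))  # → 2
```

Because each run only reaches a local optimum of the objective, keeping the best configuration over all restarts is what makes the output clustering reliable.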

Partitioning methods can be divided into hard clustering and fuzzy clustering. Hard clustering provides a hard partition in which each object of the dataset is assigned to one and only one cluster.

Fuzzy clustering [3] generates a fuzzy partition that provides a degree of membership of each object in each cluster. This flexibility makes it possible to express that objects belong to more than one cluster at the same time [4].
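To make the notion of membership degrees concrete, the sketch below uses the classical fuzzy c-means membership formula of Bezdek (assumed here for illustration; it is not necessarily the update used in this paper's relational setting, and distances are assumed nonzero):

```python
def fuzzy_memberships(dists, m=2.0):
    # dists[k][i]: distance of object k to prototype i (assumed nonzero)
    # u_ik = 1 / sum_j (d_ik / d_jk)^(2/(m-1)); rows sum to 1
    us = []
    for row in dists:
        u = [1.0 / sum((d_i / d_j) ** (2.0 / (m - 1)) for d_j in row)
             for d_i in row]
        us.append(u)
    return us

# an object at distance 1 from prototype 0 and distance 3 from prototype 1
u = fuzzy_memberships([[1.0, 3.0]])
print(u[0])  # memberships sum to 1, larger for the closer prototype
```

Here the object gets membership 0.9 in the nearby cluster and 0.1 in the distant one, rather than a hard all-or-nothing assignment.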

Two usual representations of the objects upon which clustering can be based are feature data and relational data. When each object is described by a vector of quantitative or qualitative values, the set of vectors describing the objects is called feature data.

When each pair of objects is represented by a relationship, we have relational data. The most common case of relational data is dissimilarity data, i.e., a matrix R = [r_kl], where r_kl is the pairwise dissimilarity (often a distance) between objects k and l. Another setting is multi-view data, where the objects are described by several (feature or relational) data matrices.
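The two representations are easy to connect: a feature-data table can always be turned into relational data by choosing a dissimilarity function. A minimal sketch, assuming Euclidean distance and made-up feature vectors:

```python
import math

def dissimilarity_matrix(X):
    # R = [r_kl]: pairwise Euclidean distances between feature vectors;
    # the result is symmetric with zeros on the diagonal
    n = len(X)
    return [[math.dist(X[k], X[l]) for l in range(n)] for k in range(n)]

X = [(0.0, 0.0), (3.0, 4.0), (0.0, 1.0)]
R = dissimilarity_matrix(X)
print(R[0][1])  # → 5.0
```

Different choices of variables or of the dissimilarity function yield different matrices R for the same objects, which is exactly the multi-view situation the paper addresses.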

Multi-view data can be found in many domains such as bioinformatics, and marketing [5].

Several clustering models and algorithms have been proposed for feature data [1,2]. However, few clustering models exist for relational data. Ref. [6] observed that several applications, such as content-based image retrieval, would benefit strongly from clustering methods for relational data.

* Corresponding author. Tel.: +55 81 21268430; fax: +55 81 21268438.
E-mail addresses: fatc@cin.ufpe.br (F.d.A.T. de Carvalho), fmm@cin.ufpe.br (F.M. de Melo), yves.lechevallier@inria.fr (Y. Lechevallier).
Neurocomputing 163 (2015) 115–123, http://dx.doi.org/10.1016/j.neucom.2014.11.083

This paper presents a multi-view relational fuzzy c-medoid vectors clustering algorithm that gives a fuzzy partition of the objects, taking into account simultaneously their relational descriptions given by multiple dissimilarity matrices. The main idea is to obtain a collaborative role for the different dissimilarity matrices [7] so as to reach a final consensus partition [8]. These dissimilarity matrices can be computed using different sets of variables and a fixed dissimilarity function; in this case, the final partition gives a consensus between the different views (sets of variables) describing the objects. They can also be computed from a fixed set of variables and different dissimilarity functions (in this case, the final partition gives the consensus between the different dissimilarity functions), or even using different sets of variables and different dissimilarity functions.

Moreover, the different dissimilarity matrices are not equally important in the definition of the fuzzy clusters in the final fuzzy partition [6]. Thus, in order to obtain a meaningful fuzzy partition from all dissimilarity matrices, it is necessary to learn a cluster-dependent relevance weight for each dissimilarity matrix.
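One common closed form for such cluster-dependent weights, used in related relational clustering work, constrains the weights of each cluster to multiply to one and sets each weight to the geometric mean of the cluster's per-matrix dispersions divided by that matrix's own dispersion: compact views get large weights, noisy views small ones. The sketch below illustrates that form with made-up dispersion values; it is not necessarily the exact update rule derived in this paper.

```python
def relevance_weights(deltas):
    # deltas[i][j]: within-cluster dispersion of cluster i under matrix j
    # (assumed positive); the returned weights satisfy, for each cluster i,
    # prod_j lambda_ij = 1
    p = len(deltas[0])
    weights = []
    for row in deltas:
        geo = 1.0
        for d in row:
            geo *= d
        geo = geo ** (1.0 / p)            # geometric mean of the dispersions
        weights.append([geo / d for d in row])
    return weights

# cluster 0 is compact under matrix 0 (dispersion 1.0) and spread out
# under matrix 1 (dispersion 4.0) -> matrix 0 gets the larger weight
lam = relevance_weights([[1.0, 4.0]])
print(lam[0])  # → [2.0, 0.5]
```

Because the dispersions change whenever the memberships and medoid vectors change, these weights are recomputed at each iteration and differ from one cluster to another, as stated in the abstract.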