compute_medoid_indices

PLSCAN.compute_medoid_indices(labels=None)

Return the index of the medoid point for each cluster.

For each cluster, the medoid is the member with the smallest probability-weighted sum of distances to the members of that cluster, that is, the point \(i^*\) in cluster \(c\) defined by:

\[i^* = \operatorname{arg\,min}_{i \in c} \sum_{j \in c} p_j \cdot d(x_i, x_j)\]

where \(p_j\) is the cluster-membership probability of point \(j\) and \(d\) is the fitted distance metric.
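The objective above can be illustrated with a brute-force NumPy sketch. This is not the library's implementation: the helper name `weighted_medoid_indices` is hypothetical, \(d\) is assumed to be the Euclidean distance, and negative labels are assumed to mark noise points that belong to no cluster.

```python
import numpy as np


def weighted_medoid_indices(X, labels, probabilities):
    """Brute-force illustration of the weighted-medoid objective.

    Hypothetical helper, not part of PLSCAN. Assumes Euclidean d and
    that negative labels denote noise.
    """
    cluster_ids = np.unique(labels[labels >= 0])
    medoids = np.empty(cluster_ids.shape[0], dtype=np.intp)
    for k, c in enumerate(cluster_ids):
        members = np.flatnonzero(labels == c)
        pts = X[members]
        # Pairwise distances d(x_i, x_j) between all members of cluster c.
        dists = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
        # cost[i] = sum_j p_j * d(x_i, x_j)
        cost = dists @ probabilities[members]
        # Map the within-cluster argmin back to an index into X.
        medoids[k] = members[np.argmin(cost)]
    return medoids


X = np.array(
    [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0],
     [10.0, 0.0], [10.0, 1.0], [10.0, 2.0],
     [50.0, 50.0]]
)
labels = np.array([0, 0, 0, 1, 1, 1, -1])
probabilities = np.array([1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0])

medoids = weighted_medoid_indices(X, labels, probabilities)
print(medoids)  # [1 4]
```

With equal probabilities, the middle point of each three-point cluster has the smallest summed distance, so it is selected as the medoid. Lowering \(p_j\) for a point reduces its pull on the choice.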

For feature-vector inputs, exact pairwise distances are used. For precomputed sparse distance inputs, the average mutual reachability distances are used to compensate for variations in the sparsity of the input graph.

Returns one index per cluster. Each index refers to a position in the original input, so any attribute of the medoid point can be looked up directly.
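Because the result indexes the original input, medoid attributes can be retrieved with plain NumPy integer indexing. The sketch below uses stand-in arrays: `X`, `names`, and the value of `medoid_indices` are hypothetical and only show the indexing pattern, not actual output of the method.

```python
import numpy as np

# Stand-in data. In practice, X would be the array passed to fit and
# medoid_indices the array returned by compute_medoid_indices().
X = np.array(
    [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0],
     [10.0, 0.0], [10.0, 1.0], [10.0, 2.0]]
)
names = np.array(["a", "b", "c", "d", "e", "f"])  # any per-sample attribute
medoid_indices = np.array([1, 4])  # hypothetical return value, one per cluster

# Integer-array indexing selects one row (or value) per cluster.
medoid_points = X[medoid_indices]
medoid_names = names[medoid_indices]
```

The same pattern applies to any array-like aligned with the input samples, for example a column of identifiers or metadata.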

Only available for feature-vector or (sparse) precomputed distance inputs. Raises ValueError for precomputed MST inputs and NotFittedError before fitting.

Parameters:

labels (ndarray[tuple[int], dtype[int_]] | None, default: None) – An optional integer array of shape (n_samples,) with cluster labels. When None (default), the fitted labels_ are used.

Return type:

ndarray[tuple[int], dtype[int_]]

Returns:

medoid_indices – Integer array of shape (n_clusters,) with the index of the medoid point for each cluster.