min_cluster_size_cut

PLSCAN.min_cluster_size_cut(cut_size)

Return the clustering produced by a specific minimum cluster size.

Selects all leaf-clusters that are alive at cut_size in the left-open interval (birth, death], i.e. clusters whose birth size is strictly less than cut_size and whose death size is at least cut_size. This is the same selection rule used internally by fit() for the automatically chosen minimum cluster size.
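The interval rule can be illustrated with a minimal sketch. It is not the library's implementation: the helper `alive_at` and the `birth` and `death` arrays are hypothetical and only demonstrate the comparison `birth < cut_size <= death`.

```python
import numpy as np


def alive_at(birth, death, cut_size):
    """Return a boolean mask of clusters with birth < cut_size <= death.

    Hypothetical helper: illustrates the left-open interval (birth, death]
    and does not call into PLSCAN.
    """
    birth = np.asarray(birth, dtype=float)
    death = np.asarray(death, dtype=float)
    return (birth < cut_size) & (cut_size <= death)


# Made-up birth and death sizes for four leaf-clusters.
birth = [2.0, 5.0, 5.0, 12.0]
death = [5.0, 12.0, 8.0, 30.0]

# A cluster born exactly at cut_size is excluded; one dying exactly at it is kept.
mask_5 = alive_at(birth, death, 5.0)
mask_8 = alive_at(birth, death, 8.0)
```

With these made-up values, a cut at 5.0 keeps only the first cluster (it dies at 5.0, which is inside the interval), while the clusters born at 5.0 are not yet alive.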

Use persistence_trace_ to identify candidate cut sizes, or use cluster_layers() to obtain clusterings for all persistence peaks at once.

Parameters:

cut_size (float) – Minimum cluster size threshold. Must be at least 2.0.

Return type:

Labelling

Returns:

  • labels – int64 array of shape (n_samples,). Cluster indices are zero-based; noise points are -1.

  • probabilities – float32 array of shape (n_samples,) with cluster membership probabilities in [0, 1].
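As a rough sketch of how the two returned arrays might be summarised, the example below uses synthetic stand-ins for `labels` and `probabilities` rather than real PLSCAN output. The 0.5 probability threshold and the variable names are arbitrary choices made for illustration.

```python
import numpy as np

# Synthetic stand-ins for the returned arrays (not real PLSCAN output).
labels = np.array([0, 0, 1, -1, 1, 0, -1], dtype=np.int64)
probabilities = np.array([1.0, 0.8, 0.9, 0.0, 0.6, 0.4, 0.0], dtype=np.float32)

# Noise points carry the label -1.
is_noise = labels == -1
noise_fraction = float(is_noise.mean())

# Cluster indices are zero-based, so the count is the largest label plus one.
n_clusters = int(labels[~is_noise].max()) + 1 if (~is_noise).any() else 0

# Number of points assigned to each cluster.
sizes = np.bincount(labels[~is_noise], minlength=n_clusters)

# Points that are clustered with membership probability of at least 0.5
# (an arbitrary threshold chosen for this example).
core = (~is_noise) & (probabilities >= 0.5)
```

Filtering on `probabilities` in this way is one option for keeping only confidently assigned points; whether a threshold is appropriate depends on the data.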