compute_mutual_spanning_tree

fast_plscan.compute_mutual_spanning_tree(data, *, min_samples=5, space_tree='kd_tree', metric='euclidean', metric_kws=None)

Computes a mutual reachability spanning tree from data features using a k-d tree or ball tree.

Parameters:
  • data (ndarray[tuple[int, int], dtype[single]]) – High-dimensional data features. Values must be finite and not missing.

  • space_tree (str, default: 'kd_tree') – The type of spatial tree to use. Valid options are 'kd_tree' and 'ball_tree'. See metric for an overview of supported metrics on each tree type.

  • min_samples (int, default: 5) – The number of neighbors used to compute core distances. The core distance of a point is the distance to its min_samples-th nearest neighbor.

  • metric (str, default: 'euclidean') – The distance metric to use. See VALID_KDTREE_METRICS and VALID_BALLTREE_METRICS for lists of valid metrics. See the scikit-learn documentation for metric definitions.

  • metric_kws (dict[str, Any] | None, default: None) – Additional keyword arguments for the distance metric.
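
To make the min_samples parameter concrete, the following brute-force sketch computes core distances in pure Python. It is an illustration only, not the library's implementation: the helper name core_distances is hypothetical, and it assumes that a point does not count as its own neighbor, a convention that may differ from the one fast_plscan uses.

```python
import math


def core_distances(points, min_samples):
    """Brute-force core distances (hypothetical helper, not part of fast_plscan).

    Assumption: a point is not counted as its own neighbor.
    """
    result = []
    for i, p in enumerate(points):
        # Distances from p to every other point, sorted ascending.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        # The min_samples-th nearest neighbor sits at index min_samples - 1.
        result.append(dists[min_samples - 1])
    return result


# Four points on a line, chosen so the distances are easy to check by hand.
points = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (7.0, 0.0)]
core = core_distances(points, min_samples=2)
```

Larger min_samples values give larger core distances, which is why the parameter acts as a smoothing control in density-based clustering.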

Return type:

tuple[KDTree32 | BallTree32, SpanningTree, ndarray[tuple[int, int], dtype[intc]], ndarray[tuple[int], dtype[single]]]

Returns:

  • space_tree – The fitted k-d tree or ball tree object.

  • spanning_tree – A mutual reachability spanning tree of the input data.

  • indices – A 2D array with the indices of each point's k nearest neighbors.

  • core_distances – A 1D array with core distances.
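
As a rough illustration of what the returned spanning_tree represents, the sketch below builds a minimum spanning tree over mutual reachability distances with a brute-force Prim's algorithm. It assumes the usual HDBSCAN-style definitions: the mutual reachability distance between a and b is max(core(a), core(b), d(a, b)), and the spanning tree is the minimum spanning tree under that distance. This is not how fast_plscan computes its result, which relies on the fitted space tree. The function name, the (parent, child, distance) edge layout, and the convention that a point is not its own neighbor are all assumptions made for this example.

```python
import math


def mutual_reachability_mst(points, min_samples):
    """Brute-force MST over mutual reachability distances (illustrative only)."""
    n = len(points)
    # Core distance: distance to the min_samples-th nearest neighbor.
    # Index 0 of each sorted list is the point itself (distance 0), so index
    # min_samples skips it. Assumes a point is not its own neighbor.
    core = [sorted(math.dist(p, q) for q in points)[min_samples] for p in points]

    def mreach(a, b):
        # Mutual reachability distance between points a and b.
        return max(core[a], core[b], math.dist(points[a], points[b]))

    # Prim's algorithm: grow the tree from point 0, always adding the
    # cheapest edge that connects a point not yet in the tree.
    best = {j: (mreach(0, j), 0) for j in range(1, n)}  # child -> (distance, parent)
    edges = []
    while best:
        child = min(best, key=best.get)
        dist, parent = best.pop(child)
        edges.append((parent, child, dist))
        # Relax the remaining points against the newly added one.
        for j in best:
            d = mreach(child, j)
            if d < best[j][0]:
                best[j] = (d, child)
    return edges


points = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (7.0, 0.0)]
edges = mutual_reachability_mst(points, min_samples=2)
```

The brute-force version costs quadratic time in the number of points; a spatial tree such as the one selected by space_tree exists to avoid that cost on larger datasets.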