ClusterNumber Module

Types

Type Description

SilhouetteResult

Functions and values

Function or value Description

calcAIC bootstraps iClustering maxK

Full Usage: calcAIC bootstraps iClustering maxK

Parameters:
    bootstraps : int -
    iClustering : int -> KClusteringResult<float[]> -
    maxK : int -

Returns: (int * float)[]

Akaike Information Criterion (AIC)

Example
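
The page gives no formula for this function, so the following is only a minimal sketch of one common way to compute an AIC value for a single clustering result. It assumes the k-means form AIC(k) = RSS(k) + 2 * d * k, where RSS is the sum of squared distances of all points to their cluster centroid and d is the dimensionality of the data. The names `centroid`, `rss`, and `aicSketch` are hypothetical and not part of the module; the formula used by calcAIC itself may differ.

```fsharp
// Sketch only: assumes AIC(k) = RSS(k) + 2 * d * k (an assumption, not taken from this page).
// Every cluster is assumed to be non-empty.

/// Component-wise mean of all points in one cluster.
let centroid (cluster: float[][]) =
    let d = cluster.[0].Length
    Array.init d (fun j -> cluster |> Array.averageBy (fun p -> p.[j]))

/// Residual sum of squares: squared Euclidean distance of each point to its centroid.
let rss (clusters: float[][][]) =
    clusters
    |> Array.sumBy (fun cluster ->
        let c = centroid cluster
        cluster
        |> Array.sumBy (fun p ->
            Array.fold2 (fun acc x y -> acc + (x - y) * (x - y)) 0.0 p c))

/// AIC of one clustering result with k = number of clusters.
let aicSketch (clusters: float[][][]) =
    let k = clusters.Length
    let d = clusters.[0].[0].Length
    rss clusters + 2.0 * float d * float k

// Two clusters of two 2-D points each
let toyClusters =
    [| [| [| 0.0; 0.0 |]; [| 0.0; 2.0 |] |]
       [| [| 10.0; 0.0 |]; [| 10.0; 2.0 |] |] |]
let aic = aicSketch toyClusters
```

Under this assumption, the k with the lowest AIC balances a small residual error against the penalty for additional clusters.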

calcNMI correctLabels clusteredLabels

Full Usage: calcNMI correctLabels clusteredLabels

Parameters:
    correctLabels : int[] - True data labels represented by integers
    clusteredLabels : int[] - Cluster indices represented by integers

Returns: float - An NMI between 0 and 1, with 1 being a perfect match.

Calculates the Normalized Mutual Information (NMI) as a measure of clustering quality.

correctLabels and clusteredLabels must have the same length.

Example


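    // Original labels, shown for reference only; calcNMI expects integer labels
    let trueLabels = [|"blue";"blue";"yellow";"red";"yellow"|]
    // The same labels encoded as integers (blue = 1, red = 2, yellow = 3)
    let trueLabelsAsInt = [|1; 1; 3; 2; 3|]
    // Cluster indices assigned by the clustering
    let clusteredLabels = [|6; 6; 5; 5; 5|]
    let nmi = calcNMI trueLabelsAsInt clusteredLabels
    // results in 0.77897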
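
To make the value in the example above reproducible by hand, here is a minimal self-contained sketch. It assumes the normalization NMI = 2 * I(U;V) / (H(U) + H(V)) with natural logarithms, which matches the documented result for this input; whether calcNMI uses exactly this variant is an assumption. The names `entropy`, `mutualInformation`, and `nmiSketch` are hypothetical.

```fsharp
// Sketch only: NMI = 2 * I(U;V) / (H(U) + H(V)), natural logarithm (assumed variant).

/// Shannon entropy of a label array.
let entropy (labels: int[]) =
    let n = float labels.Length
    labels
    |> Array.countBy id
    |> Array.sumBy (fun (_, count) ->
        let p = float count / n
        -p * log p)

/// Mutual information between two label arrays of equal length.
let mutualInformation (u: int[]) (v: int[]) =
    let n = float u.Length
    let countU = u |> Array.countBy id |> dict
    let countV = v |> Array.countBy id |> dict
    Array.zip u v
    |> Array.countBy id
    |> Array.sumBy (fun ((a, b), count) ->
        let pJoint = float count / n
        let pU = float countU.[a] / n
        let pV = float countV.[b] / n
        pJoint * log (pJoint / (pU * pV)))

/// Normalized mutual information; undefined (NaN) if both arrays hold a single label.
let nmiSketch (correct: int[]) (clustered: int[]) =
    if correct.Length <> clustered.Length then
        invalidArg "clustered" "Both label arrays must have the same length."
    2.0 * mutualInformation correct clustered / (entropy correct + entropy clustered)

let nmiValue = nmiSketch [|1; 1; 3; 2; 3|] [|6; 6; 5; 5; 5|]
```

Because only the grouping matters, the concrete integers used as labels (1, 2, 3 versus 5, 6) do not affect the result.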

kRuleOfThumb observations

Full Usage: kRuleOfThumb observations

Parameters:
    observations : seq<'a> -

Returns: float

Simple estimator for the number of clusters (k). It can be used as the upper bound for other methods.

Example
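
The page does not state which estimator is used. The sketch below assumes the widely used rule of thumb k ≈ sqrt(n / 2), where n is the number of observations; `kRuleOfThumbSketch` is a hypothetical name and the library function may be defined differently.

```fsharp
// Sketch only: assumes the rule of thumb k ≈ sqrt(n / 2) (an assumption, not taken from this page).
let kRuleOfThumbSketch (observations: seq<'a>) : float =
    let n = observations |> Seq.length |> float
    sqrt (n / 2.0)

// 50 observations: sqrt (50 / 2) = 5.0
let estimate = kRuleOfThumbSketch [ 1 .. 50 ]

// The result is a float, so round up before using it as maxK for other methods
let upperBound = int (ceil estimate)
```

Since the return type is float, the caller decides how to round; rounding up is the conservative choice for an upper bound.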

silhouetteIndex clusteredData

Full Usage: silhouetteIndex clusteredData

Parameters:
    clusteredData : float[][][] -

Returns: float

Calculates the silhouette score for a clustered data set where the coordinates of each data point are given as float[].
The index ranges from -1 (bad clustering result) to 1 (perfect clustering result).

Example
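
As an illustration of the input layout (clusters, then points, then coordinates) and of how a silhouette score is typically computed, here is a minimal self-contained sketch. It assumes Euclidean distance, the standard definition s = (b - a) / max(a, b) averaged over all points, and at least two non-empty clusters. `euclidean` and `silhouetteSketch` are hypothetical names, not the module's implementation.

```fsharp
// Sketch only: Euclidean distance and the standard silhouette definition are assumed.
let euclidean (a: float[]) (b: float[]) =
    Array.fold2 (fun acc x y -> acc + (x - y) * (x - y)) 0.0 a b |> sqrt

/// Mean silhouette coefficient over all points.
/// clusters.[c].[p] is the coordinate vector of point p in cluster c.
let silhouetteSketch (clusters: float[][][]) =
    clusters
    |> Array.mapi (fun ci cluster ->
        cluster
        |> Array.mapi (fun pi point ->
            if cluster.Length < 2 then
                0.0 // convention: a point in a singleton cluster scores 0
            else
                // a: mean distance to the other members of the same cluster
                let a =
                    cluster
                    |> Array.indexed
                    |> Array.filter (fun (i, _) -> i <> pi)
                    |> Array.averageBy (fun (_, other) -> euclidean point other)
                // b: smallest mean distance to the members of any other cluster
                let b =
                    clusters
                    |> Array.indexed
                    |> Array.filter (fun (i, c) -> i <> ci && c.Length > 0)
                    |> Array.map (fun (_, c) -> c |> Array.averageBy (euclidean point))
                    |> Array.min
                (b - a) / max a b))
    |> Array.concat
    |> Array.average

// Two compact, well-separated clusters of 2-D points: the score is close to 1
let clusteredData =
    [| [| [| 0.0; 0.0 |]; [| 0.0; 1.0 |] |]
       [| [| 10.0; 0.0 |]; [| 10.0; 1.0 |] |] |]
let score = silhouetteSketch clusteredData
```

Assigning the same four points to clusters that mix the two groups would pull the score towards or below 0.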

silhouetteIndexKMeans bootstraps iClustering data maxK

Full Usage: silhouetteIndexKMeans bootstraps iClustering data maxK

Parameters:
    bootstraps : int -
    iClustering : int -> KClusteringResult<float[]> -
    data : float[][] -
    maxK : int -

Returns: SilhouetteResult[]

The silhouette index can be used to determine the optimal cluster number in k-means clustering.
bootstraps indicates how many times the k-means clustering is performed for each k, and maxK indicates the maximal cluster number.

Example
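
The procedure described above (repeat the clustering several times per k, then compare the scores across k) can be sketched generically as follows. This is not the module's implementation: `scoreOverK`, `cluster`, `score`, and `bestK` are hypothetical names, starting at k = 2 is an assumption (a silhouette needs at least two clusters), and the result is a plain tuple rather than a SilhouetteResult, whose fields are not listed on this page.

```fsharp
// Sketch only: all names are hypothetical; k starts at 2 by assumption.
/// For every k in 2..maxK, runs `cluster k` `bootstraps` times and
/// returns (k, mean score, sample standard deviation of the scores).
let scoreOverK (bootstraps: int) (cluster: int -> float[][][]) (score: float[][][] -> float) (maxK: int) =
    [| 2 .. maxK |]
    |> Array.map (fun k ->
        // k-means is usually randomly initialised, so repeated runs can differ
        let scores = Array.init bootstraps (fun _ -> cluster k |> score)
        let mean = Array.average scores
        let stDev =
            if bootstraps < 2 then 0.0
            else
                let sumSq = scores |> Array.sumBy (fun s -> (s - mean) * (s - mean))
                sqrt (sumSq / float (bootstraps - 1))
        k, mean, stDev)

/// The k with the highest mean score is the suggested cluster number.
let bestK (results: (int * float * float)[]) =
    let (k, _, _) = results |> Array.maxBy (fun (_, mean, _) -> mean)
    k

// Toy stand-ins so the sketch runs without a clustering library:
// `toyCluster` returns k singleton clusters, `toyScore` peaks at three clusters.
let toyCluster k = Array.init k (fun i -> [| [| float i |] |])
let toyScore (clusters: float[][][]) = -abs (float clusters.Length - 3.0)

let results = scoreOverK 5 toyCluster toyScore 6
let suggestedK = bestK results
```

In practice `cluster` would wrap a k-means call on the data and `score` would be a silhouette function; a high standard deviation for a given k suggests that the clustering for that k is unstable across runs.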