Deedle


Stats

Namespace: Deedle

The Stats type contains functions for fast calculation of statistics over series and frames as well as over a moving and an expanding window in a series.

For the windowing functions, the resulting series has the same keys as the input series. When there are no values, or some values are missing, different functions behave in different ways. Statistics (e.g. mean) return a missing value when any value is missing, while the min/max functions return the minimal/maximal element (skipping over missing values).

Remarks

The windowing functions in the Stats type support calculations over fixed-size windows specified by the size of the window. If you need more complex windowing behavior (such as windows based on the distance between keys), different handling of boundaries, or chunking (calculation over adjacent chunks), you can use the chunking and windowing functions from the Series module, such as Series.windowSizeInto or Series.chunkSizeInto.
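As a rough illustration of the difference, the sketch below computes a moving mean once through Stats and once through the Series module. It assumes F# Interactive with NuGet access, and that Series.windowSizeInto and Series.chunkSizeInto take a (size, Boundary) pair and pass a DataSegment to the aggregation function; the helper names meanOfSegment and sumOfSegment are made up for this example.

```fsharp
// Assumption: running in F# Interactive; in a project, reference the Deedle package instead.
#r "nuget: Deedle"
open Deedle

let s = series [ for i in 1 .. 6 -> i => float i ]

// Fixed-size window provided directly by Stats.
let viaStats = Stats.movingMean 3 s

// Hypothetical helpers: aggregate the series carried by each data segment.
let meanOfSegment (seg: DataSegment<Series<int, float>>) = Stats.mean seg.Data
let sumOfSegment (seg: DataSegment<Series<int, float>>) = Stats.sum seg.Data

// Same window size through the Series module, with explicit boundary handling.
// Boundary.Skip is assumed here to leave out incomplete windows.
let viaSeries = s |> Series.windowSizeInto (3, Boundary.Skip) meanOfSegment

// Adjacent, non-overlapping chunks of two values each.
let chunkSums = s |> Series.chunkSizeInto (2, Boundary.Skip) sumOfSegment

printfn "%A" viaStats
printfn "%A" viaSeries
printfn "%A" chunkSums
```

The Stats version is the simpler choice when a plain fixed-size window is enough; the Series functions are worth the extra code only when the boundary behavior or the aggregation needs to be customized.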


Expanding windows 

An expanding window starts as a single-element window and expands as it moves over the series. In this case, the statistic is calculated for all values up to the current key, so the result is attached to the key at the end of the window. The function names are prefixed with expanding.
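A minimal sketch of the expanding functions, assuming F# Interactive with NuGet access; the sample series is made up for illustration.

```fsharp
// Assumption: running in F# Interactive; in a project, reference the Deedle package instead.
#r "nuget: Deedle"
open Deedle

let s = series [ 1 => 2.0; 2 => 4.0; 3 => 6.0; 4 => 8.0 ]

// Each result is attached to the last key of its window and is
// calculated from all values up to that key.
let runningMean = Stats.expandingMean s
let runningSum = Stats.expandingSum s
let runningMax = Stats.expandingMax s

printfn "%A" runningMean
printfn "%A" runningSum
printfn "%A" runningMax
```

Each result has the same keys as the input series, so it can be placed next to the original values, for example as additional columns of a frame.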

Static members

Stats.expandingCount(series)
Signature: series:Series<'K,'V> -> Series<'K,float>
Type parameters: 'K, 'V

Returns a series that contains counts over expanding windows (the value for a given key is calculated from all elements up to and including that key).

Stats.expandingKurt(series)
Signature: series:Series<'K,'V> -> Series<'K,float>
Type parameters: 'K, 'V

Returns a series that contains kurtosis over expanding windows (the value for a given key is calculated from all elements up to and including that key). If the entire window contains fewer than 4 values, the result is missing.

Stats.expandingMax(series)
Signature: series:Series<'K,'V> -> Series<'K,float>
Type parameters: 'K, 'V

Returns a series that contains the maximum over an expanding window. The value for a key k in the returned series is the maximum of all elements up to and including k. Throws a FormatException or an InvalidCastException if the value type of the series is not convertible to a floating-point number.

Stats.expandingMean(series)
Signature: series:Series<'K,'V> -> Series<'K,float>
Type parameters: 'K, 'V

Returns a series that contains means over expanding windows (the value for a given key is calculated from all elements up to and including that key). If the entire window contains no values, the result is missing.

Stats.expandingMin(series)
Signature: series:Series<'K,'V> -> Series<'K,float>
Type parameters: 'K, 'V

Returns a series that contains the minimum over an expanding window. The value for a key k in the returned series is the minimum of all elements up to and including k. Throws a FormatException or an InvalidCastException if the value type of the series is not convertible to a floating-point number.

Stats.expandingSkew(series)
Signature: series:Series<'K,'V> -> Series<'K,float>
Type parameters: 'K, 'V

Returns a series that contains skewness over expanding windows (the value for a given key is calculated from all elements up to and including that key). If the entire window contains fewer than 3 values, the result is missing.

Stats.expandingStdDev(series)
Signature: series:Series<'K,'V> -> Series<'K,float>
Type parameters: 'K, 'V

Returns a series that contains standard deviation over expanding windows (the value for a given key is calculated from all elements up to and including that key). If the entire window contains fewer than 2 values, the result is missing.

Stats.expandingSum(series)
Signature: series:Series<'K,'V> -> Series<'K,float>
Type parameters: 'K, 'V

Returns a series that contains sums over expanding windows (the value for a given key is calculated from all elements up to and including that key). If the entire window contains no values, the result is 0.

Stats.expandingVariance(series)
Signature: series:Series<'K,'V> -> Series<'K,float>
Type parameters: 'K, 'V

Returns a series that contains variance over expanding windows (the value for a given key is calculated from all elements up to and including that key). If the entire window contains fewer than 2 values, the result is missing.

Frame statistics 

The standard functions are exposed as static members and are overloaded. This means that they can be applied both to Series<'K, float> and to Frame<'R, 'C>. When applied to a data frame, the functions apply the statistical calculation to all numerical columns of the frame.
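The overloading can be seen in a short sketch like the following, which assumes F# Interactive with NuGet access; the column names and values are invented for the example.

```fsharp
// Assumption: running in F# Interactive; in a project, reference the Deedle package instead.
#r "nuget: Deedle"
open Deedle

let price = series [ 1 => 10.0; 2 => 12.0; 3 => 11.0 ]
let volume = series [ 1 => 100.0; 2 => 150.0; 3 => 125.0 ]
let df = frame [ "price" => price; "volume" => volume ]

// Applied to a series, the function returns a single number.
let priceMean = Stats.mean price

// Applied to a frame, the same function returns one result per column,
// keyed by the column key.
let columnMeans = Stats.mean df
let columnCounts = Stats.count df

printfn "%A" priceMean
printfn "%A" columnMeans
printfn "%A" columnCounts
```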

Static members

Stats.count(frame)
Signature: frame:Frame<'R,'C> -> Series<'C,int>
Type parameters: 'R, 'C

For each column, returns the number of the values in the column. This excludes missing values and values created from Double.NaN etc.

Stats.kurt(frame)
Signature: frame:Frame<'R,'C> -> Series<'C,float>
Type parameters: 'R, 'C

For each numerical column, returns the kurtosis of the values in the column. The function skips over missing values and NaN values. When there are less than 4 values, the result is NaN.

Stats.max(frame)
Signature: frame:Frame<'R,'C> -> Series<'C,float>
Type parameters: 'R, 'C

For each numerical column, returns the maximal values as a series. The function skips over missing and NaN values. When there are no values, the result is NaN.

Stats.mean(frame)
Signature: frame:Frame<'R,'C> -> Series<'C,float>
Type parameters: 'R, 'C

For each numerical column, returns the mean of the values in the column. The function skips over missing values and NaN values. When there are no available values, the result is NaN.

Stats.median(frame)
Signature: frame:Frame<'R,'C> -> Series<'C,float>
Type parameters: 'R, 'C

For each numerical column, returns the median of the values in the column.

Stats.min(frame)
Signature: frame:Frame<'R,'C> -> Series<'C,float>
Type parameters: 'R, 'C

For each numerical column, returns the minimal values as a series. The function skips over missing and NaN values. When there are no values, the result is NaN.

Stats.skew(frame)
Signature: frame:Frame<'R,'C> -> Series<'C,float>
Type parameters: 'R, 'C

For each numerical column, returns the skewness of the values in the column. The function skips over missing values and NaN values. When there are less than 3 values, the result is NaN.

Stats.stdDev(frame)
Signature: frame:Frame<'R,'C> -> Series<'C,float>
Type parameters: 'R, 'C

For each numerical column, returns the standard deviation of the values in the column. The function skips over missing values and NaN values. When there are less than 2 values, the result is NaN.

Stats.sum(frame)
Signature: frame:Frame<'R,'C> -> Series<'C,float>
Type parameters: 'R, 'C

For each numerical column, returns the sum of the values in the column. The function skips over missing values and NaN values. When there are no available values, the result is 0.

Stats.variance(frame)
Signature: frame:Frame<'R,'C> -> Series<'C,float>
Type parameters: 'R, 'C

For each numerical column, returns the variance of the values in the column. The function skips over missing values and NaN values. When there are less than 2 values, the result is NaN.

Moving windows 

A moving window has a fixed size and moves over the series. In this case, the result of the statistic is always attached to the last key of the window. The function names are prefixed with moving.
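A small sketch of the moving functions, assuming F# Interactive with NuGet access and a made-up input series. The comments restate the boundary behavior given in the member descriptions in this section rather than verified output.

```fsharp
// Assumption: running in F# Interactive; in a project, reference the Deedle package instead.
#r "nuget: Deedle"
open Deedle

let s = series [ for i in 1 .. 6 -> i => float (i * i) ]

// Window of 3 values; per the description of movingMean, the first
// size-1 results are missing.
let mean3 = Stats.movingMean 3 s

// Per the description of movingMax, the first size-1 results are
// calculated using smaller windows instead.
let max3 = Stats.movingMax 3 s

printfn "%A" mean3
printfn "%A" max3
```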

Static members

Stats.movingCount size series
Signature: size:int -> series:Series<'K,'V> -> Series<'K,float>
Type parameters: 'K, 'V

Returns a series that contains counts over a moving window of the specified size. The first size-1 elements of the returned series are always missing; if the entire window contains missing values, the result is 0. Throws a FormatException or an InvalidCastException if the value type of the specified series is not convertible to floating point number.

Stats.movingKurt size series
Signature: size:int -> series:Series<'K,'V> -> Series<'K,float>
Type parameters: 'K, 'V

Returns a series that contains kurtosis over a moving window of the specified size. The first size-1 elements of the returned series are always missing; if the entire window contains missing values, the result is also missing. Throws a FormatException or an InvalidCastException if the value type of the specified series is not convertible to floating point number.

Stats.movingMax size series
Signature: size:int -> series:Series<'K,'V> -> Series<'K,float>
Type parameters: 'K, 'V

Returns a series that contains maximum over a moving window of the specified size. The first size-1 elements are calculated using smaller windows spanning over 1 .. size-1 values. If the entire window contains missing values, the result is missing. Throws a FormatException or an InvalidCastException if the value type of the series is not convertible to floating point number.

Stats.movingMean size series
Signature: size:int -> series:Series<'K,'V> -> Series<'K,float>
Type parameters: 'K, 'V

Returns a series that contains means over a moving window of the specified size. The first size-1 elements of the returned series are always missing; if the entire window contains missing values, the result is also missing. Throws a FormatException or an InvalidCastException if the value type of the specified series is not convertible to floating point number.

Stats.movingMin size series
Signature: size:int -> series:Series<'K,'V> -> Series<'K,float>
Type parameters: 'K, 'V

Returns a series that contains minimum over a moving window of the specified size. The first size-1 elements are calculated using smaller windows spanning over 1 .. size-1 values. If the entire window contains missing values, the result is missing. Throws a FormatException or an InvalidCastException if the value type of the series is not convertible to floating point number.

Stats.movingSkew size series
Signature: size:int -> series:Series<'K,'V> -> Series<'K,float>
Type parameters: 'K, 'V

Returns a series that contains skewness over a moving window of the specified size. The first size-1 elements of the returned series are always missing; if the entire window contains missing values, the result is also missing. Throws a FormatException or an InvalidCastException if the value type of the specified series is not convertible to floating point number.

Stats.movingStdDev size series
Signature: size:int -> series:Series<'K,'V> -> Series<'K,float>
Type parameters: 'K, 'V

Returns a series that contains standard deviations over a moving window of the specified size. The first size-1 elements of the returned series are always missing; if the entire window contains missing values, the result is also missing. Throws a FormatException or an InvalidCastException if the value type of the specified series is not convertible to floating point number.

Stats.movingSum size series
Signature: size:int -> series:Series<'K,'V> -> Series<'K,float>
Type parameters: 'K, 'V

Returns a series that contains sums over a moving window of the specified size. The first size-1 elements of the returned series are always missing; if the entire window contains missing values, the result is 0. Throws a FormatException or an InvalidCastException if the value type of the specified series is not convertible to floating point number.

Stats.movingVariance size series
Signature: size:int -> series:Series<'K,'V> -> Series<'K,float>
Type parameters: 'K, 'V

Returns a series that contains variance over a moving window of the specified size. The first size-1 elements of the returned series are always missing; if the entire window contains missing values, the result is also missing. Throws a FormatException or an InvalidCastException if the value type of the specified series is not convertible to floating point number.

Multi-level statistics 

For a series with a multi-level (hierarchical) index, the functions prefixed with level provide a way to apply a statistical operation on a single level of the index. (For example, you can sum values along the 'K1 keys in a series Series<'K1 * 'K2, float> and get Series<'K1, float> as the result.)
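The example in parentheses can be sketched as follows, assuming F# Interactive with NuGet access. The two-level keys and values are made up, and fst and snd serve as the level selectors.

```fsharp
// Assumption: running in F# Interactive; in a project, reference the Deedle package instead.
#r "nuget: Deedle"
open Deedle

// A series with a two-level key of type string * int.
let s = series [ ("a", 1) => 1.0; ("a", 2) => 3.0; ("b", 1) => 10.0; ("b", 2) => 20.0 ]

// Group by the first component of the key: the result is keyed by string.
let sumByFirst = Stats.levelSum fst s
let meanByFirst = Stats.levelMean fst s

// Group by the second component instead: the result is keyed by int.
let sumBySecond = Stats.levelSum snd s

printfn "%A" sumByFirst
printfn "%A" meanByFirst
printfn "%A" sumBySecond
```

Because the level argument is an ordinary function from the full key to the grouping key, any projection of the key can be used, not only fst and snd.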

Static members

Stats.levelCount level series
Signature: (level:('K -> 'L)) -> series:Series<'K,'V> -> Series<'L,int>
Type parameters: 'K, 'L, 'V

For each group with equal keys at the level specified by level, returns the number of the values in the group. This excludes missing values and values created from Double.NaN etc.

Stats.levelKurt level series
Signature: (level:('K -> 'L)) -> series:Series<'K,'V> -> Series<'L,float>
Type parameters: 'K, 'L, 'V

For each group with equal keys at the level specified by level, returns the kurtosis of the values in the group. The function skips over missing values and NaN values. When there are less than 4 values, the result is NaN. Throws a FormatException or an InvalidCastException if the value type of the series is not convertible to floating point number.

Stats.levelMean level series
Signature: (level:('K -> 'L)) -> series:Series<'K,'V> -> Series<'L,float>
Type parameters: 'K, 'L, 'V

For each group with equal keys at the level specified by level, returns the mean of the values in the group. The function skips over missing values and NaN values. When there are no available values, the result is NaN. Throws a FormatException or an InvalidCastException if the value type of the series is not convertible to floating point number.

Stats.levelMedian level series
Signature: (level:('K -> 'L)) -> series:Series<'K,'V> -> Series<'L,float>
Type parameters: 'K, 'L, 'V

For each group with equal keys at the level specified by level, returns the median of the values in the group. Throws a FormatException or an InvalidCastException if the value type of the series is not convertible to floating point number.

Stats.levelSkew level series
Signature: (level:('K -> 'L)) -> series:Series<'K,'V> -> Series<'L,float>
Type parameters: 'K, 'L, 'V

For each group with equal keys at the level specified by level, returns the skewness of the values in the group. The function skips over missing values and NaN values. When there are less than 3 values, the result is NaN. Throws a FormatException or an InvalidCastException if the value type of the series is not convertible to floating point number.

Stats.levelStdDev level series
Signature: (level:('K -> 'L)) -> series:Series<'K,'V> -> Series<'L,float>
Type parameters: 'K, 'L, 'V

For each group with equal keys at the level specified by level, returns the standard deviation of the values in the group. The function skips over missing values and NaN values. When there are less than 2 values, the result is NaN. Throws a FormatException or an InvalidCastException if the value type of the series is not convertible to floating point number.

Stats.levelSum level series
Signature: (level:('K -> 'L)) -> series:Series<'K,'V> -> Series<'L,float>
Type parameters: 'K, 'L, 'V

For each group with equal keys at the level specified by level, returns the sum of the values in the group. The function skips over missing values and NaN values. When there are no available values, the result is 0. Throws a FormatException or an InvalidCastException if the value type of the series is not convertible to floating point number.

Stats.levelVariance level series
Signature: (level:('K -> 'L)) -> series:Series<'K,'V> -> Series<'L,float>
Type parameters: 'K, 'L, 'V

For each group with equal keys at the level specified by level, returns the variance of the values in the group. The function skips over missing values and NaN values. When there are less than 2 values, the result is NaN. Throws a FormatException or an InvalidCastException if the value type of the series is not convertible to floating point number.

Series interpolation

Static members

Stats.interpolate keys f series
Signature: keys:seq<'K> -> (f:('K -> ('K * 'T) option -> ('K * 'T) option -> 'T)) -> series:Series<'K,'T> -> Series<'K,'T>
Type parameters: 'K, 'T

Interpolates an ordered series given a new sequence of keys. The function iterates through each new key and invokes the supplied function f on the current key and on the nearest smaller and larger valid observations from the series argument. The function must return a new valid float.

Parameters

  • keys - Sequence of new keys that forms the index of interpolated results
  • f - Function to do the interpolating

Stats.interpolateLinear(...)
Signature: keys:seq<'K> -> (keyDiff:('K -> 'K -> float)) -> series:Series<'K,'V> -> Series<'K,float>
Type parameters: 'K, 'V

Linearly interpolates an ordered series given a new sequence of keys.

Parameters

  • keys - Sequence of new keys that forms the index of interpolated results
  • keyDiff - A function representing "subtraction" between two keys

Throws a FormatException or an InvalidCastException if the value type of the series is not convertible to floating point number.
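A sketch of both interpolation functions, assuming F# Interactive with NuGet access. The series, the new keys, and the helpers keyDiff and stepFill are hypothetical, and stepFill is just one possible interpolating function (it reuses the nearest earlier observation).

```fsharp
// Assumption: running in F# Interactive; in a project, reference the Deedle package instead.
#r "nuget: Deedle"
open Deedle

// An ordered series with float keys.
let s = series [ 0.0 => 0.0; 2.0 => 10.0; 4.0 => 30.0 ]

// New keys that will form the index of the interpolated result.
let newKeys = seq [ 0.0; 1.0; 2.0; 3.0; 4.0 ]

// "Subtraction" between two keys; for float keys this is plain subtraction.
let keyDiff (a: float) (b: float) = a - b
let linear = Stats.interpolateLinear newKeys keyDiff s

// Hypothetical custom interpolation: take the nearest earlier observation,
// fall back to the nearest later one, and return NaN when neither exists.
let stepFill (_: float) (prev: (float * float) option) (next: (float * float) option) =
    match prev, next with
    | Some (_, v), _ -> v
    | None, Some (_, v) -> v
    | None, None -> nan

let stepped = Stats.interpolate newKeys stepFill s

printfn "%A" linear
printfn "%A" stepped
```

For non-numeric keys such as DateTime, keyDiff would need to convert the difference to a float, for example through a number of ticks or seconds.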

Series statistics 

Functions such as count, mean, kurt etc. return the statistics calculated over all values of a series. The calculation skips over missing values (and NaN values), so for example mean returns the average of all present values.
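The following sketch applies a few of these functions to a series with one NaN entry, assuming F# Interactive with NuGet access and a Deedle version that includes the members listed in this section. The data is made up for illustration.

```fsharp
// Assumption: running in F# Interactive; in a project, reference the Deedle package instead.
#r "nuget: Deedle"
open Deedle

// The NaN entry is treated as a missing value and is skipped.
let s = series [ 1 => 1.0; 2 => nan; 3 => 3.0; 4 => 8.0 ]

let n = Stats.count s          // number of present values
let avg = Stats.mean s         // mean of the present values
let sd = Stats.stdDev s
let largest = Stats.tryMax s   // option value; None for a series with no values

// Summary statistics and quantiles, each returned as a series keyed by string.
let summary = Stats.describe s
let quartiles = Stats.quantile ([| 0.25; 0.5; 0.75 |], s)

printfn "%d %f %f %A" n avg sd largest
printfn "%A" summary
printfn "%A" quartiles
```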

Static members

Stats.count(series)
Signature: series:Series<'K,'V> -> int
Type parameters: 'K, 'V

Returns the number of the values in a series. This excludes missing values and values created from Double.NaN etc.

Stats.describe(series)
Signature: series:Series<'K,'V> -> Series<string,float>
Type parameters: 'K, 'V

Returns the series of main statistic values of the series. Throws a FormatException or an InvalidCastException if the value type of the series is not convertible to floating point number.

Stats.kurt(series)
Signature: series:Series<'K,'V> -> float
Type parameters: 'K, 'V

Returns the kurtosis of the values in a series. The function skips over missing values and NaN values. When there are less than 4 values, the result is NaN.

Stats.max(series)
Signature: series:Series<'K,'V> -> float
Type parameters: 'K, 'V

Returns the maximum of the values in a series. The result is a float value. When the series contains no values, the result is NaN. Throws a FormatException or an InvalidCastException if the value type of the series is not convertible to floating point number.

Stats.maxBy f series
Signature: (f:('T -> 'a)) -> series:Series<'K,'T> -> ('K * 'T) option
Type parameters: 'T, 'a, 'K

Returns the key and value of the greatest element in the series. The result is an optional value. When the series contains no values, the result is None.

Stats.mean(series)
Signature: series:Series<'K,'V> -> float
Type parameters: 'K, 'V

Returns the mean of the values in a series. The function skips over missing values and NaN values. When there are no available values, the result is NaN. Throws a FormatException or an InvalidCastException if the value type of the series is not convertible to floating point number.

Stats.median(series)
Signature: series:Series<'K,'V> -> float
Type parameters: 'K, 'V

Returns the median of the elements of the series. Throws a FormatException or an InvalidCastException if the value type of the series is not convertible to floating point number.

Stats.min(series)
Signature: series:Series<'K,'V> -> float
Type parameters: 'K, 'V

Returns the minimum of the values in a series. The result is a float value. When the series contains no values, the result is NaN. Throws a FormatException or an InvalidCastException if the value type of the series is not convertible to floating point number.

Stats.minBy f series
Signature: (f:('T -> 'a)) -> series:Series<'K,'T> -> ('K * 'T) option
Type parameters: 'T, 'a, 'K

Returns the key and value of the least element in the series. The result is an optional value. When the series contains no values, the result is None.

Stats.numSum(series)
Signature: series:Series<'K,^V> -> ^V
Type parameters: 'K, ^V

Sum that operates on any appropriate numeric type. When there are no available values, the result is the zero of the appropriate numeric type.

Stats.quantile(quantiles, series)
Signature: (quantiles:float [] * series:Series<'K,'V>) -> Series<string,float>
Type parameters: 'K, 'V

Returns the series of quantiles of the series. This is the Excel version of quantile, equivalent to QuantileDefinition.R7 from Math.Net. Throws a FormatException or an InvalidCastException if the value type of the series is not convertible to floating point number.

Stats.skew(series)
Signature: series:Series<'K,'V> -> float
Type parameters: 'K, 'V

Returns the skewness of the values in a series. The function skips over missing values and NaN values. When there are less than 3 values, the result is NaN. Throws a FormatException or an InvalidCastException if the value type of the series is not convertible to floating point number.

Stats.stdDev(series)
Signature: series:Series<'K,'V> -> float
Type parameters: 'K, 'V

Returns the standard deviation of the values in a series. The function skips over missing values and NaN values. When there are less than 2 values, the result is NaN. Throws a FormatException or an InvalidCastException if the value type of the series is not convertible to floating point number.

Stats.sum(series)
Signature: series:Series<'K,'V> -> float
Type parameters: 'K, 'V

Returns the sum of the values in a series. The function skips over missing values and NaN values. When there are no available values, the result is NaN. Throws a FormatException or an InvalidCastException if the value type of the series is not convertible to floating point number.

Stats.tryMax(series)
Signature: series:Series<'K,'V> -> 'V option
Type parameters: 'K, 'V

Returns the maximum of the values in a series. The result is an option value. When the series contains no values, the result is None.

Stats.tryMin(series)
Signature: series:Series<'K,'V> -> 'V option
Type parameters: 'K, 'V

Returns the minimum of the values in a series. The result is an option value. When the series contains no values, the result is None.

Stats.variance(series)
Signature: series:Series<'K,'V> -> float
Type parameters: 'K, 'V

Returns the variance of the values in a series. The function skips over missing values and NaN values. When there are less than 2 values, the result is NaN. Throws a FormatException or an InvalidCastException if the value type of the series is not convertible to floating point number.

Other type members 

Static members

Stats.uniqueCount(frame)
Signature: frame:Frame<'R,'C> -> Series<'C,int>
Type parameters: 'R, 'C

For each column, returns the number of unique values.

Stats.uniqueCount(series)
Signature: series:Series<'K,'V> -> int
Type parameters: 'K, 'V

Returns the number of unique values in a series.
