Covariance

Binder Notebook

Summary: This tutorial explains how to investigate the covariance of two samples with FSharp.Stats

Lets first define some sample data:

open Plotly.NET
open FSharp.Stats

let rnd = System.Random()
let error() = rnd.Next(11)

let sampleA = Vector.init 50 (fun x -> float x)
let sampleB = Vector.init 50 (fun x -> float (x + error()))
let sampleBHigh = sampleB |> Vector.map (fun x -> 200. + x)
let sampleC = Vector.init 50 (fun x -> 100. - float (x + 3 * error()))
let sampleD = Vector.init 50 (fun x -> 100. + float (10 * error()))


let sampleChart =
    [
        Chart.Point(sampleA,sampleB,"AB")
        Chart.Point(sampleA,sampleC,"AC")
        Chart.Point(sampleA,sampleD,"AD")  
        Chart.Point(sampleA,sampleBHigh,"AB+")   
    ]
    |> Chart.combine
    |> Chart.withTemplate ChartTemplates.lightMirrored
    |> Chart.withXAxisStyle "x"
    |> Chart.withYAxisStyle "y"
    |> Chart.withTitle "test cases for covariance calculation"

The covariance of two samples describes the relationship of both variables. If one variable tends to be high if its pair is high also, the covariance is positive. If on variable is low while its pair is high the covariance is negative. If there is no (monotone) relationship between both variables, the covariance is zero.

A positive covariance indicates a positive slope of a regression line, while a negative covariance indicates a negative slope. If the total population is given the covPopulation without Bessel's correction can be calculated.

\(\operatorname{cov}(X, Y) = \operatorname{E}{\big[(X - \operatorname{E}[X])(Y - \operatorname{E}[Y])\big]}\)

Note: The amplitude of covariance does not correlate with the slope, neither it correlates with the spread of the data points from the regression line.

A standardized measure for how well the data lie on the regression line is given by correlation analysis. The pearson correlation coefficient is defined as

\(\rho_{X,Y}= \frac{\operatorname{cov}(X,Y)}{\sigma_X \sigma_Y}\)

References:

  • Fahrmeir L et al., Statistik - Der Weg zur Datenanalyse, 8. Auflage, doi 10.1007/978-3-662-50372-0

cov and covPopulation are available as sequence (and other collections) extensions:

let covAB     = Vector.cov sampleA sampleB
let covAC     = Vector.cov sampleA sampleC
let covAD     = Vector.cov sampleA sampleD
let covABHigh = Vector.cov sampleA sampleBHigh

let covPopAB     = Vector.covPopulation sampleA sampleB
let covPopAC     = Vector.covPopulation sampleA sampleC
let covPopAD     = Vector.covPopulation sampleA sampleD
let covPopABHigh = Vector.covPopulation sampleA sampleBHigh

open Correlation
let pearsonAB     = Seq.pearson sampleA sampleB
let pearsonAC     = Seq.pearson sampleA sampleC
let pearsonAD     = Seq.pearson sampleA sampleD
let pearsonABHigh = Seq.pearson sampleA sampleBHigh
"Covariance of the presented four test cases
AB (blue)   cov: 217.94    covPopulation: 213.58   pearson: 0.979
AC (orange) cov: -176.9    covPopulation: -173.4   pearson: -0.80
AD (green)  cov: -46.84    covPopulation: -45.90   pearson: -0.100
AB+(red)    cov: 217.94    covPopulation: 213.58   pearson: 0.979"
namespace Plotly
namespace Plotly.NET
module Defaults from Plotly.NET
<summary> Contains mutable global default values. Changing these values will apply the default values to all consecutive Chart generations. </summary>
val mutable DefaultDisplayOptions: DisplayOptions
Multiple items
type DisplayOptions = inherit DynamicObj new: unit -> DisplayOptions static member addAdditionalHeadTags: additionalHeadTags: XmlNode list -> (DisplayOptions -> DisplayOptions) static member addDescription: description: XmlNode list -> (DisplayOptions -> DisplayOptions) static member combine: first: DisplayOptions -> second: DisplayOptions -> DisplayOptions static member getAdditionalHeadTags: displayOpts: DisplayOptions -> XmlNode list static member getDescription: displayOpts: DisplayOptions -> XmlNode list static member getPlotlyReference: displayOpts: DisplayOptions -> PlotlyJSReference static member init: ?AdditionalHeadTags: XmlNode list * ?Description: XmlNode list * ?PlotlyJSReference: PlotlyJSReference -> DisplayOptions static member initCDNOnly: unit -> DisplayOptions ...

--------------------
new: unit -> DisplayOptions
static member DisplayOptions.init: ?AdditionalHeadTags: Giraffe.ViewEngine.HtmlElements.XmlNode list * ?Description: Giraffe.ViewEngine.HtmlElements.XmlNode list * ?PlotlyJSReference: PlotlyJSReference -> DisplayOptions
type PlotlyJSReference = | CDN of string | Full | Require of string | NoReference
<summary> Sets how plotly is referenced in the head of html docs. </summary>
union case PlotlyJSReference.NoReference: PlotlyJSReference
Multiple items
namespace FSharp

--------------------
namespace Microsoft.FSharp
namespace FSharp.Stats
val rnd: System.Random
namespace System
Multiple items
type Random = new: unit -> unit + 1 overload member Next: unit -> int + 2 overloads member NextBytes: buffer: byte[] -> unit + 1 overload member NextDouble: unit -> float member NextInt64: unit -> int64 + 2 overloads member NextSingle: unit -> float32 static member Shared: Random
<summary>Represents a pseudo-random number generator, which is an algorithm that produces a sequence of numbers that meet certain statistical requirements for randomness.</summary>

--------------------
System.Random() : System.Random
System.Random(Seed: int) : System.Random
val error: unit -> int
System.Random.Next() : int
System.Random.Next(maxValue: int) : int
System.Random.Next(minValue: int, maxValue: int) : int
val sampleA: Vector<float>
Multiple items
module Vector from FSharp.Stats

--------------------
type Vector<'T> = interface IEnumerable interface IEnumerable<'T> interface IStructuralEquatable interface IStructuralComparable interface IComparable new: opsV: INumeric<'T> option * arrV: 'T array -> Vector<'T> override Equals: yobj: obj -> bool override GetHashCode: unit -> int member GetSlice: start: int option * finish: int option -> Vector<'T> member Permute: p: permutation -> Vector<'T> ...

--------------------
new: opsV: INumeric<'T> option * arrV: 'T array -> Vector<'T>
val init: count: int -> initializer: (int -> float) -> Vector<float>
<summary>Initiates vector of length count and fills it by applying initializer function on indices</summary>
<remarks></remarks>
<param name="count"></param>
<param name="initializer"></param>
<returns></returns>
<example><code></code></example>
val x: int
Multiple items
val float: value: 'T -> float (requires member op_Explicit)
<summary>Converts the argument to 64-bit float. This is a direct conversion for all primitive numeric types. For strings, the input is converted using <c>Double.Parse()</c> with InvariantCulture settings. Otherwise the operation requires an appropriate static conversion method on the input type.</summary>
<param name="value">The input value.</param>
<returns>The converted float</returns>
<example id="float-example"><code lang="fsharp"></code></example>


--------------------
[<Struct>] type float = System.Double
<summary>An abbreviation for the CLI type <see cref="T:System.Double" />.</summary>
<category>Basic Types</category>


--------------------
type float<'Measure> = float
<summary>The type of double-precision floating point numbers, annotated with a unit of measure. The unit of measure is erased in compiled code and when values of this type are analyzed using reflection. The type is representationally equivalent to <see cref="T:System.Double" />.</summary>
<category index="6">Basic Types with Units of Measure</category>
val sampleB: Vector<float>
val sampleBHigh: Vector<float>
val map: mapping: (float -> float) -> vector: vector -> Vector<float>
<summary>Builds a new vector whose elements are the results of applying the given function to each of the elements of the vector.</summary>
<remarks></remarks>
<param name="mapping"></param>
<param name="vector"></param>
<returns></returns>
<example><code></code></example>
val x: float
val sampleC: Vector<float>
val sampleD: Vector<float>
val sampleChart: GenericChart.GenericChart
type Chart = static member AnnotatedHeatmap: zData: seq<#seq<'a1>> * annotationText: seq<#seq<string>> * ?Name: string * ?ShowLegend: bool * ?Opacity: float * ?X: seq<'a3> * ?MultiX: seq<seq<'a3>> * ?XGap: int * ?Y: seq<'a4> * ?MultiY: seq<seq<'a4>> * ?YGap: int * ?Text: 'a5 * ?MultiText: seq<'a5> * ?ColorBar: ColorBar * ?ColorScale: Colorscale * ?ShowScale: bool * ?ReverseScale: bool * ?ZSmooth: SmoothAlg * ?Transpose: bool * ?UseWebGL: bool * ?ReverseYAxis: bool * ?UseDefaults: bool -> GenericChart (requires 'a1 :> IConvertible and 'a3 :> IConvertible and 'a4 :> IConvertible and 'a5 :> IConvertible) + 1 overload static member Area: x: seq<#IConvertible> * y: seq<#IConvertible> * ?ShowMarkers: bool * ?Name: string * ?ShowLegend: bool * ?Opacity: float * ?MultiOpacity: seq<float> * ?Text: 'a2 * ?MultiText: seq<'a2> * ?TextPosition: TextPosition * ?MultiTextPosition: seq<TextPosition> * ?MarkerColor: Color * ?MarkerColorScale: Colorscale * ?MarkerOutline: Line * ?MarkerSymbol: MarkerSymbol * ?MultiMarkerSymbol: seq<MarkerSymbol> * ?Marker: Marker * ?LineColor: Color * ?LineColorScale: Colorscale * ?LineWidth: float * ?LineDash: DrawingStyle * ?Line: Line * ?AlignmentGroup: string * ?OffsetGroup: string * ?StackGroup: string * ?Orientation: Orientation * ?GroupNorm: GroupNorm * ?FillColor: Color * ?FillPatternShape: PatternShape * ?FillPattern: Pattern * ?UseWebGL: bool * ?UseDefaults: bool -> GenericChart (requires 'a2 :> IConvertible) + 1 overload static member Bar: values: seq<#IConvertible> * ?Keys: seq<'a1> * ?MultiKeys: seq<seq<'a1>> * ?Name: string * ?ShowLegend: bool * ?Opacity: float * ?MultiOpacity: seq<float> * ?Text: 'a2 * ?MultiText: seq<'a2> * ?MarkerColor: Color * ?MarkerColorScale: Colorscale * ?MarkerOutline: Line * ?MarkerPatternShape: PatternShape * ?MultiMarkerPatternShape: seq<PatternShape> * ?MarkerPattern: Pattern * ?Marker: Marker * ?Base: #IConvertible * ?Width: 'a4 * ?MultiWidth: seq<'a4> * ?TextPosition: TextPosition * ?MultiTextPosition: seq<TextPosition> * ?UseDefaults: bool -> GenericChart (requires 'a1 :> IConvertible and 'a2 :> IConvertible and 'a4 :> IConvertible) + 1 overload static member BoxPlot: ?X: seq<'a0> * ?MultiX: seq<seq<'a0>> * ?Y: seq<'a1> * ?MultiY: seq<seq<'a1>> * ?Name: string * ?ShowLegend: bool * ?Text: 'a2 * ?MultiText: seq<'a2> * ?FillColor: Color * ?MarkerColor: Color * ?Marker: Marker * ?Opacity: float * ?WhiskerWidth: float * ?BoxPoints: BoxPoints * ?BoxMean: BoxMean * ?Jitter: float * ?PointPos: float * ?Orientation: Orientation * ?OutlineColor: Color * ?OutlineWidth: float * ?Outline: Line * ?AlignmentGroup: string * ?OffsetGroup: string * ?Notched: bool * ?NotchWidth: float * ?QuartileMethod: QuartileMethod * ?UseDefaults: bool -> GenericChart (requires 'a0 :> IConvertible and 'a1 :> IConvertible and 'a2 :> IConvertible) + 2 overloads static member Bubble: x: seq<#IConvertible> * y: seq<#IConvertible> * sizes: seq<int> * ?Name: string * ?ShowLegend: bool * ?Opacity: float * ?MultiOpacity: seq<float> * ?Text: 'a2 * ?MultiText: seq<'a2> * ?TextPosition: TextPosition * ?MultiTextPosition: seq<TextPosition> * ?MarkerColor: Color * ?MarkerColorScale: Colorscale * ?MarkerOutline: Line * ?MarkerSymbol: MarkerSymbol * ?MultiMarkerSymbol: seq<MarkerSymbol> * ?Marker: Marker * ?LineColor: Color * ?LineColorScale: Colorscale * ?LineWidth: float * ?LineDash: DrawingStyle * ?Line: Line * ?AlignmentGroup: string * ?OffsetGroup: string * ?StackGroup: string * ?Orientation: Orientation * ?GroupNorm: GroupNorm * ?UseWebGL: bool * ?UseDefaults: bool -> GenericChart (requires 'a2 :> IConvertible) + 1 overload static member Candlestick: ``open`` : seq<#IConvertible> * high: seq<#IConvertible> * low: seq<#IConvertible> * close: seq<#IConvertible> * ?X: seq<'a4> * ?MultiX: seq<seq<'a4>> * ?Name: string * ?ShowLegend: bool * ?Opacity: float * ?Text: 'a5 * ?MultiText: seq<'a5> * ?Line: Line * ?IncreasingColor: Color * ?Increasing: FinanceMarker * ?DecreasingColor: Color * ?Decreasing: FinanceMarker * ?WhiskerWidth: float * ?ShowXAxisRangeSlider: bool * ?UseDefaults: bool -> GenericChart (requires 'a4 :> IConvertible and 'a5 :> IConvertible) + 2 overloads static member Column: values: seq<#IConvertible> * ?Keys: seq<'a1> * ?MultiKeys: seq<seq<'a1>> * ?Name: string * ?ShowLegend: bool * ?Opacity: float * ?MultiOpacity: seq<float> * ?Text: 'a2 * ?MultiText: seq<'a2> * ?MarkerColor: Color * ?MarkerColorScale: Colorscale * ?MarkerOutline: Line * ?MarkerPatternShape: PatternShape * ?MultiMarkerPatternShape: seq<PatternShape> * ?MarkerPattern: Pattern * ?Marker: Marker * ?Base: #IConvertible * ?Width: 'a4 * ?MultiWidth: seq<'a4> * ?TextPosition: TextPosition * ?MultiTextPosition: seq<TextPosition> * ?UseDefaults: bool -> GenericChart (requires 'a1 :> IConvertible and 'a2 :> IConvertible and 'a4 :> IConvertible) + 1 overload static member Contour: zData: seq<#seq<'a1>> * ?Name: string * ?ShowLegend: bool * ?Opacity: float * ?X: seq<'a2> * ?MultiX: seq<seq<'a2>> * ?Y: seq<'a3> * ?MultiY: seq<seq<'a3>> * ?Text: 'a4 * ?MultiText: seq<'a4> * ?ColorBar: ColorBar * ?ColorScale: Colorscale * ?ShowScale: bool * ?ReverseScale: bool * ?Transpose: bool * ?ContourLineColor: Color * ?ContourLineDash: DrawingStyle * ?ContourLineSmoothing: float * ?ContourLine: Line * ?ContoursColoring: ContourColoring * ?ContoursOperation: ConstraintOperation * ?ContoursType: ContourType * ?ShowContourLabels: bool * ?ContourLabelFont: Font * ?Contours: Contours * ?FillColor: Color * ?NContours: int * ?UseDefaults: bool -> GenericChart (requires 'a1 :> IConvertible and 'a2 :> IConvertible and 'a3 :> IConvertible and 'a4 :> IConvertible) static member Funnel: x: seq<#IConvertible> * y: seq<#IConvertible> * ?Name: string * ?ShowLegend: bool * ?Opacity: float * ?Width: float * ?Offset: float * ?Text: 'a2 * ?MultiText: seq<'a2> * ?TextPosition: TextPosition * ?MultiTextPosition: seq<TextPosition> * ?Orientation: Orientation * ?AlignmentGroup: string * ?OffsetGroup: string * ?MarkerColor: Color * ?MarkerOutline: Line * ?Marker: Marker * ?TextInfo: TextInfo * ?ConnectorLineColor: Color * ?ConnectorLineStyle: DrawingStyle * ?ConnectorFillColor: Color * ?ConnectorLine: Line * ?Connector: FunnelConnector * ?InsideTextFont: Font * ?OutsideTextFont: Font * ?UseDefaults: bool -> GenericChart (requires 'a2 :> IConvertible) static member Heatmap: zData: seq<#seq<'a1>> * ?X: seq<'a2> * ?MultiX: seq<seq<'a2>> * ?Y: seq<'a3> * ?MultiY: seq<seq<'a3>> * ?Name: string * ?ShowLegend: bool * ?Opacity: float * ?XGap: int * ?YGap: int * ?Text: 'a4 * ?MultiText: seq<'a4> * ?ColorBar: ColorBar * ?ColorScale: Colorscale * ?ShowScale: bool * ?ReverseScale: bool * ?ZSmooth: SmoothAlg * ?Transpose: bool * ?UseWebGL: bool * ?ReverseYAxis: bool * ?UseDefaults: bool -> GenericChart (requires 'a1 :> IConvertible and 'a2 :> IConvertible and 'a3 :> IConvertible and 'a4 :> IConvertible) + 1 overload ...
static member Chart.Point: xy: seq<#System.IConvertible * #System.IConvertible> * ?Name: string * ?ShowLegend: bool * ?Opacity: float * ?MultiOpacity: seq<float> * ?Text: 'a2 * ?MultiText: seq<'a2> * ?TextPosition: StyleParam.TextPosition * ?MultiTextPosition: seq<StyleParam.TextPosition> * ?MarkerColor: Color * ?MarkerColorScale: StyleParam.Colorscale * ?MarkerOutline: Line * ?MarkerSymbol: StyleParam.MarkerSymbol * ?MultiMarkerSymbol: seq<StyleParam.MarkerSymbol> * ?Marker: TraceObjects.Marker * ?AlignmentGroup: string * ?OffsetGroup: string * ?StackGroup: string * ?Orientation: StyleParam.Orientation * ?GroupNorm: StyleParam.GroupNorm * ?UseWebGL: bool * ?UseDefaults: bool -> GenericChart.GenericChart (requires 'a2 :> System.IConvertible)
static member Chart.Point: x: seq<#System.IConvertible> * y: seq<#System.IConvertible> * ?Name: string * ?ShowLegend: bool * ?Opacity: float * ?MultiOpacity: seq<float> * ?Text: 'c * ?MultiText: seq<'c> * ?TextPosition: StyleParam.TextPosition * ?MultiTextPosition: seq<StyleParam.TextPosition> * ?MarkerColor: Color * ?MarkerColorScale: StyleParam.Colorscale * ?MarkerOutline: Line * ?MarkerSymbol: StyleParam.MarkerSymbol * ?MultiMarkerSymbol: seq<StyleParam.MarkerSymbol> * ?Marker: TraceObjects.Marker * ?AlignmentGroup: string * ?OffsetGroup: string * ?StackGroup: string * ?Orientation: StyleParam.Orientation * ?GroupNorm: StyleParam.GroupNorm * ?UseWebGL: bool * ?UseDefaults: bool -> GenericChart.GenericChart (requires 'c :> System.IConvertible)
static member Chart.combine: gCharts: seq<GenericChart.GenericChart> -> GenericChart.GenericChart
static member Chart.withTemplate: template: Template -> (GenericChart.GenericChart -> GenericChart.GenericChart)
module ChartTemplates from Plotly.NET
val lightMirrored: Template
static member Chart.withXAxisStyle: ?TitleText: string * ?TitleFont: Font * ?TitleStandoff: int * ?Title: Title * ?Color: Color * ?AxisType: StyleParam.AxisType * ?MinMax: (#System.IConvertible * #System.IConvertible) * ?Mirror: StyleParam.Mirror * ?ShowSpikes: bool * ?SpikeColor: Color * ?SpikeThickness: int * ?ShowLine: bool * ?LineColor: Color * ?ShowGrid: bool * ?GridColor: Color * ?GridDash: StyleParam.DrawingStyle * ?ZeroLine: bool * ?ZeroLineColor: Color * ?Anchor: StyleParam.LinearAxisId * ?Side: StyleParam.Side * ?Overlaying: StyleParam.LinearAxisId * ?Domain: (float * float) * ?Position: float * ?CategoryOrder: StyleParam.CategoryOrder * ?CategoryArray: seq<#System.IConvertible> * ?RangeSlider: LayoutObjects.RangeSlider * ?RangeSelector: LayoutObjects.RangeSelector * ?BackgroundColor: Color * ?ShowBackground: bool * ?Id: StyleParam.SubPlotId -> (GenericChart.GenericChart -> GenericChart.GenericChart)
static member Chart.withYAxisStyle: ?TitleText: string * ?TitleFont: Font * ?TitleStandoff: int * ?Title: Title * ?Color: Color * ?AxisType: StyleParam.AxisType * ?MinMax: (#System.IConvertible * #System.IConvertible) * ?Mirror: StyleParam.Mirror * ?ShowSpikes: bool * ?SpikeColor: Color * ?SpikeThickness: int * ?ShowLine: bool * ?LineColor: Color * ?ShowGrid: bool * ?GridColor: Color * ?GridDash: StyleParam.DrawingStyle * ?ZeroLine: bool * ?ZeroLineColor: Color * ?Anchor: StyleParam.LinearAxisId * ?Side: StyleParam.Side * ?Overlaying: StyleParam.LinearAxisId * ?AutoShift: bool * ?Shift: int * ?Domain: (float * float) * ?Position: float * ?CategoryOrder: StyleParam.CategoryOrder * ?CategoryArray: seq<#System.IConvertible> * ?RangeSlider: LayoutObjects.RangeSlider * ?RangeSelector: LayoutObjects.RangeSelector * ?BackgroundColor: Color * ?ShowBackground: bool * ?Id: StyleParam.SubPlotId -> (GenericChart.GenericChart -> GenericChart.GenericChart)
static member Chart.withTitle: title: Title -> (GenericChart.GenericChart -> GenericChart.GenericChart)
static member Chart.withTitle: title: string * ?TitleFont: Font -> (GenericChart.GenericChart -> GenericChart.GenericChart)
val covAB: float
val cov: v1: vector -> v2: vector -> float
<summary>Returns the sample covariance of two random variables v1 and v2. (Bessel's correction by N-1) </summary>
<remarks></remarks>
<param name="v1"></param>
<param name="v2"></param>
<returns></returns>
<example><code></code></example>
val covAC: float
val covAD: float
val covABHigh: float
val covPopAB: float
val covPopulation: v1: vector -> v2: vector -> float
<summary>Returns an estimator of the population covariance of two random variables v1 and v2 </summary>
<remarks></remarks>
<param name="v1"></param>
<param name="v2"></param>
<returns></returns>
<example><code></code></example>
val covPopAC: float
val covPopAD: float
val covPopABHigh: float
module Correlation from FSharp.Stats
<summary> Contains correlation functions for different data types </summary>
val pearsonAB: float
Multiple items
module Seq from FSharp.Stats.Correlation
<summary> Contains correlation functions optimized for sequences </summary>

--------------------
module Seq from FSharp.Stats
<summary> Module to compute common statistical measure </summary>

--------------------
module Seq from Microsoft.FSharp.Collections
<summary>Contains operations for working with values of type <see cref="T:Microsoft.FSharp.Collections.seq`1" />.</summary>

--------------------
type Seq = new: unit -> Seq static member geomspace: start: float * stop: float * num: int * ?IncludeEndpoint: bool -> seq<float> static member linspace: start: float * stop: float * num: int * ?IncludeEndpoint: bool -> seq<float>

--------------------
new: unit -> Seq
val pearson: seq1: seq<'T> -> seq2: seq<'T> -> float (requires member op_Explicit and member get_Zero and member get_One)
<summary>Calculates the pearson correlation of two samples. Homoscedasticity must be assumed.</summary>
<remarks></remarks>
<param name="seq1"></param>
<param name="seq2"></param>
<returns></returns>
<example><code></code></example>
val pearsonAC: float
val pearsonAD: float
val pearsonABHigh: float
module GenericChart from Plotly.NET
<summary> Module to represent a GenericChart </summary>
val toChartHTML: gChart: GenericChart.GenericChart -> string
val covs: string
val sprintf: format: Printf.StringFormat<'T> -> 'T
<summary>Print to a string using the given format.</summary>
<param name="format">The formatter.</param>
<returns>The formatted result.</returns>
<example>See <c>Printf.sprintf</c> (link: <see cref="M:Microsoft.FSharp.Core.PrintfModule.PrintFormatToStringThen``1" />) for examples.</example>