Data science packages

Use these packages to fuel your data science journey in F# and .NET! 🚀

Want to add a package to the curated list? File a PR

The FSharp.Data package contains type providers and utilities to access common data formats (CSV, HTML, JSON and XML) in your F# applications and scripts. It also contains helpers for parsing CSV, HTML and JSON files and for sending HTTP requests.

#r "nuget: FSharp.Data"
data access type providers http

FSharp.Data is a multipurpose project for data access from many different file formats. Most of this is done via type providers.
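
As a minimal sketch of what a type provider gives you, the snippet below uses JsonProvider to infer a type from an inline JSON sample and then parses a second document of the same shape. The sample documents and the type name Person are made up for illustration; only JsonProvider itself comes from FSharp.Data.

```fsharp
#r "nuget: FSharp.Data"

open FSharp.Data

// Hypothetical sample document: the provider infers the shape
// (one string property and one integer property) from it
type Person = JsonProvider<""" { "name": "John", "age": 94 } """>

// Parse another document that has the same shape
let person = Person.Parse(""" { "name": "Tomas", "age": 4 } """)

// Name and Age are generated at compile time and are statically typed
printfn "%s is %d years old" person.Name person.Age
```

Because the properties are generated from the sample, a typo such as person.Nmae is reported by the compiler instead of failing at run time.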

We recommend using the Http module provided by FSharp.Data to download data sources via HTTP and then convert them to Deedle data frames via Frame.ReadCsvString:

#r "nuget: FSharp.Data"
#r "nuget: Deedle"

open FSharp.Data
open Deedle

let rawData = Http.RequestString @"https://raw.githubusercontent.com/dotnet/machinelearning/master/test/data/housing.txt"

// keep the MedianHomeValue and CharlesRiver columns, then only the rows for houses on the Charles River
let df = 
    Frame.ReadCsvString(rawData, separators="\t")
    |> Frame.sliceCols ["MedianHomeValue"; "CharlesRiver"]
    |> Frame.filterRowValues (fun s -> s.GetAs<bool>("CharlesRiver"))

df.Print()

Statistical testing, linear algebra, machine learning, fitting and signal processing in F#.

#r "nuget: FSharp.Stats"
statistics linear algebra machine learning fitting signal processing

FSharp.Stats is a multipurpose project for statistical testing, linear algebra, machine learning, fitting and signal processing.

Here is a basic example that computes summary statistics of a sequence of numbers sampled from a normal distribution:

#r "nuget: FSharp.Stats"
open FSharp.Stats

// initialize a normal distribution with mean 25 and standard deviation 0.1
let normalDistribution = Distributions.Continuous.normal 25. 0.1

// draw independently 30 times from the given distribution 
let sample = Array.init 30 (fun _ -> normalDistribution.Sample())

let mean = Seq.mean sample
let stDev = Seq.stDev sample
let cv = Seq.cv sample

Deedle is an easy-to-use library for data and time series manipulation and for scientific programming.

#r "nuget: Deedle"
dataframe data exploration data access

Deedle implements efficient and robust frame and series data structures for accessing and manipulating structured data.

It supports handling of missing values, aggregations, grouping, joining, statistical functions and more. For frames and series with ordered indices (such as time series), automatic alignment is also available.
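
The missing-value handling mentioned above can be sketched with two small series. This is a minimal illustration with made-up keys and values, assuming the series function, the => operator, Stats.mean, Series.fillMissingWith and Series.observations from Deedle's F# API:

```fsharp
#r "nuget: Deedle"

open Deedle

// Two hypothetical series whose keys only partly overlap
let first = series [ 1 => 1.0; 2 => 2.0; 3 => 3.0 ]
let second = series [ 2 => 10.0; 3 => 20.0; 4 => 30.0 ]

// Arithmetic matches values by key; keys present in only one
// of the two series (1 and 4) end up as missing values
let total = first + second

// Statistical functions skip missing values
let meanOfTotal = Stats.mean total

// Missing values can also be replaced explicitly
let filled = total |> Series.fillMissingWith 0.0

filled
|> Series.observations
|> Seq.iter (fun (key, value) -> printfn "%d -> %g" key value)
```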

Here is a short snippet on how to read and manipulate an online data source (HTTP requests are done with FSharp.Data).

It reads the Boston housing data set CSV file from an online data source and keeps only the median home values of houses on the Charles River:

#r "nuget: FSharp.Data"
#r "nuget: Deedle"

open FSharp.Data
open Deedle

let rawData = Http.RequestString @"https://raw.githubusercontent.com/dotnet/machinelearning/master/test/data/housing.txt"

// keep the MedianHomeValue and CharlesRiver columns, then only the rows for houses on the Charles River
let df = 
    Frame.ReadCsvString(rawData, separators="\t")
    |> Frame.sliceCols ["MedianHomeValue"; "CharlesRiver"]
    |> Frame.filterRowValues (fun s -> s.GetAs<bool>("CharlesRiver"))

df.Print()

An F# type provider that gives strongly typed access to the R statistical language. The type provider automatically discovers available R packages and makes them accessible from F#, so you can easily call powerful packages and visualization libraries from code running on the .NET platform.

#r "nuget: RProvider"
R interop type provider visualisation statistics

The R Provider discovers R packages that are available in your R installation and makes them available as .NET namespaces underneath the parent namespace RProvider. For example, the stats package is available as RProvider.stats. If you open the namespaces you want to use, functions and values will be available as R.name.

There are three requirements to be set up on your system:

  • .NET SDK 5.0 or greater.
  • The R statistical language. Note: on Windows, there is currently a bug in R preventing us from supporting R versions greater than 4.0.2.
  • The R_HOME environment variable set to the R home directory. This can usually be identified by running the command 'R RHOME'.

Below is a simple script example that demonstrates using R statistical functions, and using R graphics functions to create charts:

#r "nuget: RProvider,2.0.1"

open RProvider
open RProvider.graphics
open RProvider.grDevices
open RProvider.datasets

// use R to calculate the mean of a list
R.mean([1;2;3;4])

// Calculate sin using the R 'sin' function
// (converting results to 'float') and plot it
[ for x in 0.0 .. 0.1 .. 3.14 -> 
    R.sin(x).GetValue<float>() ]
|> R.plot

// Plot the data from the standard 'Nile' data set
R.plot(R.Nile)

Plotly.NET provides functions for generating and rendering plotly.js charts in .NET programming languages 📈🚀.

#r "nuget: Plotly.NET"
visualization data exploration charting

Plotly.NET can be used to create plotly.js charts in the following environments:

1. Interactive charts in HTML pages
2. Interactive charts in .NET Interactive notebooks via the Plotly.NET.Interactive package
3. Static chart images via the Plotly.NET.ImageExport package

Here is a basic example that renders a simple point chart, either as an HTML page or as a static image:

#r "nuget: Plotly.NET, 2.0.0-preview.16"
#r "nuget: Plotly.NET.ImageExport, 2.0.0-preview.16"

open Plotly.NET

let myChart = 
    Chart.Point(
        [
            (1.,2.)
            (2.,3.)
            (3.,4.)
            (5.,2.)
        ]
    )
    |> Chart.withTitle "Hello from Plotly.NET!"
    |> Chart.withX_AxisStyle ("X-Axis",Showline=true,Showgrid=true)
    |> Chart.withY_AxisStyle ("Y-Axis",Showline=true,Showgrid=true)

myChart |> Chart.Show //display as html in browser 

// using static image export
open Plotly.NET.ImageExport

myChart |> Chart.savePNG("myChart",Width=600,Height=600)  //save chart as static png with 600x600 px

Here is an image of the rendered chart:

.NET interface for Cytoscape.js written in F# for graph visualization.

#r "nuget: Cyjs.NET"
cytoscape graph visualization

This package provides a lightweight layer on top of the famous Cytoscape.js library. It brings all the graph visualization capabilities directly into .NET.

Here is a small snippet that creates a basic styled graph:

#r "nuget: Cyjs.NET"

open Cyjs.NET
open Elements

let myFirstStyledGraph =     
    CyGraph.initEmpty ()
    |> CyGraph.withElements [
            node "n1" [ CyParam.label "FsLab"  ]
            node "n2" [ CyParam.label "ML" ]
 
            edge  "e1" "n1" "n2" []
        ]
    |> CyGraph.withStyle "node"     
            [
                CyParam.content =. CyParam.label
                CyParam.color "#A00975"
            ]
    |> CyGraph.withSize(800, 400)  
    |> CyGraph.show // displays the graph in a browser

Here is an image of the rendered graph:

Fsharp LInear Programming System.

#r "nuget: flips"
optimization linear programming

Flips is an F# library for modeling and solving Linear Programming (LP) and Mixed-Integer Programming (MIP) problems. It is inspired by the work of the PuLP library for Python and the excellent Gurobi Python library. It builds on the work of the outstanding Google OR-Tools library and the OPTANO library.
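
Here is a rough sketch of what a small LP model can look like. It assumes the Flips 2.x API (Decision.createContinuous, Objective.create, Constraint.create, Model.create, Model.addConstraints, Settings.basic and Solver.solve); names may differ in other versions, and the profit, stock and weight numbers are made up for illustration.

```fsharp
#r "nuget: Flips"

open Flips
open Flips.Types

// Decision variables (hypothetical): how many hamburgers and hot dogs to pack, both >= 0
let hamburgers = Decision.createContinuous "Hamburgers" 0.0 infinity
let hotdogs = Decision.createContinuous "Hotdogs" 0.0 infinity

// Objective: maximize total profit (illustrative profit per item)
let objective =
    Objective.create "MaximizeProfit" Maximize (1.5 * hamburgers + 1.2 * hotdogs)

// Constraints: limited stock per item and a maximum total weight
let maxHamburgers = Constraint.create "MaxHamburgers" (hamburgers <== 300.0)
let maxHotdogs = Constraint.create "MaxHotdogs" (hotdogs <== 200.0)
let maxWeight = Constraint.create "MaxWeight" (0.5 * hamburgers + 0.4 * hotdogs <== 200.0)

let model =
    Model.create objective
    |> Model.addConstraints [ maxHamburgers; maxHotdogs; maxWeight ]

// Settings.basic is assumed to select the default solver settings
let result = Solver.solve Settings.basic model

match result with
| Optimal solution -> printfn "Optimal decisions: %A" solution.DecisionResults
| other -> printfn "No optimal solution found: %A" other
```

The wildcard case is used on purpose, since the set of non-optimal result cases is one of the details that may vary between versions.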