Working with data frames in F#
In this section, we look at various features of the F# data frame library (using both the Series and Frame types and modules). Feel free to jump to the section you are interested in, but note that some sections refer back to values built in "Creating frames & loading data".
You can also get this page as an F# script file from GitHub and run the samples interactively.
Creating frames & loading data
Loading and saving CSV files
The easiest way to get data into a data frame is to use a CSV file. The Frame.ReadCsv function exposes this functionality:
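A minimal sketch of both ways of loading, assuming Deedle is referenced from NuGet in an F# script. The file name and the values are made-up placeholders written to the temp folder, not the data set used on this page:

```fsharp
#r "nuget: Deedle"
open System.IO
open Deedle

// Write a tiny made-up CSV file so that the sketch has something to load
let path = Path.Combine(Path.GetTempPath(), "deedle-sample-prices.csv")
File.WriteAllLines(path,
  [ "Date,Open,Close"
    "2013-01-03,27.63,27.25"
    "2013-01-02,27.25,27.62"
    "2013-01-04,27.27,26.74" ])

// First example: load the file; rows are indexed by ordinal numbers
let byOrdinal = Frame.ReadCsv(path)

// Second example: use the "Date" column as the row index and sort by it
let byDate =
  Frame.ReadCsv(path)
  |> Frame.indexRowsDate "Date"
  |> Frame.sortRowsByKey
```

Sorting by key is optional, but operations such as slicing and nearest-key lookup are only available on ordered frames.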
In the second example, we call indexRowsDate to use the "Date" column as a row index of the resulting data frame. This is a very common scenario and so Deedle provides an easier option using a generic overload of the ReadCsv method:
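As a sketch of the generic overload (again with a made-up temp file), the type argument gives the type of the row key and indexCol names the column that holds it:

```fsharp
#r "nuget: Deedle"
open System
open System.IO
open Deedle

// Placeholder data, written to the temp folder for the sake of the example
let path = Path.Combine(Path.GetTempPath(), "deedle-sample-indexed.csv")
File.WriteAllLines(path, [ "Date,Open"; "2013-01-02,27.25"; "2013-01-03,27.63" ])

// Load the file and index the rows by the "Date" column in one step
let prices = Frame.ReadCsv<DateTime>(path, indexCol="Date")
```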
The ReadCsv method has a number of optional arguments that you can use to control the loading. It supports CSV files, TSV files and other formats. If the file name ends with tsv, the Tab separator is used automatically, but you can also set separators explicitly.
The following parameters can be used:
- path - Specifies a file name or a web location of the resource.
- indexCol - Specifies the column that should be used as an index in the resulting frame. The type is specified via a type parameter.
- inferTypes - Specifies whether the method should attempt to infer types of columns automatically (set this to false if you want to specify the schema).
- inferRows - If inferTypes=true, this parameter specifies the number of rows to use for type inference. The default value is 100. Value 0 means all rows.
- schema - A string that specifies the CSV schema. See the documentation for information about the schema format.
- separators - A string that specifies one or more (single character) separators that are used to separate columns in the CSV file. Use, for example, ";" to parse semicolon-separated files.
- culture - Specifies the name of the culture that is used when parsing values in the CSV file (such as "en-US"). The default is invariant culture.
The parameters are the same as those used by the CSV type provider in F# Data, so you can find additional documentation there.
Once you have a data frame, you can also save it to a CSV file using the SaveCsv method. For example:
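A small sketch of saving, assuming a frame built in memory (the values and file names are placeholders). The second call writes the row key into a column named "Date" and uses Tab as the separator:

```fsharp
#r "nuget: Deedle"
open System
open System.IO
open Deedle

// A small frame indexed by date (placeholder values)
let prices =
  frame [ "Open" => series [ DateTime(2013, 1, 2) => 27.25
                             DateTime(2013, 1, 3) => 27.63 ] ]

// Save without row keys (the default behaviour)
let plainPath = Path.Combine(Path.GetTempPath(), "deedle-plain.csv")
prices.SaveCsv(plainPath)

// Save with the row key in a "Date" column, separated by Tab
let keyedPath = Path.Combine(Path.GetTempPath(), "deedle-keyed.tsv")
prices.SaveCsv(keyedPath, keyNames=["Date"], separator='\t')
```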
By default, the SaveCsv method does not include the key from the data frame. This can be overridden by calling SaveCsv with the optional argument includeRowKeys=true, or with an additional argument keyNames (demonstrated above) which sets the headers for the key column(s) in the CSV file. Usually, there is just a single row key, but there may be multiple when hierarchical indexing is used.
Loading F# records or .NET objects
If you have another .NET or F# component that returns data as a sequence of F# records, C# anonymous types or other .NET objects, you can use Frame.ofRecords to turn them into a data frame. Assume we have:
Now we can easily create a data frame that contains three columns (Name, Age and Countries) containing data of the same type as the properties of Person:
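The two steps can be sketched as follows. The record values are chosen to match the sample outputs shown on this page; the names peopleRecds and peopleList are illustrative:

```fsharp
#r "nuget: Deedle"
open Deedle

// An F# record describing one person
type Person = { Name: string; Age: int; Countries: string list }

let peopleRecds =
  [ { Name = "Joe"; Age = 51; Countries = [ "UK"; "US"; "UK" ] }
    { Name = "Tomas"; Age = 28; Countries = [ "CZ"; "UK"; "US"; "CZ" ] }
    { Name = "Eve"; Age = 2; Countries = [ "FR" ] }
    { Name = "Suzanne"; Age = 15; Countries = [ "US" ] } ]

// Turn the records into a frame with columns Name, Age and Countries
let peopleList = Frame.ofRecords peopleRecds

// Use the "Name" column as the row key
let people = peopleList |> Frame.indexRowsString "Name"
```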
Note that this does not perform any conversion on the column data. Numerical series can be accessed using the ? operator. For other types, we need to explicitly call GetColumn with the right type arguments:
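A short sketch of both kinds of access, assuming the people frame is built from records as described above:

```fsharp
#r "nuget: Deedle"
open Deedle

type Person = { Name: string; Age: int; Countries: string list }

let people =
  [ { Name = "Joe"; Age = 51; Countries = [ "UK"; "US"; "UK" ] }
    { Name = "Tomas"; Age = 28; Countries = [ "CZ"; "UK"; "US"; "CZ" ] }
    { Name = "Eve"; Age = 2; Countries = [ "FR" ] }
    { Name = "Suzanne"; Age = 15; Countries = [ "US" ] } ]
  |> Frame.ofRecords
  |> Frame.indexRowsString "Name"

// The ? operator returns a numerical column as Series<string, float>
let ages = people?Age

// Other column types need an explicit type argument
let countries = people.GetColumn<string list>("Countries")
```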
F# Data providers
In general, you can use any data source that exposes data as a series of tuples. This means that we can easily load data using, for example, the World Bank type provider from the F# Data library.
To make data manipulation more convenient, we read country information per region and create a data frame with a hierarchical index (for more information, see the advanced indexing section). Now we can easily read data for OECD and Euro area:
| | (Euro area, Austria) | (Euro area, Belgium) | (Euro area, Cyprus) | ... | (OECD members, Sweden) | (OECD members, Turkey) | (OECD members, United States) |
|---|---|---|---|---|---|---|---|
| 1960 | 12.46 | 14.45 | N/A | ... | 76.79 | | 543.3 |
| 1961 | 13.82 | 15.37 | N/A | ... | 83.53 | | 563.3 |
| 1962 | 14.66 | 16.44 | N/A | ... | 90.59 | | 605.1 |
| 1963 | 15.82 | 17.67 | N/A | ... | 98.05 | | 638.6 |
| 1964 | 17.33 | 19.78 | N/A | ... | 109.35 | | 685.8 |
| 1965 | 18.88 | 21.53 | N/A | ... | 120.33 | | 743.7 |
| 1966 | 20.57 | 23.12 | N/A | ... | 130.89 | | 815 |
| 1967 | 21.88 | 24.78 | N/A | ... | 142.07 | | 861.7 |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 2015 | 344.26 | 411.1 | 17.75 | ... | 4201.54 | 2338.65 | 18219.3 |
| 2016 | 356.24 | 424.6 | 18.49 | ... | 4385.5 | 2608.53 | 18707.19 |
| 2017 | 369.9 | 439.17 | 19.65 | ... | 4578.83 | 3106.54 | 19485.39 |
| 2018 | 386.09 | 450.51 | 20.73 | ... | 4789.85 | 3700.99 | 20494.1 |
The loaded data look something like the sample above. As you can see, the columns are grouped by the region and some data are not available.
Expanding objects in columns
It is possible to create data frames that contain other .NET objects as members in a series. This might be useful, for example, when you get multiple data sources producing objects and you want to align or join them before working with them. However, working with frames that contain complex .NET objects is less convenient.
For this reason, the data frame supports expansion. Given a data frame with some object in a column, you can use Frame.expandCols to create a new frame that contains properties of the object as new columns. For example:
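A sketch of the expansion, assuming a frame with a single column named People that holds Person records (the name peopleNested is illustrative):

```fsharp
#r "nuget: Deedle"
open Deedle

type Person = { Name: string; Age: int; Countries: string list }

let peopleRecds =
  [ { Name = "Joe"; Age = 51; Countries = [ "UK"; "US"; "UK" ] }
    { Name = "Tomas"; Age = 28; Countries = [ "CZ"; "UK"; "US"; "CZ" ] }
    { Name = "Eve"; Age = 2; Countries = [ "FR" ] }
    { Name = "Suzanne"; Age = 15; Countries = [ "US" ] } ]

// A frame with one column that contains whole Person objects
let peopleNested = [ "People" => Series.ofValues peopleRecds ] |> frame

// Replace the People column with one column per property
let expanded = peopleNested |> Frame.expandCols [ "People" ]
```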
| | People.Name | People.Age | People.Countries |
|---|---|---|---|
| 0 | Joe | 51 | [UK; US; UK] |
| 1 | Tomas | 28 | [CZ; UK; US; ... ] |
| 2 | Eve | 2 | [FR] |
| 3 | Suzanne | 15 | [US] |
As you can see, the operation generates columns based on the properties of the original column type and generates new names by prefixing the property names with the name of the original column.
Aside from properties of .NET objects, the expansion can also handle values of type IDictionary<K, V> and series that contain nested series with string keys (i.e. Series<string, T>). If you have a more complex structure, you can use Frame.expandAllCols to expand columns to a specified level recursively:
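A sketch of recursive expansion, assuming a series of dictionaries whose C entry is a tuple (the keys A, B and C and the values are illustrative):

```fsharp
#r "nuget: Deedle"
open Deedle

// Series of dictionaries; the "C" value is a tuple nested in the dictionary
let tuples =
  [ dict [ "A", box 1; "C", box (2, 3) ]
    dict [ "B", box 1; "C", box (3, 4) ] ]
  |> Series.ofValues

// Expand dictionary keys (level 1) and then tuple items (level 2)
let expandedTuples =
  frame [ "Tuples" => tuples ]
  |> Frame.expandAllCols 2
```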
Here, the resulting data frame will have 4 columns including Tuples.A and Tuples.B (for the first keys) and Tuples.C.Item1 together with Tuples.C.Item2 representing the two items of the tuple nested in a dictionary.
Manipulating data frames
The series type Series<K, V> represents a series with keys of type K and values of type V. This means that when working with series, the type of values is known statically. When working with data frames, this is not the case - a frame is represented as Frame<R, C> where R and C are the types of row and column indices, respectively (typically, R will be an int or DateTime and C will be string representing different column/series names).
A frame can contain heterogeneous data. One column may contain integers, another may contain floating point values and yet another can contain strings, dates or other objects like lists of strings. This information is not captured statically - and so when working with frames, you may need to specify the type explicitly, for example, when reading a series from a frame.
Getting data from a frame
We'll use the data frame people which contains three columns - Name of type string, Age of type int and Countries of type string list (we created it from F# records in the previous section):

| | Age | Countries | AgePlusOne | Siblings |
|---|---|---|---|---|
| Joe | 51 | [UK; US; UK] | 52 | 3 |
| Tomas | 28 | [CZ; UK; US; ... ] | 29 | 2 |
| Eve | 2 | [FR] | 3 | 1 |
| Suzanne | 15 | [US] | 16 | 0 |
To get a column (series) from a frame df, you can use operations that are exposed directly by the data frame, or you can use df.Columns which returns all columns of the frame as a series of series.
A series s of type Series<string, V> supports the question mark operator s?Foo to get a value of type V associated with the key Foo. For other key types, you can use the Get method. Note that, unlike with frames, there is no implicit conversion:
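A sketch of these lookups on a small series with string keys (the values mirror the Age column; the key Fridrich is deliberately absent):

```fsharp
#r "nuget: Deedle"
open Deedle

// Series<string, float> keyed by name
let ages = series [ "Joe" => 51.0; "Tomas" => 28.0; "Eve" => 2.0; "Suzanne" => 15.0 ]

// Question mark operator: the value keeps its type (float), no conversion happens
let tomas = ages?Tomas

// The Get method does the same and also works for non-string keys
let tomasAgain = ages.Get("Tomas")

// TryGet returns an OptionalValue that is empty when the key is not found
let missing = ages.TryGet("Fridrich")
```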
The question mark operator and Get method can be used on the Columns property of a data frame. The return type of df.Columns is ColumnSeries<string, string> which is just a thin wrapper over Series<C, ObjectSeries<R>>. This means that you get back a series indexed by column names where the values are ObjectSeries<R> representing individual columns. The type ObjectSeries<R> is a thin wrapper over Series<R, obj> which adds several functions for getting the values as values of a specified type.
In our case, the returned values are individual columns represented as ObjectSeries<string>:
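A sketch of working with Columns, assuming the people frame built from records earlier. The column name CreditCard is a hypothetical name that does not exist in the frame:

```fsharp
#r "nuget: Deedle"
open Deedle

type Person = { Name: string; Age: int; Countries: string list }

let people =
  [ { Name = "Joe"; Age = 51; Countries = [ "UK"; "US"; "UK" ] }
    { Name = "Tomas"; Age = 28; Countries = [ "CZ"; "UK"; "US"; "CZ" ] }
    { Name = "Eve"; Age = 2; Countries = [ "FR" ] }
    { Name = "Suzanne"; Age = 15; Countries = [ "US" ] } ]
  |> Frame.ofRecords
  |> Frame.indexRowsString "Name"

// Get columns as ObjectSeries<string> values
let ageObj = people.Columns?Age
let countriesObj = people.Columns.Get("Countries")

// TryGet returns an empty OptionalValue for a column that does not exist
let noColumn = people.Columns.TryGet("CreditCard")

// Convert an object series to a series with a statically known value type
let typedAges = people.Columns.["Age"].As<int>()

// TryAs returns an OptionalValue instead of failing when conversion is not possible
let maybeFloats = people.Columns.["Age"].TryAs<float>()
```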
The type ObjectSeries<string> has a few methods in addition to those of the ordinary Series<K, V> type. On lines 18 and 20, we use As<T> and TryAs<T> that can be used to convert object series to a series with statically known type of values. The expression on line 18 is equivalent to people.GetColumn<obj>("Age"), but it is not specific to frame columns - you can use the same approach to work with frame rows (using people.Rows) if your data set has rows of homogeneous types.
Another case where you'll need to work with ObjectSeries<T> is when mapping over rows:
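A sketch of mapping over rows, assuming the people frame from earlier; each row is an ObjectSeries<string> keyed by column name:

```fsharp
#r "nuget: Deedle"
open Deedle

type Person = { Name: string; Age: int; Countries: string list }

let people =
  [ { Name = "Joe"; Age = 51; Countries = [ "UK"; "US"; "UK" ] }
    { Name = "Tomas"; Age = 28; Countries = [ "CZ"; "UK"; "US"; "CZ" ] }
    { Name = "Eve"; Age = 2; Countries = [ "FR" ] }
    { Name = "Suzanne"; Age = 15; Countries = [ "US" ] } ]
  |> Frame.ofRecords
  |> Frame.indexRowsString "Name"

// For every row, read the Countries value as a typed list and count its items
let countryCounts =
  people.Rows
  |> Series.mapValues (fun row ->
       row.GetAs<string list>("Countries") |> List.length)
```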
The rows that you get as a result of people.Rows are heterogeneous (they contain values of different types), so we cannot use row.As<T>() to convert all values of the series to some type. Instead, we use GetAs<T>(...) which is similar to Get(...) but converts the value to a given type. You could also achieve the same thing by writing row?Countries and then casting the result to string list, but the GetAs method provides a more convenient syntax.
Typed access to rows
Accessing columns using ObjectSeries<T> is fine for simple tasks, but it has two problems. First, it is not type-safe and you can easily get a runtime exception if you specify the wrong type. Second, it involves boxing and unboxing and so it may be inefficient.
To address these two issues, Deedle provides another alternative. You can specify an interface that defines the types of columns once and then use this interface to get a series of rows where every row is an instance of the interface:
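A sketch of typed row access, assuming the people frame from earlier. The interface members must match the column names and types of the frame:

```fsharp
#r "nuget: Deedle"
open Deedle

type Person = { Name: string; Age: int; Countries: string list }

let people =
  [ { Name = "Joe"; Age = 51; Countries = [ "UK"; "US"; "UK" ] }
    { Name = "Tomas"; Age = 28; Countries = [ "CZ"; "UK"; "US"; "CZ" ] }
    { Name = "Eve"; Age = 2; Countries = [ "FR" ] }
    { Name = "Suzanne"; Age = 15; Countries = [ "US" ] } ]
  |> Frame.ofRecords
  |> Frame.indexRowsString "Name"

// The interface describes the columns that we want to access in a typed way
type IPerson =
  abstract Age : int
  abstract Countries : string list

// Series<string, IPerson> where every row is exposed through the interface
let rows = people.GetRowsAs<IPerson>()
let tomasCountries = rows.["Tomas"].Countries
```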
You still need to be careful and define the types in the IPerson interface correctly, but once the GetRowsAs<IPerson> call returns a value, you will be able to access the rows in a nice typed way. Alternatively, you can also specify the type with OptionalValue<T>, in case you want to explicitly handle missing values.
Adding rows and columns
The series type is immutable and so it is not possible to add new values to a series or change the values stored in an existing series. However, you can use operations that return a new series as the result such as Merge.
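A sketch of Merge on a small series (the extra key John and its value are illustrative). The original series is left unchanged:

```fsharp
#r "nuget: Deedle"
open Deedle

let ages = series [ "Joe" => 51.0; "Tomas" => 28.0; "Eve" => 2.0; "Suzanne" => 15.0 ]

// Another series with a key that does not appear in the first one
let more = series [ "John" => 48.0 ]

// Merge returns a new series; 'ages' still has four observations
let merged = ages.Merge(more)
```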
A data frame allows a very limited form of mutation. It is possible to add new series (as a column) to an existing data frame, drop a series or replace a series. However, individual series are still immutable.
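A sketch of the supported mutations, assuming the people frame from earlier. The AgePlusOne and Siblings columns match the sample output shown above; the Temp column is a hypothetical name used only to show dropping:

```fsharp
#r "nuget: Deedle"
open Deedle

type Person = { Name: string; Age: int; Countries: string list }

let people =
  [ { Name = "Joe"; Age = 51; Countries = [ "UK"; "US"; "UK" ] }
    { Name = "Tomas"; Age = 28; Countries = [ "CZ"; "UK"; "US"; "CZ" ] }
    { Name = "Eve"; Age = 2; Countries = [ "FR" ] }
    { Name = "Suzanne"; Age = 15; Countries = [ "US" ] } ]
  |> Frame.ofRecords
  |> Frame.indexRowsString "Name"

// Add a column calculated from an existing numerical column
people?AgePlusOne <- people?Age + 1.0

// Add a column from a list of values (matched to rows by position)
people?Siblings <- [ 0; 2; 1; 3 ]

// Replace an existing column with new values
people.ReplaceColumn("Siblings", [ 3; 2; 1; 0 ])

// Add a temporary column and drop it again
people?Temp <- [ 1; 2; 3; 4 ]
people.DropColumn("Temp")
```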
Finally, it is also possible to append one data frame or a single row to an existing data frame. The operation is immutable, so the result is a new data frame with the added rows. To create a new row for the data frame, we can use standard ways of constructing series from key-value pairs, or we can use the SeriesBuilder type:
Advanced slicing and lookup
Given a series, we have a number of options for getting one or more values or observations (keys and associated values) from the series. First, let's look at different lookup operations that are available on any (even unordered) series.
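A sketch of basic lookup on two small series (the keys and values are made up for illustration):

```fsharp
#r "nuget: Deedle"
open Deedle

// Sample series with different types of keys and values
let nums = series [ 1 => 10.0; 2 => 20.0 ]
let strs = series [ "en" => "Hi"; "cz" => "Ahoj" ]

// Lookup using the indexer
let ten = nums.[1]
let hi = strs.["en"]

// The question mark operator is supported when the key is a string
let ahoj = strs?cz
```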
For more examples, we use the Age column from the earlier data set:
The Series module provides another set of useful functions (many of those are also available as members, for example via ages.TryGet):
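A sketch that puts the member-based and module-based lookups side by side, using a series with the same values as the Age column (the key John is deliberately absent):

```fsharp
#r "nuget: Deedle"
open Deedle

let ages = series [ "Joe" => 51.0; "Tomas" => 28.0; "Eve" => 2.0; "Suzanne" => 15.0 ]

// Members: indexer, multiple keys, optional lookup
let tomas = ages.["Tomas"]
let two = ages.GetItems([ "Tomas"; "Joe" ])
let johnOpt = ages.TryGet("John")          // OptionalValue without a value

// Series module: the same operations as functions
let tomas' = ages |> Series.get "Tomas"
let two' = ages |> Series.getAll [ "Tomas"; "Joe" ]
let john = ages |> Series.tryGet "John"    // None
```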
We can also obtain all data from the series. The data frame library uses the term observations for all key-value pairs.
The previous examples were always looking for an exact key. If we have an ordered series, we can search for the nearest available key and we can also perform slicing. We use MSFT stock prices from the earlier example:
When using instance members, we can use Get which has an overload taking Lookup. The same functionality is exposed using Series.lookup. We can also obtain values for a sequence of keys:
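A sketch of nearest-key lookup on a small ordered series. The name opens and the choice of dates are assumptions; the values are taken from the sample output on this page:

```fsharp
#r "nuget: Deedle"
open System
open Deedle

// Ordered series of prices; there is no value for January 1, 5 or 6
let opens =
  series [ DateTime(2013, 1, 2) => 27.25
           DateTime(2013, 1, 3) => 27.63
           DateTime(2013, 1, 4) => 27.27
           DateTime(2013, 1, 7) => 26.77 ]

let jan1 = DateTime(2013, 1, 1)

// Exact lookup finds nothing for January 1
let exact = opens.TryGet(jan1)

// Use the nearest greater key when the exact key is not available
let next = opens.Get(jan1, Lookup.ExactOrGreater)
let next' = opens |> Series.lookup jan1 Lookup.ExactOrGreater

// Look up several keys at once using the same lookup behaviour
let several =
  opens |> Series.lookupAll [ jan1; DateTime(2013, 1, 5) ] Lookup.ExactOrGreater
```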
With ordered series, we can use slicing to get a sub-range of a series:
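A sketch of slicing on an ordered series (placeholder values; the name opens is an assumption). The bounds do not have to be keys of the series:

```fsharp
#r "nuget: Deedle"
open System
open Deedle

let opens =
  series [ DateTime(2013, 1, 2) => 27.25
           DateTime(2013, 1, 3) => 27.63
           DateTime(2013, 1, 31) => 27.79
           DateTime(2013, 2, 1) => 27.67 ]

// Get all observations from January 2013, bounds included
let january = opens.[DateTime(2013, 1, 1) .. DateTime(2013, 1, 31)]
```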
| Keys | 1/2/2013 | 1/3/2013 | 1/4/2013 | 1/7/2013 | 1/8/2013 | ... | 1/29/2013 | 1/30/2013 | 1/31/2013 |
|---|---|---|---|---|---|---|---|---|---|
| Values | 27.25 | 27.63 | 27.27 | 26.77 | 26.75 | ... | 27.82 | 28.01 | 27.79 |
The slicing works even if the keys are not available in the series. The lookup automatically uses the nearest greater key for the lower bound and the nearest smaller key for the upper bound (here, we have no value for January 1).
Several other options - discussed in a later section - are available when using hierarchical (or multi-level) indices. But first, we need to look at grouping.
Grouping data
Grouping of data can be performed on both unordered and ordered series and frames. For ordered series, more options (such as floating window or grouping of consecutive elements) are available - these can be found in the time series tutorial. There are essentially two options:
- You can group series of any values and get a series of series (representing individual groups). The result can easily be turned into a data frame using Frame.ofColumns or Frame.ofRows, but this is not done automatically.
- You can group frame rows using values in a specified column, or using a function. The result is a frame with a multi-level (hierarchical) index. Hierarchical indexing is discussed later.
Keep in mind that you can easily get a series of rows or a series of columns from a frame using df.Rows and df.Columns, so the first option is also useful on data frames.
Grouping series
In the following sample, we use the data frame people loaded from F# records in an earlier section. Let's first get the data:
Now we can group the elements using both key (e.g. length of a name) and using the value (e.g. the number of visited countries):
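A sketch of both kinds of grouping. The series travels is an assumed name for the Countries column, built here directly from key-value pairs:

```fsharp
#r "nuget: Deedle"
open Deedle

// Visited countries keyed by name (same data as the Countries column)
let travels =
  series [ "Joe" => [ "UK"; "US"; "UK" ]
           "Tomas" => [ "CZ"; "UK"; "US"; "CZ" ]
           "Eve" => [ "FR" ]
           "Suzanne" => [ "US" ] ]

// Group by the length of the name (key) and count people in each group
let byNameLength =
  travels
  |> Series.groupBy (fun name _ -> name.Length)
  |> Series.mapValues Series.countKeys

// Group by the number of visited countries (value); groupInto applies
// the projection directly, without building intermediate series of series
let byCountryCount =
  travels
  |> Series.groupInto
       (fun _ countries -> List.length countries)
       (fun _ group -> Series.countKeys group)
```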
The groupBy function returns a series of series (series with new keys, containing series with all values for a given new key). You can then transform the values using Series.mapValues. However, if you want to avoid allocating all intermediate series, you can also use Series.groupInto which takes a projection function as a second argument. In the above examples, we count the number of keys in each group.
As a final example, let's say that we want to build a data frame that contains individual people (as rows) and all countries that appear in someone's travel list (as columns). The frame contains the number of visits to each country by each person:
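A sketch of building the visits frame, assuming the same travels series as in the grouping example (the name is an assumption):

```fsharp
#r "nuget: Deedle"
open Deedle

let travels =
  series [ "Joe" => [ "UK"; "US"; "UK" ]
           "Tomas" => [ "CZ"; "UK"; "US"; "CZ" ]
           "Eve" => [ "FR" ]
           "Suzanne" => [ "US" ] ]

let visits =
  travels
  // Turn each list into a series of (country, number of visits)
  |> Series.mapValues (Seq.countBy id >> series)
  // Use people as rows and countries as columns
  |> Frame.ofRows
  // Countries that a person did not visit are missing; replace them with zero
  |> Frame.fillMissingWith 0
```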
| | UK | US | CZ | FR |
|---|---|---|---|---|
| Joe | 2 | 1 | 0 | 0 |
| Tomas | 1 | 1 | 2 | 0 |
| Eve | 0 | 0 | 0 | 1 |
| Suzanne | 0 | 1 | 0 | 0 |
The problem can be solved just using Series.mapValues, together with standard F# Seq functions. We iterate over all rows (people and their countries). For each country list, we generate a series that contains individual countries and the count of visits (this is done by composing Seq.countBy and a function series to build a series of observations). Then we turn the result to a data frame and fill missing values with the constant zero (see a section about handling missing values).
Grouping data frames
So far, we worked with series and series of series (which can be turned into data frames using Frame.ofRows and Frame.ofColumns). Next, we look at working with data frames.
Assume we loaded the Titanic data set that is also used on the project home page. First, let's look at basic grouping (also used in the home page demo):
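A sketch of the grouping functions on a tiny made-up stand-in for the Titanic data. The Passenger record and all of its values are hypothetical, and the column names are chosen to resemble the real data set:

```fsharp
#r "nuget: Deedle"
open Deedle

// Hypothetical stand-in for a few rows of the Titanic data set
type Passenger = { Sex: string; Pclass: int; Survived: bool; Age: float; Fare: decimal }

let titanic =
  [ { Sex = "male"; Pclass = 3; Survived = false; Age = 22.0; Fare = 7.25M }
    { Sex = "female"; Pclass = 1; Survived = true; Age = 38.0; Fare = 71.28M }
    { Sex = "female"; Pclass = 3; Survived = true; Age = 26.0; Fare = 7.92M }
    { Sex = "male"; Pclass = 1; Survived = false; Age = 54.0; Fare = 51.86M } ]
  |> Frame.ofRecords

// Group rows by a string column; the row key becomes (sex, original key)
let bySex = titanic |> Frame.groupRowsByString "Sex"

// For a less common type, state the key type explicitly
let byFare : Frame<decimal * int, string> = titanic |> Frame.groupRowsBy "Fare"

// Alternatively, compute the group key from the row using a function
let byFare' = titanic |> Frame.groupRowsUsing (fun _ row -> row.GetAs<decimal>("Fare"))
```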
When working with frames, you can group data using both rows and columns. For most functions there are groupRows and groupCols equivalents. The easiest functions to use are Frame.groupRowsByXyz where Xyz specifies the type of the column that we're using for grouping. For example, we can easily group rows using the "Sex" column.
When using a less common type, you need to specify the type of the column. You can see this on lines 5 and 9 where we use decimal as the key. Finally, you can also specify a key selector as a function. The function gets the original key and the row as a value of ObjectSeries<K>. The type has various members for getting individual values (columns) such as GetAs which allows us to get a column of a specified type.
Grouping by single key
A grouped data frame uses a multi-level index. This means that the index is a tuple of keys that represent multiple levels. For example:
As you can see, the pretty printer understands multi-level indices and outputs the first level (sex) followed by the second level (passenger id). You can turn a frame with a two-level index into a series of data frames (and vice versa) using Frame.unnest and Frame.nest:
Grouping by multiple keys
Finally, we can also apply the grouping operation repeatedly to group data using multiple keys (and get a frame indexed by more than 2 levels). For example, we can group passengers by their class and port where they embarked:
If you look at the type of byClassAndPort, you can see that it is Frame<(string * int * int),string>. The row key is a triple consisting of port identifier (string), passenger class (int between 1 and 3) and the passenger id. The multi-level indexing is preserved when we get a single series from the frame.
As our last example, we look at various ways of aggregating the groups:
The second snippet combines a number of useful functions. It uses Frame.getNumericCols to obtain just numerical columns from a data frame. Then it drops the non-numerical columns using Series.dropMissing. Then we use Series.mapValues to apply the averaging operation to all columns.
The last snippet is also interesting. We get the "Survived" column (which contains Boolean values) and we aggregate each group using a specified function. The function is composed from three components - it first gets the values in the group, counts them (to get a number of true and false values) and then creates a series with the results. The result looks like the following table (some values were omitted):
Summarizing data with pivot table
In the previous section, we looked at grouping, which is a very general data manipulation operation. However, very often we want to perform two operations at the same time - group the data by certain keys and produce an aggregate. This combination is captured by the concept of a pivot table.
A pivot table is a useful tool if you want to summarize data in the frame based on two keys that are available in the rows of the data frame.
For example, given the titanic data set that we loaded earlier and explored in the previous section, we might want to compare the survival rate for males and females. The pivot table makes this possible using just a single call:
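A sketch of the call on a tiny made-up stand-in for the Titanic data (the Passenger record and its values are hypothetical, so the counts differ from the table on this page):

```fsharp
#r "nuget: Deedle"
open Deedle

// Hypothetical stand-in for a few rows of the Titanic data set
type Passenger = { Sex: string; Survived: bool; Age: float }

let titanic =
  [ { Sex = "male"; Survived = false; Age = 22.0 }
    { Sex = "female"; Survived = true; Age = 38.0 }
    { Sex = "female"; Survived = true; Age = 26.0 }
    { Sex = "male"; Survived = false; Age = 54.0 } ]
  |> Frame.ofRecords

// Rows of the result are keyed by Sex, columns by Survived,
// and each cell contains the number of matching rows
let counts =
  titanic
  |> Frame.pivotTable
       (fun _ row -> row.GetAs<string>("Sex"))
       (fun _ row -> row.GetAs<bool>("Survived"))
       Frame.countRows
```

With this sample, combinations that never occur (such as surviving males) are missing values in the result rather than zeros.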
The pivotTable function (and the corresponding PivotTable method) takes three arguments. The first two specify functions that, given a row in the original frame, return a new row key and column key, respectively. In the above example, the new row key is the Sex value and the new column key is whether a person survived or not. As a result we get the following two by two table:
| | False | True |
|---|---|---|
| male | 468 | 109 |
| female | 81 | 233 |
Note, we could also use the PivotTable member method along with a type annotation on the result for readability:
The pivot table operation takes the source frame, partitions the data (rows) based on the new row and column keys and then aggregates each frame using the specified aggregation. In the above example, we used Frame.countRows to simply return the number of people in each sub-group. However, we could easily calculate other statistics - such as average age:
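A sketch of a different aggregation on the same kind of made-up sample (hypothetical values, so the averages differ from the table on this page). The third argument receives the sub-frame for each cell:

```fsharp
#r "nuget: Deedle"
open Deedle

// Hypothetical stand-in for a few rows of the Titanic data set
type Passenger = { Sex: string; Survived: bool; Age: float }

let titanic =
  [ { Sex = "male"; Survived = false; Age = 22.0 }
    { Sex = "female"; Survived = true; Age = 38.0 }
    { Sex = "female"; Survived = true; Age = 26.0 }
    { Sex = "male"; Survived = false; Age = 54.0 } ]
  |> Frame.ofRecords

// Average age for each combination of Sex and Survived
let meanAge =
  titanic
  |> Frame.pivotTable
       (fun _ row -> row.GetAs<string>("Sex"))
       (fun _ row -> row.GetAs<bool>("Survived"))
       (fun group -> group?Age |> Stats.mean)
```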
The results suggest that older males were less likely to survive than younger males, but older females were more likely to survive than younger females:

| | False | True |
|---|---|---|
| male | 32 | 27 |
| female | 25 | 29 |
Hierarchical indexing
For some data sets, the index is not a simple sequence of keys, but instead a more complex hierarchy. This can be captured using hierarchical indices. They also provide a convenient way of dealing with multi-dimensional data. The most common source of multi-level indices is grouping (the previous section has a number of examples).
Lookup in the World Bank data set
In this section, we start by looking at the World Bank data set from earlier. It is a data frame with a two-level hierarchy of columns, where the first level is the name of region and the second level is the name of country.
Basic lookup can be performed using slicing operators. The following are only available in F# 3.1:
In F# 3.0, you can use a family of helper functions LookupXOfY as follows:
The lookup operations always return a data frame of the same type as the original frame. This means that even if you select one sub-group, you get back a frame with the same multi-level hierarchy of keys. This can be easily changed using projection on keys:
Grouping and aggregating World Bank data
Hierarchical keys are often created as a result of grouping. For example, we can group the rows (representing individual years) in the Euro zone data set by decades (for more information about grouping, see also the grouping section in this document).
Now that we have a data frame with a hierarchical index, we can select data in a single group, such as 1990s. The result is a data frame of the same type. We can also multiply the values, to get original GDP in USD (rather than billions):
The Frame and Series modules provide a number of functions for aggregating the groups. We can access a specific country and aggregate GDP for a country, or we can apply aggregation to the entire data set:
So far, we were working with data frames that only had one hierarchical index. However, it is perfectly possible to have a hierarchical index for both rows and columns. The following snippet groups countries by their average GDP (in addition to grouping rows by decades):
You can see (by hovering over byGDP) that the two hierarchies are captured in the type. The column key is bool * string (rich? and name) and the row key is string * int (decade, year). This creates two groups of columns: one containing France, Germany and Italy, and the other containing the remaining countries.
The aggregations are only (directly) supported on rows, but we can use Frame.transpose to switch between rows and columns.
Handling missing values
The support for missing values is built-in, which means that any series or frame can contain missing values. When constructing series or frames from data, certain values are automatically treated as "missing values". This includes Double.NaN, null values for reference types and for nullable types:
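A one-line sketch with Double.NaN, using the values shown in the output below:

```fsharp
#r "nuget: Deedle"
open System
open Deedle

// NaN is treated as a missing value, so only two of the three keys have a value
let withNaN = Series.ofValues [ Double.NaN; 1.0; 3.14 ]
```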
| Keys | 0 | 1 | 2 |
|---|---|---|---|
| Values | N/A | 1 | 3.14 |
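A similar sketch with nullable values, again using the values shown in the output below:

```fsharp
#r "nuget: Deedle"
open System
open Deedle

// A Nullable without a value becomes a missing value in the series
let withNulls =
  [ Nullable(1); Nullable(); Nullable(3) ]
  |> Series.ofNullables
```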
| Keys | 0 | 1 | 2 |
|---|---|---|---|
| Values | 1 | N/A | 3 |
Missing values are automatically skipped when performing statistical computations such as Series.mean. They are also ignored by projections and filtering, including Series.mapValues. When you want to handle missing values, you can use Series.mapAll that gets the value as option<T> (we use a sample data set from an earlier section):
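A sketch of Series.mapAll on a small made-up series (the name ozone and its values are placeholders, not the data set used on this page):

```fsharp
#r "nuget: Deedle"
open Deedle

// Placeholder measurements; NaN becomes a missing value
let ozone = series [ 1 => 41.0; 2 => nan; 3 => 12.0 ]

// The function sees missing values as None and may return a replacement
let filled =
  ozone |> Series.mapAll (fun _ value ->
    match value with
    | None -> Some 0.0
    | available -> available)
```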
In practice, you will not need to use Series.mapAll very often, because the series module provides functions that fill missing values more easily:
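A sketch of the filling functions on a small made-up series (the name ozone and its values are placeholders):

```fsharp
#r "nuget: Deedle"
open Deedle

let ozone = series [ 1 => nan; 2 => 41.0; 3 => nan; 4 => 12.0; 5 => nan ]

// Fill missing values with a constant
let withZeros = ozone |> Series.fillMissingWith 0.0

// Fill missing values with the previous available value
let forward = ozone |> Series.fillMissing Direction.Forward

// Fill missing values with the next available value
let backward = ozone |> Series.fillMissing Direction.Backward

// Drop the keys that have no value
let dropped = ozone |> Series.dropMissing
```

Note that directional filling leaves a value missing when there is nothing to copy from, for example the first key when filling forward.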
Various other strategies for handling missing values are not currently directly supported by the library, but can be easily added using Series.fillMissingUsing. It takes a function and calls it on all missing values. If we have an interpolation function, then we can pass it to fillMissingUsing and perform any interpolation needed.
For example, the following snippet gets the previous and next values and averages them (if they are available) or returns one of them (or zero if there are no values at all):
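A sketch of this strategy on a small made-up ordered series (the name ozone and its values are placeholders):

```fsharp
#r "nuget: Deedle"
open Deedle

let ozone = series [ 1 => nan; 2 => 41.0; 3 => nan; 4 => 12.0; 5 => nan ]

let interpolated =
  ozone |> Series.fillMissingUsing (fun key ->
    // Find the nearest available values before and after the missing key
    let prev = ozone.TryGet(key, Lookup.ExactOrSmaller)
    let next = ozone.TryGet(key, Lookup.ExactOrGreater)
    match prev, next with
    // Both neighbours are available: use their average
    | OptionalValue.Present(p), OptionalValue.Present(n) -> (p + n) / 2.0
    // Only one neighbour is available: use it
    | OptionalValue.Present(v), _
    | _, OptionalValue.Present(v) -> v
    // No values at all: fall back to zero
    | _ -> 0.0)
```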
namespace FSharp
--------------------
namespace Microsoft.FSharp
namespace FSharp.Data
--------------------
namespace Microsoft.FSharp.Data
static member GetDataContext : unit -> WorldBankDataService
nested type ServiceTypes
<summary>Typed representation of WorldBank data. See http://www.worldbank.org for terms and conditions.</summary>
module Frame
from Deedle
--------------------
type Frame =
static member ReadCsv : stream:Stream * hasHeaders:Nullable<bool> * inferTypes:Nullable<bool> * inferRows:Nullable<int> * schema:string * separators:string * culture:string * maxRows:Nullable<int> * missingValues:string [] * preferOptions:Nullable<bool> -> Frame<int,string>
static member ReadCsv : location:string * hasHeaders:Nullable<bool> * inferTypes:Nullable<bool> * inferRows:Nullable<int> * schema:string * separators:string * culture:string * maxRows:Nullable<int> * missingValues:string [] * preferOptions:bool -> Frame<int,string>
static member ReadReader : reader:IDataReader -> Frame<int,string>
static member CustomExpanders : Dictionary<Type,Func<obj,seq<string * Type * obj>>>
static member NonExpandableInterfaces : ResizeArray<Type>
static member NonExpandableTypes : HashSet<Type>
--------------------
type Frame<'TRowKey,'TColumnKey (requires equality and equality)> =
interface IDynamicMetaObjectProvider
interface INotifyCollectionChanged
interface IFsiFormattable
interface IFrame
new : names:seq<'TColumnKey> * columns:seq<ISeries<'TRowKey>> -> Frame<'TRowKey,'TColumnKey>
new : rowIndex:IIndex<'TRowKey> * columnIndex:IIndex<'TColumnKey> * data:IVector<IVector> * indexBuilder:IIndexBuilder * vectorBuilder:IVectorBuilder -> Frame<'TRowKey,'TColumnKey>
member AddColumn : column:'TColumnKey * series:ISeries<'TRowKey> -> unit
member AddColumn : column:'TColumnKey * series:seq<'V> -> unit
member AddColumn : column:'TColumnKey * series:ISeries<'TRowKey> * lookup:Lookup -> unit
member AddColumn : column:'TColumnKey * series:seq<'V> * lookup:Lookup -> unit
...
--------------------
new : names:seq<'TColumnKey> * columns:seq<ISeries<'TRowKey>> -> Frame<'TRowKey,'TColumnKey>
new : rowIndex:Indices.IIndex<'TRowKey> * columnIndex:Indices.IIndex<'TColumnKey> * data:IVector<IVector> * indexBuilder:Indices.IIndexBuilder * vectorBuilder:Vectors.IVectorBuilder -> Frame<'TRowKey,'TColumnKey>
static member Frame.ReadCsv : stream:Stream * ?hasHeaders:bool * ?inferTypes:bool * ?inferRows:int * ?schema:string * ?separators:string * ?culture:string * ?maxRows:int * ?missingValues:string [] * ?preferOptions:bool -> Frame<int,string>
static member Frame.ReadCsv : reader:TextReader * ?hasHeaders:bool * ?inferTypes:bool * ?inferRows:int * ?schema:string * ?separators:string * ?culture:string * ?maxRows:int * ?missingValues:string [] * ?preferOptions:bool -> Frame<int,string>
static member Frame.ReadCsv : stream:Stream * hasHeaders:Nullable<bool> * inferTypes:Nullable<bool> * inferRows:Nullable<int> * schema:string * separators:string * culture:string * maxRows:Nullable<int> * missingValues:string [] * preferOptions:Nullable<bool> -> Frame<int,string>
static member Frame.ReadCsv : location:string * hasHeaders:Nullable<bool> * inferTypes:Nullable<bool> * inferRows:Nullable<int> * schema:string * separators:string * culture:string * maxRows:Nullable<int> * missingValues:string [] * preferOptions:bool -> Frame<int,string>
static member Frame.ReadCsv : path:string * indexCol:string * ?hasHeaders:bool * ?inferTypes:bool * ?inferRows:int * ?schema:string * ?separators:string * ?culture:string * ?maxRows:int * ?missingValues:string [] * ?preferOptions:bool -> Frame<'R,string> (requires equality)
type DateTime =
struct
new : ticks:int64 -> DateTime + 10 overloads
member Add : value:TimeSpan -> DateTime
member AddDays : value:float -> DateTime
member AddHours : value:float -> DateTime
member AddMilliseconds : value:float -> DateTime
member AddMinutes : value:float -> DateTime
member AddMonths : months:int -> DateTime
member AddSeconds : value:float -> DateTime
member AddTicks : value:int64 -> DateTime
member AddYears : value:int -> DateTime
...
end
--------------------
DateTime ()
(+0 other overloads)
DateTime(ticks: int64) : DateTime
(+0 other overloads)
DateTime(ticks: int64, kind: DateTimeKind) : DateTime
(+0 other overloads)
DateTime(year: int, month: int, day: int) : DateTime
(+0 other overloads)
DateTime(year: int, month: int, day: int, calendar: Globalization.Calendar) : DateTime
(+0 other overloads)
DateTime(year: int, month: int, day: int, hour: int, minute: int, second: int) : DateTime
(+0 other overloads)
DateTime(year: int, month: int, day: int, hour: int, minute: int, second: int, kind: DateTimeKind) : DateTime
(+0 other overloads)
DateTime(year: int, month: int, day: int, hour: int, minute: int, second: int, calendar: Globalization.Calendar) : DateTime
(+0 other overloads)
DateTime(year: int, month: int, day: int, hour: int, minute: int, second: int, millisecond: int) : DateTime
(+0 other overloads)
DateTime(year: int, month: int, day: int, hour: int, minute: int, second: int, millisecond: int, kind: DateTimeKind) : DateTime
(+0 other overloads)
static member FrameExtensions.SaveCsv : frame:Frame<'R,'C> * path:string * keyNames:seq<string> * separator:char * culture:Globalization.CultureInfo -> unit (requires equality and equality)
static member FrameExtensions.SaveCsv : frame:Frame<'R,'C> * writer:TextWriter * includeRowKeys:bool * keyNames:seq<string> * separator:char * culture:Globalization.CultureInfo -> unit (requires equality and equality)
static member FrameExtensions.SaveCsv : frame:Frame<'R,'C> * path:string * includeRowKeys:bool * keyNames:seq<string> * separator:char * culture:Globalization.CultureInfo -> unit (requires equality and equality)
member Frame.SaveCsv : writer:TextWriter * ?includeRowKeys:bool * ?keyNames:seq<string> * ?separator:char * ?culture:Globalization.CultureInfo -> unit
member Frame.SaveCsv : path:string * ?includeRowKeys:bool * ?keyNames:seq<string> * ?separator:char * ?culture:Globalization.CultureInfo -> unit
static val DirectorySeparatorChar : char
static val AltDirectorySeparatorChar : char
static val VolumeSeparatorChar : char
static val InvalidPathChars : char[]
static val PathSeparator : char
static member ChangeExtension : path:string * extension:string -> string
static member Combine : [<ParamArray>] paths:string[] -> string + 3 overloads
static member GetDirectoryName : path:string -> string
static member GetExtension : path:string -> string
static member GetFileName : path:string -> string
...