F# Frame extensions Module
This module contains F# functions and extensions for working with frames. This includes operations for creating frames such as the `frame` function, `=>` operator and `Frame.ofRows`, `Frame.ofColumns` and `Frame.ofRowKeys` functions. The module also provides additional F# extension methods including `ReadCsv`, `SaveCsv` and `PivotTable`.
Frame construction:
The functions and methods in this group can be used to create frames. If you are creating a frame from a number of sample values, you can use `frame` and the `=>` operator (or the `=?>` operator, which is useful if you have multiple series of distinct types):
frame [ "Column 1" => series [ 1 => 1.0; 2 => 2.0 ]
        "Column 2" => series [ 3 => 3.0 ] ]
Aside from this, the various type extensions let you write `Frame.ofXyz` to construct frames from data in various formats: `Frame.ofRows` and `Frame.ofColumns` create a frame from a series or a sequence of rows or columns, `Frame.ofRecords` creates a frame from .NET objects using reflection, and `Frame.ofRowKeys` creates an empty frame with the specified keys.
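The difference between `Frame.ofColumns` and `Frame.ofRows` is easiest to see on the same input. The following is a minimal sketch rather than an official sample; the key names and values are made up, and the `#r` line assumes the code runs as an F# script with the Deedle NuGet package available.

```fsharp
#r "nuget: Deedle" // script-style reference; in a project, reference the package instead
open Deedle

// Two small series sharing the same integer keys (illustrative data)
let s1 = series [ 1 => 1.0; 2 => 2.0 ]
let s2 = series [ 1 => 10.0; 2 => 20.0 ]

// Each pair becomes a column: row keys come from the series, column keys are "A" and "B"
let byColumns = Frame.ofColumns [ "A" => s1; "B" => s2 ]

// Each pair becomes a row: row keys are "A" and "B", column keys come from the series
let byRows = Frame.ofRows [ "A" => s1; "B" => s2 ]

// A frame that has row keys only; columns can be added later
let empty : Frame<int, string> = Frame.ofRowKeys [ 1; 2; 3 ]

printfn "%A" (List.ofSeq byColumns.ColumnKeys)
printfn "%A" (List.ofSeq byRows.RowKeys)
printfn "%d rows, %d columns" empty.RowCount empty.ColumnCount
```

Swapping `ofColumns` for `ofRows` transposes the result, so the key types of the frame swap as well.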
Frame operations:
The group contains two overloads of the F#-friendly version of the `PivotTable` method.
Input and output:
This group of extensions includes a number of overloads for the `ReadCsv` and `SaveCsv` methods. The methods here are designed to be used from F#, so they are F#-style extensions and they use F#-style optional arguments. In general, the overloads take either a path or a `TextReader`/`TextWriter`. Also note that `ReadCsv<'R>(path, indexCol, ...)` lets you specify the column to be used as the index.
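As a rough illustration of the path-based overloads, the sketch below writes a small sample file and reads it back twice, once with ordinal row keys and once with `indexCol`. The file name, column names and values are hypothetical, and the `#r` line assumes an F# script with the Deedle NuGet package available.

```fsharp
#r "nuget: Deedle" // script-style reference; in a project, reference the package instead
open System.IO
open Deedle

// Write a small sample file to a temporary location (hypothetical data)
let path = Path.Combine(Path.GetTempPath(), "deedle-sample.csv")
File.WriteAllText(path, "Day,Open,Close\n1,10.5,11.0\n2,11.0,10.0\n")

// Without an index column, rows get ordinal integer keys (Frame<int, string>)
let ordinal = Frame.ReadCsv(path)

// Use the "Day" column as the row index; the key type is given as a type argument
let byDay = Frame.ReadCsv<int>(path, indexCol = "Day")

printfn "%d rows" ordinal.RowCount
printfn "%A" (List.ofSeq byDay.RowKeys)
```

Optional arguments such as `separators` or `maxRows` are passed by name in the same way as `indexCol`.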
Frame construction
Functions and values
| Function or value |
Description
|
Full Usage:
a => b
Parameters:
a : 'a
b : 'b
Returns: 'a * 'b
Type parameters: 'a, 'b |
Custom operator that can be used when constructing series from observations or frames from key-row or key-column pairs. The operator simply returns a tuple, but it provides a more convenient syntax. For example: `series [ "k1" => 1; "k2" => 15 ]`
|
Custom operator that can be used when constructing a frame from observations of series. The operator simply returns a tuple, but it upcasts the series argument so you don't have to do manual casting. For example: `frame [ "k1" =?> series [0 => "a"]; "k2" =?> series ["x" => "y"] ]`
|
|
Full Usage:
frame columns
Parameters:
columns : ('a * 'b) seq
Returns: Frame<'c, 'a>
Type parameters: 'a, 'b, 'c (requires equality and :> Deedle.ISeries<'c> and equality) |
A function for constructing a data frame from a sequence of name-column pairs. This provides nicer syntactic sugar for `Frame.ofColumns`.
Example: To create a simple frame with two columns, pass two name-series pairs to `frame`, with each pair built using `=>`.
|
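The `frame` function and the two operators can be tried out together as follows. This is a sketch with made-up column names and values, not the sample from the original page, and the `#r` line assumes an F# script with the Deedle NuGet package available.

```fsharp
#r "nuget: Deedle" // script-style reference; in a project, reference the package instead
open Deedle

// Columns with the same value type can be paired with `=>`
let numeric =
  frame [ "A" => series [ 1 => 30.0; 2 => 35.0 ]
          "B" => series [ 1 => 30.0; 3 => 40.0 ] ]

// Columns with different value types are paired with `=?>`, which upcasts each series
let mixed =
  frame [ "Name" =?> series [ 1 => "Ada"; 2 => "Grace" ]
          "Age"  =?> series [ 1 => 36; 2 => 45 ] ]

printfn "%d rows in numeric" numeric.RowCount
printfn "%A" (List.ofSeq mixed.ColumnKeys)
```

The row keys of `numeric` are the union of the keys of both series, so a key that appears in only one column yields a missing value in the other.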
Type extensions
| Type extension |
Description
|
Full Usage:
Frame.ofArray2D array
Parameters:
array : 'T[,]
-
A two-dimensional array to be converted into a data frame
Returns: Frame<int, int>
Type parameters: 'T |
Create data frame from a 2D array of values. The first dimension of the array is used as rows and the second dimension is treated as columns. Rows and columns of the returned frame are indexed with the element's offset in the array.
Extended Type:
|
|
|
Full Usage:
Frame.ofColumns cols
Parameters:
cols : ('C * 'a) seq
Returns: Frame<'R, 'C>
Type parameters: 'C, 'a, 'R (requires equality and :> Deedle.ISeries<'R> and equality) |
|
Full Usage:
Frame.ofJaggedArray jArray
Parameters:
jArray : 'T[][]
-
A jagged array to be converted into a data frame
Returns: Frame<int, int>
Type parameters: 'T |
Create data frame from a jagged array of values. The first dimension of the array is used as rows and the second dimension is treated as columns. Rows and columns of the returned frame are indexed with the element's offset in the array. Please note that this function will fail when the inner arrays of the input do not have the same lengths.
Extended Type:
|
|
|
Full Usage:
Frame.ofRecords values
Parameters:
values : 'T seq
Returns: Frame<int, string>
Type parameters: 'T |
|
Full Usage:
Frame.ofRecords (values, indexCol)
Parameters:
values : IEnumerable
indexCol : string
Returns: Frame<'R, string>
Type parameters: 'R (requires equality) |
Creates a data frame from a sequence of any .NET objects. The method uses reflection over the specified type parameter `'T` and turns its properties into columns.
Extended Type:
|
Full Usage:
Frame.ofRowKeys keys
Parameters:
keys : 'R seq
Returns: Frame<'R, string>
Type parameters: 'R (requires equality) |
|
Full Usage:
Frame.ofRows rows
Parameters:
rows : ('R * 'a) seq
Returns: Frame<'R, 'C>
Type parameters: 'R, 'a, 'C (requires equality and :> Deedle.ISeries<'C> and equality) |
|
|
|
Full Usage:
Frame.ofRowsOrdinal rows
Parameters:
rows : 'a seq
Returns: Frame<int64, 'K>
Type parameters: 'a, 'K, 'V (requires :> Deedle.Series<'K,'V> and equality) |
|
Full Usage:
Frame.ofValues values
Parameters:
values : ('R * 'C * 'V) seq
Returns: Frame<'R, 'C>
Type parameters: 'R, 'C, 'V (requires equality and equality) |
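A few of the constructors listed above can be exercised on small inputs as follows. This is a minimal sketch: the `Person` record and all values are hypothetical, and the `#r` line assumes an F# script with the Deedle NuGet package available.

```fsharp
#r "nuget: Deedle" // script-style reference; in a project, reference the package instead
open Deedle

// Hypothetical record type; each property becomes a column
type Person = { Name : string; Age : int }

let people =
  Frame.ofRecords [ { Name = "Ada"; Age = 36 }; { Name = "Grace"; Age = 45 } ]

// (row key, column key, value) triples; combinations not listed stay missing
let sparse =
  Frame.ofValues [ ("r1", "A", 1.0); ("r1", "B", 2.0); ("r2", "A", 3.0) ]

// A 2x3 array: the first dimension gives rows, the second gives columns
let grid = Frame.ofArray2D (array2D [ [ 1; 2; 3 ]; [ 4; 5; 6 ] ])

printfn "%A" (List.ofSeq people.ColumnKeys)
printfn "%d x %d" sparse.RowCount sparse.ColumnCount
printfn "%d x %d" grid.RowCount grid.ColumnCount
```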
Frame operations
Type extensions
| Type extension |
Description
|
Full Usage:
this.PivotTable
Parameters:
r : 'TColumnKey
-
A column key to group on for the resulting row index
c : 'TColumnKey
-
A column key to group on for the resulting col index
op : Frame<'TRowKey, 'TColumnKey> -> 'T
-
A function computing a value from the corresponding bucket frame
Returns: Frame<'R, 'C>
Type parameters: 'R, 'C, 'T (requires equality and equality) |
Creates a new data frame resulting from a 'pivot' operation. Consider a denormalized data frame representing a table: column labels are field names and table values are observations of those fields. `PivotTable` buckets the rows along two axes, according to the values of the columns `r` and `c`, and then computes a value for the frame of rows that land in each bucket.
Extended Type:
|
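To make the bucketing step concrete, the sketch below reproduces the idea on a plain F# list of records instead of a frame. It is not Deedle's implementation and does not call `PivotTable`; the `Passenger` type, the `pivot` helper and the data are all hypothetical.

```fsharp
// Hypothetical denormalized rows: one record per observation
type Passenger = { Sex : string; Class : string; Fare : float }

let rows =
  [ { Sex = "female"; Class = "1st"; Fare = 80.0 }
    { Sex = "female"; Class = "3rd"; Fare = 8.0 }
    { Sex = "male";   Class = "1st"; Fare = 60.0 }
    { Sex = "male";   Class = "1st"; Fare = 40.0 } ]

// Bucket the rows by the pair (row key, column key), then apply `op` to each bucket
let pivot rowKey colKey op rows =
  rows
  |> List.groupBy (fun r -> rowKey r, colKey r)
  |> List.map (fun (key, bucket) -> key, op bucket)
  |> Map.ofList

// Number of passengers per (Sex, Class) bucket
let counts = pivot (fun p -> p.Sex) (fun p -> p.Class) List.length rows

// Average fare per (Sex, Class) bucket
let avgFare =
  pivot (fun p -> p.Sex) (fun p -> p.Class) (List.averageBy (fun p -> p.Fare)) rows

printfn "%A" counts
printfn "%A" avgFare
```

Here `rowKey` and `colKey` play the role of the columns `r` and `c`, and `op` plays the role of the function applied to each bucket. Buckets with no rows, such as ("male", "3rd"), are simply absent from the map.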
Input and output
Type extensions
| Type extension |
Description
|
Full Usage:
Frame.ReadCsv(path, indexCol, ?hasHeaders, ?inferTypes, ?inferRows, ?schema, ?separators, ?culture, ?maxRows, ?missingValues, ?preferOptions, ?typeResolver, ?encoding)
Parameters:
path : string
-
Specifies a file name or a web location of the resource.
indexCol : string
-
Specifies the column that should be used as an index in the resulting frame. The type is specified via a type parameter, e.g. use `Frame.ReadCsv<int>("file.csv", indexCol="Day")`.
?hasHeaders : bool
-
Specifies whether the input CSV file has header row
?inferTypes : bool
-
Specifies whether the method should attempt to infer types of columns automatically (set this to `false` if you want to specify schema)
?inferRows : int
-
If `inferTypes=true`, this parameter specifies the number of rows to use for type inference. The default value is 100. Value 0 means all rows.
?schema : string
-
A string that specifies CSV schema. See the documentation for information about the schema format.
?separators : string
-
A string that specifies one or more (single character) separators that are used to separate columns in the CSV file. Use for example `";"` to parse semicolon separated files.
?culture : string
-
Specifies the name of the culture that is used when parsing values in the CSV file (such as `"en-US"`). The default is invariant culture.
?maxRows : int
-
The maximal number of rows that should be read from the CSV file.
?missingValues : string[]
-
An array of strings that contains values which should be treated as missing when reading the file. The default value is: "NaN"; "NA"; "#N/A"; ":"; "-"; "TBA"; "TBD".
?preferOptions : bool
-
Specifies whether to prefer optional values when parsing CSV data.
?typeResolver : string -> string option
-
An optional function that maps a column name to a type name string (e.g. `"int"`, `"float"`, `"string"`, `"bool"`, `"date"`, `"guid"`). Return `None` to let Deedle infer the type for that column. When both `typeResolver` and `schema` are provided, explicit `schema` overrides take precedence for any conflicting column.
?encoding : Encoding
-
Specifies the character encoding to use when reading the CSV file. When not set, UTF-8 with BOM detection is used.
Returns: Frame<'R, string>
Type parameters: 'R (requires equality) |
Load data frame from a CSV file. The operation automatically reads column names from the CSV file (if they are present) and infers the type of values for each column. Columns of primitive types (`int`, `float`, etc.) are converted to the right type. Columns of other types (such as dates) are not converted automatically.
Extended Type:
|
Full Usage:
Frame.ReadCsv(path, ?hasHeaders, ?inferTypes, ?inferRows, ?schema, ?separators, ?culture, ?maxRows, ?missingValues, ?preferOptions, ?typeResolver, ?encoding)
Parameters:
path : string
-
Specifies a file name or a web location of the resource.
?hasHeaders : bool
-
Specifies whether the input CSV file has header row
?inferTypes : bool
-
Specifies whether the method should attempt to infer types of columns automatically (set this to `false` if you want to specify schema)
?inferRows : int
-
If `inferTypes=true`, this parameter specifies the number of rows to use for type inference. The default value is 100.
?schema : string
-
A string that specifies CSV schema. See the documentation for information about the schema format.
?separators : string
-
A string that specifies one or more (single character) separators that are used to separate columns in the CSV file. Use for example `";"` to parse semicolon separated files.
?culture : string
-
Specifies the name of the culture that is used when parsing values in the CSV file (such as `"en-US"`). The default is invariant culture.
?maxRows : int
-
The maximal number of rows that should be read from the CSV file.
?missingValues : string[]
-
An array of strings that contains values which should be treated as missing when reading the file. The default value is: "NaN"; "NA"; "#N/A"; ":"; "-"; "TBA"; "TBD".
?preferOptions : bool
-
Specifies whether to prefer optional values when parsing CSV data.
?typeResolver : string -> string option
-
An optional function that maps a column name to a type name string (e.g. `"int"`, `"float"`, `"string"`, `"bool"`, `"date"`, `"guid"`). Return `None` to let Deedle infer the type for that column. When both `typeResolver` and `schema` are provided, explicit `schema` overrides take precedence for any conflicting column.
?encoding : Encoding
-
Specifies the character encoding to use when reading the CSV file. When not set, UTF-8 with BOM detection is used.
Returns: Frame<int, string>
|
Load data frame from a CSV file. The operation automatically reads column names from the CSV file (if they are present) and infers the type of values for each column. Columns of primitive types (`int`, `float`, etc.) are converted to the right type. Columns of other types (such as dates) are not converted automatically.
Extended Type:
|
Full Usage:
Frame.ReadCsv(stream, ?hasHeaders, ?inferTypes, ?inferRows, ?schema, ?separators, ?culture, ?maxRows, ?missingValues, ?preferOptions, ?typeResolver, ?encoding)
Parameters:
stream : Stream
-
Specifies the input stream, opened at the beginning of CSV data
?hasHeaders : bool
-
Specifies whether the input CSV file has header row
?inferTypes : bool
-
Specifies whether the method should attempt to infer types of columns automatically (set this to `false` if you want to specify schema)
?inferRows : int
-
If `inferTypes=true`, this parameter specifies the number of rows to use for type inference. The default value is 100.
?schema : string
-
A string that specifies CSV schema. See the documentation for information about the schema format.
?separators : string
-
A string that specifies one or more (single character) separators that are used to separate columns in the CSV file. Use for example `";"` to parse semicolon separated files.
?culture : string
-
Specifies the name of the culture that is used when parsing values in the CSV file (such as `"en-US"`). The default is invariant culture.
?maxRows : int
-
The maximal number of rows that should be read from the CSV file.
?missingValues : string[]
-
An array of strings that contains values which should be treated as missing when reading the file. The default value is: "NaN"; "NA"; "#N/A"; ":"; "-"; "TBA"; "TBD".
?preferOptions : bool
-
Specifies whether to prefer optional values when parsing CSV data.
?typeResolver : string -> string option
-
An optional function that maps a column name to a type name string (e.g. `"int"`, `"float"`, `"string"`, `"bool"`, `"date"`, `"guid"`). Return `None` to let Deedle infer the type for that column. When both `typeResolver` and `schema` are provided, explicit `schema` overrides take precedence for any conflicting column.
?encoding : Encoding
-
Specifies the character encoding to use when reading the CSV stream. When not set, UTF-8 with BOM detection is used.
Returns: Frame<int, string>
|
Load a data frame from a stream containing CSV data. The operation automatically reads column names from the CSV data (if they are present) and infers the type of values for each column. Columns of primitive types (`int`, `float`, etc.) are converted to the right type. Columns of other types (such as dates) are not converted automatically.
Extended Type:
|
Full Usage:
Frame.ReadCsv(reader, ?hasHeaders, ?inferTypes, ?inferRows, ?schema, ?separators, ?culture, ?maxRows, ?missingValues, ?preferOptions, ?typeResolver)
Parameters:
reader : TextReader
-
Specifies the `TextReader`, positioned at the beginning of CSV data
?hasHeaders : bool
-
Specifies whether the input CSV file has header row
?inferTypes : bool
-
Specifies whether the method should attempt to infer types of columns automatically (set this to `false` if you want to specify schema)
?inferRows : int
-
If `inferTypes=true`, this parameter specifies the number of rows to use for type inference. The default value is 100.
?schema : string
-
A string that specifies CSV schema. See the documentation for information about the schema format.
?separators : string
-
A string that specifies one or more (single character) separators that are used to separate columns in the CSV file. Use for example `";"` to parse semicolon separated files.
?culture : string
-
Specifies the name of the culture that is used when parsing values in the CSV file (such as `"en-US"`). The default is invariant culture.
?maxRows : int
-
The maximal number of rows that should be read from the CSV file.
?missingValues : string[]
-
An array of strings that contains values which should be treated as missing when reading the file. The default value is: "NaN"; "NA"; "#N/A"; ":"; "-"; "TBA"; "TBD".
?preferOptions : bool
-
Specifies whether to prefer optional values when parsing CSV data.
?typeResolver : string -> string option
-
An optional function that maps a column name to a type name string (e.g. `"int"`, `"float"`, `"string"`, `"bool"`, `"date"`, `"guid"`). Return `None` to let Deedle infer the type for that column. When both `typeResolver` and `schema` are provided, explicit `schema` overrides take precedence for any conflicting column.
Returns: Frame<int, string>
|
Load a data frame from a `TextReader` that provides CSV data. The operation automatically reads column names from the CSV data (if they are present) and infers the type of values for each column. Columns of primitive types (`int`, `float`, etc.) are converted to the right type. Columns of other types (such as dates) are not converted automatically.
Extended Type:
|
Full Usage:
Frame.ReadCsvString(csvString, ?hasHeaders, ?inferTypes, ?inferRows, ?schema, ?separators, ?culture, ?maxRows, ?missingValues, ?preferOptions, ?typeResolver)
Parameters:
csvString : string
-
Specifies the input string containing the CSV
?hasHeaders : bool
-
Specifies whether the input CSV string has header row
?inferTypes : bool
-
Specifies whether the method should attempt to infer types of columns automatically (set this to `false` if you want to specify schema)
?inferRows : int
-
If `inferTypes=true`, this parameter specifies the number of rows to use for type inference. The default value is 100.
?schema : string
-
A string that specifies CSV schema. See the documentation for information about the schema format.
?separators : string
-
A string that specifies one or more (single character) separators that are used to separate columns in the CSV string. Use for example `";"` to parse semicolon separated files.
?culture : string
-
Specifies the name of the culture that is used when parsing values in the CSV string (such as `"en-US"`). The default is invariant culture.
?maxRows : int
-
The maximal number of rows that should be read from the CSV string.
?missingValues : string[]
-
An array of strings that contains values which should be treated as missing when reading the file. The default value is: "NaN"; "NA"; "#N/A"; ":"; "-"; "TBA"; "TBD".
?preferOptions : bool
-
Specifies whether to prefer optional values when parsing CSV data.
?typeResolver : string -> string option
-
An optional function that maps a column name to a type name string (e.g. `"int"`, `"float"`, `"string"`, `"bool"`, `"date"`, `"guid"`). Return `None` to let Deedle infer the type for that column. When both `typeResolver` and `schema` are provided, explicit `schema` overrides take precedence for any conflicting column.
Returns: Frame<int, string>
|
Load data frame from a string representing a UTF8-encoded CSV file. The operation automatically reads column names from the string (if they are present) and infers the type of values for each column. Columns of primitive types (`int`, `float`, etc.) are converted to the right type. Columns of other types (such as dates) are not converted automatically.
Extended Type:
|
Full Usage:
this.SaveCsv
Parameters:
TextWriter
-
Specifies the TextWriter to which the CSV data should be written
?includeRowKeys : bool
-
When set to `true`, the row key is also written to the output file
?keyNames : string seq
-
Can be used to specify the CSV headers for row key (or keys, for multi-level index)
?separator : char
-
Specify the column separator in the file (the default is `\t` for TSV files and `,` for CSV files)
?culture : CultureInfo
-
Specify the `CultureInfo` object used for formatting numerical data
|
Save data frame to a CSV file or a `TextWriter`. When calling the operation, you can specify whether you want to save the row keys or not (and headers for the keys) and you can also specify the separator (use `\t` for writing TSV files). When specifying file name ending with `.tsv`, the `\t` separator is used automatically.
Extended Type:
|
Full Usage:
this.SaveCsv
Parameters:
string
-
Specifies the output file name where the CSV data should be written
?includeRowKeys : bool
-
When set to `true`, the row key is also written to the output file
?keyNames : string seq
-
Can be used to specify the CSV headers for row key (or keys, for multi-level index)
?separator : char
-
Specify the column separator in the file (the default is `\t` for TSV files and `,` for CSV files)
?culture : CultureInfo
-
Specify the `CultureInfo` object used for formatting numerical data
|
Save data frame to a CSV file or a `TextWriter`. When calling the operation, you can specify whether you want to save the row keys or not (and headers for the keys) and you can also specify the separator (use `\t` for writing TSV files). When specifying file name ending with `.tsv`, the `\t` separator is used automatically.
Extended Type:
|
Full Usage:
this.SaveCsv
Parameters:
string
-
Specifies the output file name where the CSV data should be written
keyNames : string seq
-
Specifies the CSV headers for row key (or keys, for multi-level index)
|
Save data frame to a CSV file or to a `TextWriter`. When calling the operation, you can specify whether you want to save the row keys or not (and headers for the keys) and you can also specify the separator (use `\t` for writing TSV files). When specifying file name ending with `.tsv`, the `\t` separator is used automatically.
Extended Type:
|
Full Usage:
this.SaveJson
Parameters:
TextWriter
-
The TextWriter to write the JSON to.
?orient : string
-
Controls the JSON layout (see ToJson).
|
Save the data frame as a JSON file.
Extended Type:
|
Full Usage:
this.SaveJson
Parameters:
string
-
The output file path.
?orient : string
-
Controls the JSON layout (see ToJson).
|
Save the data frame as a JSON file.
Extended Type:
|
Full Usage:
this.ToDataTable
Parameters:
string seq
-
Specifies the names of the row key components (or just a single row key name if the row index is not hierarchical).
Returns: DataTable
|
Returns the data of the frame as a .NET `DataTable` object. The column keys are automatically converted to strings that are used as column names. The row index is turned into an additional column with the specified name (the function takes the name as a sequence to support hierarchical keys, but typically you can write just `frame.ToDataTable(["KeyName"])`).
Extended Type:
|
Full Usage:
this.ToJson
Parameters:
string
-
Controls the JSON layout. Allowed values:
"columns" (default) — column-major {"col":{"row":v}};
"index" — row-major {"row":{"col":v}};
"records" — array of row objects [{"col":v}].
Returns: string
|
Serialize the data frame to a JSON string.
Extended Type:
|
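The reading and writing extensions can be combined into a small in-memory round trip, which avoids touching the file system. This is a sketch under stated assumptions: the CSV content and the "Row" key name are made up, and the `#r` line assumes an F# script with the Deedle NuGet package available.

```fsharp
#r "nuget: Deedle" // script-style reference; in a project, reference the package instead
open System.IO
open Deedle

// Hypothetical CSV content with a header row
let csv = "Day,Open,Close\n1,10.5,11.0\n2,11.0,10.0\n"

// Parse the string; rows get ordinal integer keys and column types are inferred
let df = Frame.ReadCsvString(csv)

// Write the frame to a TextWriter, using ';' as the column separator
let plain = new StringWriter()
df.SaveCsv(plain, separator = ';')

// Write it again, this time including the row keys under a "Row" header
let keyed = new StringWriter()
df.SaveCsv(keyed, includeRowKeys = true, keyNames = [ "Row" ])

printfn "%s" (plain.ToString())
printfn "%s" (keyed.ToString())
```

Passing a file path instead of a `TextWriter` selects the path-based `SaveCsv` overload with the same optional arguments.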