Name
SPVM::R::DataFrame - Data Frame
Description
R::DataFrame class in SPVM represents a data frame.
Usage
use R::DataFrame;
use R::OP::DataFrame as DFOP;
use R::OP::Int as IOP;
use R::OP::Double as DOP;
use R::OP::String as STROP;
use R::OP::Time::Piece as TPOP;
# Create a R::DataFrame object
my $data_frame = R::DataFrame->new;
$data_frame->set_col("name", STROP->c(["Ken", "Yuki", "Mike"]));
$data_frame->set_col("age", IOP->c([19, 43, 50]));
$data_frame->set_col("weight", DOP->c([(double)50.6, 60.3, 80.5]));
$data_frame->set_col("birth", TPOP->c(["1980-10-10", "1985-12-10", "1970-02-16"]));
my $data_frame_string = $data_frame->to_string;
my $slice_data_frame = $data_frame->slice(["name", "age"], [IOP->c([0, 1])]);
$data_frame->sort(["name", "age desc"]);
my $nrow = $data_frame->nrow;
my $ncol = $data_frame->ncol;
my $dim = $data_frame->first_col->dim;
See also Data Frame Examples.
Details
This class is a port of data frame features of R language.
Complexity of Getting, Setting, Inserting and Removing a Column
The complexity of getting a new column is O(1).
The complexity of setting a column is O(1).
The complexity of inserting a new column is O(n), but the complexity of inserting a new column to the end of columns is O(1).
The complexity of removing a column is O(n), but the complexity of removing a column from the end of columns is O(1).
Data Frame Operations
See R::OP::DataFrame about data frame operations.
Fields
colobjs_list;
has colobjs_list : List of R::DataFrame::Column;
A list of R::DataFrame::Column objects that represents columns.
colobjs_indexes_h
has colobjs_indexes_h : Hash of Int;
A hash that stores the column index for each column name.
Class Methods
new
static method new : R::DataFrame ();
Creates a new R::DataFrame object and returns it.
Instance Methods
colnames
method colnames : string[] ();
Returns column names.
exists_col
method exists_col : int ($colname : string);
If the colnum named $colname exists, returns 1, otherwise returns 0.
colname
method colname : string ($col : int);
Returns a column name at column index $col.
Exceptions:
If the column at index $col dose not exist, an exception is thrown.
colindex
method colindex : int ($colname : string);
Returns the column index of the column named $colname.
Exceptions:
If the column named $colname does not exists, an exception is thrown.
col_by_index
method col_by_index : R::NDArray ($col : int);
Returns the n-dimensional array at column index $col.
Exceptions:
If the column at index $col does not exists, an exception is thrown.
first_col
method first_col : R::NDArray ();
Returns the n-dimensional array of the first column.
This method calls "col" method.
Exceptions:
Exceptions thrown by "col" method could be thrown.
col
method col : R::NDArray ($colname : string);
Returns the n-dimensional array of the column named $colname.
Excepttions:
If the column named $colname dose not exist, an exception is thrown.
set_col
method set_col : void ($colname : string, $ndarray : R::NDArray);
Sets the n-dimensional array of the column named $colname to the n-dimensional array $ndarray.
$ndarray becomes read-only by calling R::NDArray#make_dim_read_only method.
If the n-dimensional array of the column named $colname exists, it is replaced with $ndarray.
If not, a new column is inserted by "insert_col" method.
insert_col
method insert_col : void ($colname : string, $ndarray : R::NDArray, $before_colname : string = undef);
Inserts the column named $colname with the n-dimensional array $ndarray before the column named $before_colname.
If $before_colname is not defined, the new column is instered at the end of columns.
The column name $colname must be a non-empty string. Otherwise, an exception is thrown.
The n-dimensional array $ndarray must be defined. Otherwise, an exception is thrown.
If the column named $colname already exists, an exception is thrown.
The dimensions of the n-dimensional array $ndarray must be equal to the dimensions of the n-dimensional array of the first column of this data frame. Otherwise, an exception is thrown.
ncol
method ncol : int ();
Returns the column numbers.
nrow
method nrow : int ();
Returns the row numbers.
If columns do not exist, returns 0.
Exceptions:
The n-dimensional array of the first column of this data frame must be a vector.
remove_col
method remove_col : void ($colname : string);
Removes the column named $colname.
This method calls "colindex" method.
Exceptions:
Exceptions thrown by "colindex" method could be thrown.
clone
method clone : R::DataFrame ($shallow : int = 0);
Clones this data frame and returns it.
This method calls R::NDArray#clone method for the n-dimensional array of each column in this data frame.
to_string
method to_string : string ();
Strigifies this data frame and returns it.
slice
method slice : R::DataFrame ($colnames : string[], $axis_indexes_product : R::NDArray::Int[]);
Slices this data frame given the column names $colnames and the product of axis indexes $axis_indexes_product, and returnd sliced data frame.
This method calls R::NDArray#slice method for the n-dimensional array of each column in this data frame.
set_order
method set_order : void ($data_indexes_ndarray : R::NDArray::Int);
Calls R::NDArray#set_order method for the n-dimeniaonal array of each column in this data frame.
sort
method sort : void ($colnames_with_sort_order : string[]);
Sort data in the n-dimensional array in this data frame given the column name with the sort order $colnames_with_sort_order.
Implementation:
This method calls "order" method given $colnames_with_sort_order and calls "set_order" method givne the return value of "order" method.
order
method order : R::NDArray::Int ($colnames_with_sort_order : string[]);
Gets order data given the column names with the sort orders $colnames_with_sort_order, and returns the order data.
The returned order data can be used for the argument of "set_order" method.
Format of a Column Name with the Sort Order:
COLUMN_NAME
COLUMN_NAME SORT_ORDER
COLUMN_NAME
is a column name.
SORT_ORDER
is asc
or desc
.
Examples are
age
age asc
age desc
Exceptions:
The column names $colnames_with_sort_order must be defined. Otherwise an exception is thrown.
The column numbers of this data frame must be greater than 0. Otherwise an exception is thrown.
If the column named $colname does not exist, an excetpion is thrown.
($colname is the column part of $colnames_with_sort_order.)
If the column name with the sort order is invalid format, an exception is thrown.
See Also
Copyright & License
Copyright (c) 2024 Yuki Kimoto
MIT License