NAME

AI::XGBoost::Booster - XGBoost main class for training, prediction and evaluation

VERSION

version 0.1

SYNOPSIS

use 5.010;
use aliased 'AI::XGBoost::DMatrix';
use AI::XGBoost qw(train);

# We are going to solve a binary classification problem:
#  Mushroom poisonous or not

my $train_data = DMatrix->From(file => 'agaricus.txt.train');
my $test_data = DMatrix->From(file => 'agaricus.txt.test');

# With XGBoost we can solve this problem using the 'gbtree' booster
#  (Gradient Boosting Regression Tree) with the logistic regression
#  loss function 'binary:logistic'
# The XGBoost tree booster has many parameters that we can tune
# (https://github.com/dmlc/xgboost/blob/master/doc/parameter.md)

my $booster = train(data => $train_data, number_of_rounds => 10, params => {
        objective => 'binary:logistic',
        eta => 1.0,
        max_depth => 2,
        silent => 1
    });

# For binary classification, predictions are probability scores in [0, 1]
#  that the label is positive (1 in the first column of agaricus.txt.test)
my $predictions = $booster->predict(data => $test_data);

say join "\n", @$predictions[0 .. 10];

DESCRIPTION

Booster objects control training, prediction and evaluation.

Work in progress: the API may change. Comments and suggestions are welcome!

METHODS

update

Run one boosting iteration on the training data

Parameters

iteration

Current iteration number

dtrain

Training data (AI::XGBoost::DMatrix)

boost

Run one boosting iteration using gradient and Hessian values that you supply

Parameters

dtrain

Training data (AI::XGBoost::DMatrix)

grad

Gradient of your objective function (Reference to an array)

hess

Hessian of your objective function, that is, second order gradient (Reference to an array)
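As an illustration of what grad and hess might contain, here is a minimal sketch that computes both for the binary logistic loss in plain Perl. The helper name logistic_grad_hess and the sample margins and labels are assumptions made for this example; they are not part of AI::XGBoost. For a raw margin m and label y, with p = 1 / (1 + exp(-m)), the gradient is p - y and the Hessian is p * (1 - p).

```perl
use strict;
use warnings;

# Hypothetical helper (not part of AI::XGBoost): gradient and Hessian of the
# binary logistic loss with respect to the raw margin of each example.
sub logistic_grad_hess {
    my ($margins, $labels) = @_;
    my (@grad, @hess);
    for my $i (0 .. $#$margins) {
        my $p = 1 / (1 + exp(-$margins->[$i]));    # sigmoid of the margin
        push @grad, $p - $labels->[$i];            # first-order gradient
        push @hess, $p * (1 - $p);                 # second-order gradient
    }
    return (\@grad, \@hess);
}

# Made-up margins and labels, for illustration only
my ($grad, $hess) = logistic_grad_hess([0, 2, -2], [1, 0, 1]);
printf "grad=%.4f hess=%.4f\n", $grad->[$_], $hess->[$_] for 0 .. $#$grad;
```

The two array references returned here have one entry per training example, which matches the "Reference to an array" shape described for grad and hess. How the margins and labels are obtained from a DMatrix is not covered by this sketch.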

predict

Predict data using the trained model

Parameters

data

Data to run predictions on (AI::XGBoost::DMatrix)

set_param

Set booster parameter

Example

$booster->set_param('objective', 'binary:logistic');

set_attr

Set a string attribute

get_attr

Get a string attribute

get_score

Get importance of each feature

Parameters

importance_type

Type of importance. Valid values:

weight

Number of times a feature is used to split the data across all trees

gain

Average gain of the feature when it is used in trees

cover

Average coverage of the feature when it is used in trees

fmap

Name of feature map file
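To make the difference between the weight and gain importance types concrete, the following sketch computes both by hand from a list of splits. The split records and feature names are invented for the example, and the code does not call AI::XGBoost; it only mirrors the definitions given above.

```perl
use strict;
use warnings;

# Made-up split records: which feature each split used and the gain it achieved
my @splits = (
    { feature => 'f1', gain => 4.0 },
    { feature => 'f2', gain => 1.0 },
    { feature => 'f1', gain => 2.0 },
);

my (%weight, %total_gain);
for my $split (@splits) {
    $weight{ $split->{feature} }++;                        # 'weight': number of splits
    $total_gain{ $split->{feature} } += $split->{gain};
}

# 'gain': average gain over the splits that used the feature
my %gain = map { $_ => $total_gain{$_} / $weight{$_} } keys %weight;

printf "%s weight=%d gain=%.2f\n", $_, $weight{$_}, $gain{$_} for sort keys %weight;
```

A feature that is split on often but with small gains ranks high by weight and low by gain, so the two types can order features differently.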

get_dump

attributes

Returns all attributes of the booster as a HASHREF

TO_JSON

Serialize the booster to JSON.

This method is intended for use with the convert_blessed option of the JSON module. (See https://metacpan.org/pod/JSON#OBJECT-SERIALISATION)

Warning: this API is subject to change.
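The convert_blessed mechanism can be sketched with a stand-in class. The example below uses JSON::PP, which ships with Perl and supports the same option, and a hypothetical My::Model class in place of a real booster. It shows only how the encoder calls TO_JSON on blessed objects, not what AI::XGBoost::Booster actually returns from it.

```perl
use strict;
use warnings;
use JSON::PP;

# Hypothetical class standing in for a booster object
package My::Model {
    sub new     { my ($class, %args) = @_; return bless {%args}, $class }
    sub TO_JSON { my ($self) = @_; return { %$self } }    # plain, unblessed copy
}

# convert_blessed makes the encoder call TO_JSON on blessed objects;
# canonical sorts hash keys so the output is stable
my $json    = JSON::PP->new->convert_blessed->canonical;
my $encoded = $json->encode({ model => My::Model->new(name => 'demo', rounds => 10) });
print "$encoded\n";
```

Without convert_blessed the encoder refuses to serialize a blessed reference, which is why the option has to be enabled on the encoder object before calling encode.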

BUILD

Use new instead; this method is just an internal helper.

DEMOLISH

Internal destructor. This method is called automatically.

AUTHOR

Pablo Rodríguez González <pablo.rodriguez.gonzalez@gmail.com>

COPYRIGHT AND LICENSE

Copyright (c) 2017 by Pablo Rodríguez González.