AI::XGBoost - Perl wrapper for XGBoost library
use 5.010;
use aliased 'AI::XGBoost::DMatrix';
use AI::XGBoost qw(train);
# We are going to solve a binary classification problem:
# Mushroom poisonous or not
my $train_data = DMatrix->From(file => 'agaricus.txt.train');
my $test_data = DMatrix->From(file => 'agaricus.txt.test');
# With XGBoost we can solve this problem using the 'gbtree' booster
# (Gradient Boosting Regression Tree) and, as loss function, the
# logistic regression objective 'binary:logistic'
# XGBoost Tree Booster has a lot of parameters that we can tune
# (
my $booster = train(data => $train_data, number_of_rounds => 10, params => {
        objective => 'binary:logistic',
        eta => 1.0,
        max_depth => 2,
        silent => 1
    });
# For binary classification, predictions are probability scores in [0, 1]
# indicating that the label is positive (1 in the first column of agaricus.txt.test)
my $predictions = $booster->predict(data => $test_data);
say join "\n", @$predictions[0 .. 10];
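Because the scores are probabilities, turning them into class labels only requires choosing a cutoff. The following is a minimal sketch in plain Perl, assuming a 0.5 threshold. The values in @scores and @labels are made up for illustration and are not output from the agaricus data.

```perl
use strict;
use warnings;
use 5.010;

# Hypothetical scores, standing in for the contents of @$predictions
my @scores = (0.92, 0.08, 0.61, 0.35);
# Hypothetical true labels for the same rows (1 = positive)
my @labels = (1, 0, 0, 0);

# Assumed cutoff: scores above 0.5 count as the positive class
my $threshold = 0.5;
my @predicted = map { $_ > $threshold ? 1 : 0 } @scores;

# Number of rows where the predicted label differs from the true one
my $errors = grep { $predicted[$_] != $labels[$_] } 0 .. $#labels;
my $error_rate = $errors / @labels;

say "predicted: @predicted";
say "error rate: $error_rate";
```

A different threshold may suit problems where false positives and false negatives have unequal costs.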
use aliased 'AI::XGBoost::DMatrix';
use AI::XGBoost qw(train);
use Data::Dataset::Classic::Iris;
# We are going to solve a multiclass classification problem:
# determining plant species from a set of flower measurements
# XGBoost uses numbers for classes, so we are going to encode the class names
my %class = (
    setosa => 0,
    versicolor => 1,
    virginica => 2
);
my $iris = Data::Dataset::Classic::Iris::get();
# Separate features and labels (this example reuses the same rows for train and test)
my $train_dataset = [map {$iris->{$_}} grep {$_ ne 'species'} keys %$iris];
my $test_dataset = [map {$iris->{$_}} grep {$_ ne 'species'} keys %$iris];
my $train_label = [map {$class{$_}} @{$iris->{'species'}}];
my $test_label = [map {$class{$_}} @{$iris->{'species'}}];
my $train_data = DMatrix->From(matrix => $train_dataset, label => $train_label);
my $test_data = DMatrix->From(matrix => $test_dataset, label => $test_label);
# Multiclass problems need a different objective function and the number
# of classes; in this case we are using 'multi:softprob' and
# num_class => 3
my $booster = train(data => $train_data, number_of_rounds => 20, params => {
        max_depth => 3,
        eta => 0.3,
        silent => 1,
        objective => 'multi:softprob',
        num_class => 3
    });
my $predictions = $booster->predict(data => $test_data);
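With 'multi:softprob', each row gets one probability per class, and the predicted class is the one with the highest probability. The sketch below shows that selection step in plain Perl. It assumes the probabilities are available as one array reference of three values per row; the exact layout returned by predict is not stated here, so a flat list would first need grouping in chunks of num_class. The @rows values and the argmax helper are hypothetical.

```perl
use strict;
use warnings;
use 5.010;

# Class names indexed by the numeric codes used for training
my @species = ('setosa', 'versicolor', 'virginica');

# Hypothetical per-row probabilities (assumed layout: one array ref per row)
my @rows = (
    [0.90, 0.07, 0.03],
    [0.10, 0.25, 0.65],
);

# Hypothetical helper: index of the largest value in an array reference
sub argmax {
    my ($values) = @_;
    my $best = 0;
    for my $i (1 .. $#$values) {
        $best = $i if $values->[$i] > $values->[$best];
    }
    return $best;
}

my @winners = map { argmax($_) } @rows;
say join "\n", map { $species[$_] } @winners;
```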
Perl wrapper for XGBoost library.
The easiest way to use the wrapper is the train function, but first the data must be loaded into a DMatrix.
This is a work in progress; feedback, comments, issues, suggestions and pull requests are welcome!
Currently this module needs the xgboost binary to be available on your system. I'm going to make an Alien module for xgboost, but meanwhile you need to compile xgboost yourself:
train performs gradient boosting using the data and parameters passed.
Returns the trained AI::XGBoost::Booster.
- params
Parameters for the booster object.
Full list available:
- data
AI::XGBoost::DMatrix object used for training
- number_of_rounds
Number of boosting iterations
The goal is to make a full wrapper for XGBoost.
- 0.1
Full raw C API available as AI::XGBoost::CAPI::RAW
- 0.2
Full C API "easy" to use, with PDL support as AI::XGBoost::CAPI
Easy means clients don't have to use FFI::Platypus or modules dealing with C structures
- 0.3
Moose-based object-oriented API with DMatrix and Booster classes
- 0.4
Complete object oriented API
- 0.5
Use perl signatures (
