Python Package Introduction#
This document gives a basic walkthrough of the Secure XGBoost Python package. There is also a sample Jupyter notebook at demo/python/jupyter/e2e-demo.ipynb.
Install Secure XGBoost#
To install Secure XGBoost, follow instructions in Installation Guide.
To verify your installation, run the following in Python:
import securexgboost as xgb
Data Interface#
The Secure XGBoost Python module can load data from:
LibSVM text format file
Comma-separated values (CSV) file
The data is stored in a DMatrix object.
To load a LibSVM text file or a Secure XGBoost binary file into DMatrix:
dtrain = xgb.DMatrix('train.svm.txt')
dtest = xgb.DMatrix('test.svm.buffer')
To load a CSV file into DMatrix:
# label_column specifies the index of the column containing the true label
dtrain = xgb.DMatrix('train.csv?format=csv&label_column=0')
dtest = xgb.DMatrix('test.csv?format=csv&label_column=0')
Note
Secure XGBoost does not support categorical features.
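To make the two text formats above concrete, the following minimal sketch writes the same toy data set once as LibSVM text and once as CSV with the label in column 0, using only the Python standard library. The data values, the helper to_libsvm_line, and its first_index parameter are hypothetical and not part of Secure XGBoost. Whether the loader expects feature indices to start at 0 or 1 is not stated in this document, so check that before relying on either choice.

```python
import csv
import os
import tempfile

# Hypothetical toy data: (label, dense feature values).
rows = [
    (1, [0.5, 0.0, 2.0]),
    (0, [0.0, 1.5, 0.0]),
]


def to_libsvm_line(label, features, first_index=0):
    """Format one row as '<label> <index>:<value> ...', omitting zero entries.

    first_index is an assumption of this sketch; classic LIBSVM tools number
    features from 1, so pass first_index=1 if the loader expects that.
    """
    pairs = [
        f"{i + first_index}:{v}" for i, v in enumerate(features) if v != 0
    ]
    return " ".join([str(label)] + pairs)


workdir = tempfile.mkdtemp()
svm_path = os.path.join(workdir, "train.svm.txt")
csv_path = os.path.join(workdir, "train.csv")

# LibSVM text: one sparse row per line.
with open(svm_path, "w") as f:
    for label, features in rows:
        f.write(to_libsvm_line(label, features) + "\n")

# CSV: dense rows, with the label stored in column 0.
with open(csv_path, "w", newline="") as f:
    writer = csv.writer(f)
    for label, features in rows:
        writer.writerow([label] + features)

# URI-style string in the form used by the CSV loading example above.
csv_uri = csv_path + "?format=csv&label_column=0"
```

The resulting svm_path and csv_uri strings are in the shape that the DMatrix examples above take as their argument. The sketch stops short of constructing a DMatrix so that it runs without Secure XGBoost installed.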
Setting Parameters#
Secure XGBoost can use either a list of pairs or a dictionary to set parameters. For instance:
Booster parameters
param = {'max_depth': 2, 'eta': 1, 'silent': 1, 'objective': 'binary:logistic'}
param['nthread'] = 4
param['eval_metric'] = 'auc'
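The text mentions a list of pairs as an alternative to a dictionary. As a rough sketch of how the two forms relate, the snippet below converts the dictionary into a list of (name, value) pairs in plain Python. The name plst and the second metric 'error' are illustrative choices. That a repeated 'eval_metric' key is accepted is an assumption carried over from upstream XGBoost and is not confirmed here for Secure XGBoost.

```python
# Dictionary form, matching the parameter example above.
param = {'max_depth': 2, 'eta': 1, 'silent': 1, 'objective': 'binary:logistic'}
param['nthread'] = 4
param['eval_metric'] = 'auc'

# Equivalent list-of-pairs form (plst is a hypothetical name).
plst = list(param.items())

# Unlike a dictionary, a list can carry the same key more than once.
# Assumption: repeated 'eval_metric' entries are accepted, as in upstream XGBoost.
plst.append(('eval_metric', 'error'))

# Collect every metric named in the list form.
metrics = [value for key, value in plst if key == 'eval_metric']
```

The practical difference is that a dictionary holds exactly one value per parameter name, while the list form can repeat a name.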
Training#
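Training a model requires a parameter list and a data set.
num_round = 10
# evallist is not defined elsewhere in this walkthrough; the form below follows
# upstream XGBoost, where it is a list of (DMatrix, name) pairs watched during training.
evallist = [(dtest, 'eval'), (dtrain, 'train')]
bst = xgb.train(param, dtrain, num_round, evallist)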
Methods including update and boost from securexgboost.Booster are designed for internal usage only. The wrapper function securexgboost.train does some pre-configuration including setting up caches and some other parameters.
Prediction#
A model that has been trained or loaded can perform predictions on data sets.
dtest = xgb.DMatrix('test.svm.txt')
ypred = bst.predict(dtest)
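With the 'binary:logistic' objective used in the parameter example, upstream XGBoost returns one score per row, interpreted as the probability of the positive class. Assuming the same holds here and that the scores are available as plain floats, a minimal sketch of turning them into 0/1 labels looks like this. The ypred values and the helper to_labels are hypothetical stand-ins, not output from a real model.

```python
# Hypothetical scores standing in for the result of bst.predict(dtest).
ypred = [0.91, 0.12, 0.50, 0.47]


def to_labels(scores, threshold=0.5):
    """Map each score to 1 if it is at or above the threshold, else 0."""
    return [1 if s >= threshold else 0 for s in scores]


labels = to_labels(ypred)
```

The 0.5 threshold is a common default and not something this document prescribes. Raising or lowering it trades false positives against false negatives.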