In this tutorial, you are going to learn:
1. How to import the LightGBM libraries?
2. How to download the dataset?
3. How to explore the dataset?
4. How to split the dataset into training and testing?
5. How to set parameters for the LightGBM model?
6. How to create a dataset for the LightGBM model?
7. How to implement a LightGBM model for multi-class classification?
8. How to calculate the classification report for the trained model?
9. How to save a trained LightGBM model?
10. How to find the current iteration of the model?
11. How to retrain a LightGBM model?
12. How to find feature importance in the LightGBM model?
1. Import the Libraries
import lightgbm as lgb
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_digits
from sklearn.metrics import classification_report
2. Download Dataset
We are going to load the Digits dataset from the Scikit-Learn datasets module.
digits = load_digits()
3. Explore Dataset
We are going to plot the first five images in the dataset, each titled with its label.
plt.figure(figsize=(4,4))
for value, (images, labels) in enumerate(zip(digits.data[0:5], digits.target[0:5])):
    plt.subplot(1, 5, value + 1)
    # Each sample is a flat vector of 64 pixels; reshape it back to 8x8 for display.
    plt.imshow(np.reshape(images, (8,8)))
    plt.title('%i\n' % labels)
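Alongside the plot, it can help to check the shape of the data. The following is a minimal sketch that assumes only the load_digits( ) call from step 2:

```python
from sklearn.datasets import load_digits
import numpy as np

digits = load_digits()

# Each row of digits.data is an 8x8 image flattened into 64 pixel values.
print(digits.data.shape)
# digits.images keeps the same pixels in their 2-D layout.
print(digits.images.shape)
# The labels are the ten digit classes 0 through 9.
print(np.unique(digits.target))
```

This explains why the plotting code reshapes every sample to (8,8) before calling imshow( ).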

4. Splitting Data
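Next, we need to split the data into training and testing datasets. A second split divides the data in half: one half trains the initial model and the other half is kept for retraining in step 12.
X_train, X_test, y_train, y_test = train_test_split(digits.data, digits.target, test_size=0.01, random_state=0)
# The second split reassigns X_train and y_train to one half of the data.
# X_train_2 holds the other half of the features and X_test_2 holds its labels (despite the name).
X_train, X_train_2, y_train, X_test_2 = train_test_split(digits.data, digits.target, test_size=0.5, random_state=0)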
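Because both calls above split the full dataset, the resulting subsets are not guaranteed to be disjoint. One possible alternative, sketched here under assumptions, is to hold out the test set first and then divide the remainder. The variable names (X_rest, X_holdout, X_first, X_second and their labels) and the 20% test size are hypothetical and not part of the original tutorial:

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split

digits = load_digits()

# Hold out a test set first (20% is an assumed value, not the tutorial's).
X_rest, X_holdout, y_rest, y_holdout = train_test_split(
    digits.data, digits.target, test_size=0.2, random_state=0, stratify=digits.target)

# Split the remainder in half: one part for initial training, one for retraining.
X_first, X_second, y_first, y_second = train_test_split(
    X_rest, y_rest, test_size=0.5, random_state=0, stratify=y_rest)

print(len(X_first), len(X_second), len(X_holdout))
```

With this layout every sample belongs to exactly one subset, so the test set is never seen during training or retraining.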
5. Set Parameters for LightGBM model
Parameters can be set for the LightGBM model. We are specifying the following parameters:
1. “learning_rate” : To specify the learning rate in the LightGBM model.
2. “objective” : To set binary or multi-class classification in the LightGBM model.
3. “metric” : To specify the loss metric.
4. “max_depth” : To specify the maximum depth of a tree in the LightGBM model.
5. “num_class” : To specify the number of classes in the dataset.
params = {}
params['learning_rate'] = 0.04
params['boosting_type'] = 'gbdt'
params['objective'] = 'multiclass'
params['metric'] = 'multi_logloss'
params['max_depth'] = 10
params['num_class'] = 10
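The same settings can also be written as one dict literal. The sketch below is one possible variation: deriving “num_class” from the labels is an addition of this example, not something the tutorial does, and it simply guards against a hard-coded value drifting away from the data.

```python
import numpy as np
from sklearn.datasets import load_digits

digits = load_digits()

# Same settings as above, written as a single dict literal.
params = {
    'learning_rate': 0.04,
    'boosting_type': 'gbdt',
    'objective': 'multiclass',
    'metric': 'multi_logloss',
    'max_depth': 10,
    # Derive the class count from the labels instead of hard-coding it.
    'num_class': len(np.unique(digits.target)),
}
print(params['num_class'])
```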
6. Create LightGBM Dataset
We can create a LightGBM dataset by using the Dataset( ) class. The arguments we pass to it are:
1. The feature columns.
2. The label column.
training_dataset = lgb.Dataset(X_train, label=y_train)
testing_dataset = lgb.Dataset(X_test, label=y_test)
# X_test_2 holds the labels that belong to X_train_2 (see the split in step 4).
retrain_dataset = lgb.Dataset(X_train_2, label=X_test_2)
7. Model Training and Prediction
We are going to train our LightGBM model on the training dataset. The third argument of train( ) specifies the number of boosting rounds.
classifier = lgb.train(params, training_dataset, 100)
y_predictions = classifier.predict(X_test)
y_predictions[:5]

8. Rounding Predictions
The model returns a probability for each class. To find the predicted label we use the argmax( ) method, which returns the index with the maximum probability.
y_predictions_2 = [np.argmax(value) for value in y_predictions]
y_predictions_2[:10]
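The list comprehension can also be replaced by a single vectorized call. The sketch below uses made-up probabilities for three samples and four classes purely for illustration:

```python
import numpy as np

# Hypothetical probabilities for 3 samples and 4 classes.
y_predictions = np.array([
    [0.1, 0.7, 0.1, 0.1],
    [0.6, 0.2, 0.1, 0.1],
    [0.2, 0.2, 0.1, 0.5],
])

# axis=1 takes the argmax across the class columns of every row at once.
labels = np.argmax(y_predictions, axis=1)
print(labels)  # [1 0 3]
```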

9. Classification Report
# classification_report( ) expects the true labels first and the predictions second.
print(classification_report(y_test, y_predictions_2))
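The argument order matters because classification_report( ) takes the true labels first and the predicted labels second. A small sketch with made-up labels shows how precision and recall trade places when the order is reversed:

```python
from sklearn.metrics import classification_report

# Hypothetical labels: one sample of class 0 is wrongly predicted as class 1.
y_true = [0, 0, 0, 1]
y_pred = [0, 0, 1, 1]

correct = classification_report(y_true, y_pred, output_dict=True)
swapped = classification_report(y_pred, y_true, output_dict=True)

# Class 1 has precision 0.5 and recall 1.0 in the correct order;
# the two values trade places when the arguments are swapped.
print(correct['1']['precision'], correct['1']['recall'])
print(swapped['1']['precision'], swapped['1']['recall'])
```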

10. Save LightGBM Model
The trained model can be saved using the save_model( ) method.
classifier.save_model('lightgbclassifier.txt')

11. Model Current Iteration
To find out the number of iterations on which our model has been trained we can use the current_iteration( ) method.
print("Current iteration# %d" % classifier.current_iteration())

12. Retrain Model
To retrain a model, we need to load the saved model. Once the saved model is loaded, we will create a new model instance.
The “init_model” parameter tells train( ) to continue training from the existing model. In this way the new model instance keeps the old iteration history.
# Load the model saved in step 10.
model = lgb.Booster(model_file='lightgbclassifier.txt')
claasifier_retrain = lgb.train(params, retrain_dataset, num_boost_round=100, init_model=model)
print("Retrained Model Iteration# %d" % claasifier_retrain.current_iteration())

13. Feature Importance
We can find out the feature importance in the model using the plot_importance( ) method.
fig, ax = plt.subplots(figsize=(20, 10))
lgb.plot_importance(claasifier_retrain, ax=ax)

Summary
1. load_digits( ) : To load the Digits dataset.
2. train_test_split( ) : To split the dataset into training and testing.
3. imshow( ) : To visualize sample images from the dataset.
4. “learning_rate” : Parameter to specify the learning rate in the LightGBM model.
5. “objective” : Parameter to set binary or multi-class classification in the LightGBM model.
6. “metric” : Parameter to specify the loss metric.
7. “max_depth” : Parameter to specify the maximum depth of a tree in the LightGBM model.
8. “num_class” : Parameter to specify the number of classes in the dataset.
9. Dataset( ) : To create a LightGBM dataset.
10. train( ) : To train a LightGBM model.
11. predict( ) : To predict values from a trained model.
12. classification_report( ) : To find the classification report of a trained model.
13. save_model( ) : To save a trained model.
14. current_iteration( ) : To find the current iteration of the model.
15. plot_importance( ) : To plot feature importance in the LightGBM model.