Optimize LightGBM with Optuna: How to Do It Now

Whether you are in the middle of an ML competition or simply doing your day-to-day work, you can use Optuna to optimize your LightGBM model.

I believe LightGBM is one of the best Machine Learning libraries at the moment.

It has set many records in ML competitions.

If you don’t know this library yet, I recommend reading our article on the topic before diving into this one.

It’s a good tutorial to start with.

Now if you are here… you probably want to go further.

Maybe your model didn’t reach the performance you wanted.

Or perhaps it exceeded all your expectations.

And a little voice in your head is asking: “How far can my model go?”

Well, you can try to tune the hyperparameters yourself.

Problem? It might take you hours.

And in the end, you don’t even know if it will improve your model.

A better solution exists:

Optuna

Optuna is a library for automatically optimizing Machine Learning models.

Let’s be a little more precise.

Actually, it is not really automatic.

The library needs input from you to optimize your model.

Here is the principle: you give Optuna a search space, and it takes care of running trials on your model.

For example, say you want to explore the learning_rate hyperparameter.

In this case, you give it a search space: “Optuna, test the learning rate with values between 0.0001 and 0.1”.

Optuna takes your query and runs tests.

You can even ask it to explore several hyperparameters at once.

If you want a complete guide to Optuna with detailed explanations, follow this link.

Optimizing LightGBM with Optuna

It is very easy to use Optuna.

Especially with the basic libraries: scikit-learn, Keras, PyTorch.

But when you want to use more technical libraries, it is obviously more complex.

Let’s assume that you already have your data: X_train, X_val, X_test, y_train, y_val, y_test.
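
If you don’t have these six variables yet, here is one possible way to build them. This is only a sketch: it assumes scikit-learn is installed, it uses synthetic data from make_classification as a stand-in for your own dataset, and the 60/20/20 split is an arbitrary choice.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Synthetic binary classification data, only a stand-in for your own dataset
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# First set aside 20% for the test set...
X_temp, X_test, y_temp, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y)

# ...then split the remaining 80% into train (60% of total) and validation (20% of total)
X_train, X_val, y_train, y_val = train_test_split(
    X_temp, y_temp, test_size=0.25, random_state=0, stratify=y_temp)

print(X_train.shape, X_val.shape, X_test.shape)
```

The validation set is the one Optuna will score against; the test set stays untouched until the very end.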

First of all, I invite you to install the two libraries that interest us:

!pip install lightgbm
!pip install optuna

Then import LightGBM and load your data into LightGBM Datasets (this is the format the library works with):

import lightgbm as lgb

lgb_train = lgb.Dataset(X_train, y_train)
lgb_val = lgb.Dataset(X_val, y_val, reference=lgb_train)

Now we have to create a function.

This function is the objective that Optuna will optimize.

Here the code will DEPEND ON YOUR OBJECTIVE.

If you want to maximize the accuracy you should do the following:

def objective(trial):
    # Define the search space
    params = {
        'objective': 'binary',
        'learning_rate': trial.suggest_float('learning_rate', 1e-5, 1e-2, log=True),
        'num_leaves': trial.suggest_int('num_leaves', 2, 128),
        'scale_pos_weight': trial.suggest_int('scale_pos_weight', 1, 10),
        # LightGBM has no 'accuracy' metric: use binary_error (accuracy = 1 - binary_error)
        'metric': 'binary_error'
    }

    # Train with early stopping (passed as a callback in recent LightGBM versions)
    model = lgb.train(params, lgb_train, valid_sets=[lgb_val],
                      callbacks=[lgb.early_stopping(stopping_rounds=10)])

    # Return accuracy on the validation set
    return 1.0 - model.best_score['valid_0']['binary_error']

If, on the other hand, you want to minimize a loss, you should use this code:

def objective(trial):
    # Define the search space
    params = {
        'objective': 'binary',
        'learning_rate': trial.suggest_float('learning_rate', 1e-5, 1e-2, log=True),
        'num_leaves': trial.suggest_int('num_leaves', 2, 128),
        'scale_pos_weight': trial.suggest_int('scale_pos_weight', 1, 10),
        # Default metric for the binary objective, stated explicitly
        'metric': 'binary_logloss'
    }

    # Train with early stopping (passed as a callback in recent LightGBM versions)
    model = lgb.train(params, lgb_train, valid_sets=[lgb_val],
                      callbacks=[lgb.early_stopping(stopping_rounds=10)])

    # Return loss on the validation set
    return model.best_score['valid_0']['binary_logloss']

The params dictionary is where you declare the hyperparameters you want to optimize.

You can find the list of hyperparameters for LightGBM models in the official documentation.

A last crucial step is to initialize Optuna.

At this point you have to indicate if you want to minimize or maximize.

If you want to optimize the accuracy, choose maximization:

import optuna

study = optuna.create_study(direction='maximize')

Otherwise, choose minimization:

import optuna

study = optuna.create_study(direction='minimize')

Now you just have to launch the LightGBM optimization with Optuna.

Here we give the objective function and the number of tests to perform:

study.optimize(objective, n_trials=100)

The optimization can take time.

Once it is finished, I invite you to retrieve the best hyperparameters found by Optuna:

best_params = study.best_params

Then you can build a model from these parameters:

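final_params = {'objective': 'binary', **best_params}  # best_params only holds the tuned values
model = lgb.train(final_params, lgb_train, valid_sets=[lgb_val], callbacks=[lgb.early_stopping(stopping_rounds=10)])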

And use your model on new data:

y_pred = model.predict(X_test, num_iteration=model.best_iteration)
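
Assuming the model was trained with the 'binary' objective, predict returns probabilities rather than class labels. Here is a short sketch of how you could turn them into labels and compute the accuracy. The arrays below are hypothetical values standing in for your real y_pred and y_test, and the 0.5 threshold is an assumption you may want to adjust.

```python
import numpy as np

# Hypothetical values standing in for model.predict(X_test) and y_test
y_pred = np.array([0.91, 0.12, 0.55, 0.40, 0.78])
y_test = np.array([1, 0, 0, 0, 1])

# Turn probabilities into class labels with a 0.5 threshold
y_label = (y_pred >= 0.5).astype(int)

# Share of predictions that match the true labels
accuracy = (y_label == y_test).mean()
print(y_label, accuracy)
```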

That’s all for this article.

Good luck with your optimization!🔥

And keep in mind that other methods exist if you want to improve your models.

See you soon on Inside Machine Learning 😉

Tom Keldenich

Artificial Intelligence engineer and data enthusiast!

Founder of the website Inside Machine Learning
