Optuna: Get the Best out of your Hyperparameters – Easy Tutorial

In this article, we look at what Optuna is: a library that helps you tune your Machine Learning models with very little effort.

Optuna is a library that allows the automatic optimization of the hyperparameters of your Machine Learning models.

It allows you to easily identify the optimal hyperparameters by performing several tests with different combinations of hyperparameters.

Hyperparameters play a crucial role in:

  • the final predictions of your model
  • its ability to adapt
  • its ability to generalize

That’s why Optuna, when used properly, can considerably increase the performance of your models.

Let’s get into the details! 🧐

Optuna – How does it work?

The basic idea

The idea behind Optuna is simple: you provide a space of hyperparameters to test, and Optuna looks for the combination of hyperparameters that optimizes your model.

First, you need to give Optuna a performance metric.

Its objective will be to optimize it.

For example, if you give it the loss, its goal will be to minimize it so that it comes as close to 0 as possible.

Then you have to give it a search space for the hyperparameters you want to investigate.

For example, if you want to test the number of hidden layers, you can tell it to test a range from 1 to 10 hidden layers.

Optuna will try values from this range to determine the best number of layers according to the previously specified performance metric, the loss.

You can also specify several parameters at once.

For example, the number of hidden layers AND the learning rate.

Optuna will then launch a series of tests, at the end of which it will give you the hyperparameter values that optimize your metric.

Note that you can specify several metrics to optimize at the same time, such as loss AND accuracy.

Optuna’s advantages

Here are the main advantages of the Optuna library:

  • Ease of Use: it has a simple API that allows users to define the metric to be optimized and the hyperparameter space to be investigated. It takes only one function call to execute the optimization process.
  • Scalability: it is designed to adapt to large-scale optimization problems, thanks to the support of test parallelization and pruning of unsuccessful tests.
  • Flexibility: it supports a wide range of optimization algorithms, including Random Search, Grid Search, and Bayesian optimization.
  • Adaptability – Framework: it supports the optimization of Machine Learning models implemented in multiple frameworks: PyTorch, TensorFlow and scikit-learn.
  • Adaptability – Hyperparameters: it supports optimization of continuous, integer and categorical hyperparameters, as well as hyperparameters with complex dependencies (the optimal value of a hyperparameter may depend on the values of other hyperparameters).

Let’s move on to my favorite part, the practice! ☄️

How to use Optuna in Python

To use Optuna, you will first have to install it with the pip command:

!pip install optuna

Then, we will load some data.

Our data

Any data will do the job: the objective of this article is not to carry out a project but to see how to use Optuna.

Here we use the Keras dataset “Reuters”. The goal is to classify each newswire into one of 46 topics according to its content.

This is a classic multi-class NLP problem.

Let’s load the dataset:

import tensorflow as tf

(X_train, y_train), (X_test, y_test) = tf.keras.datasets.reuters.load_data(path="reuters.npz")

The newswires are already tokenized; we just have to pad each of them to the same length:

max_len = 200  # assumed value: the original does not specify max_len
X_train = tf.keras.utils.pad_sequences(X_train, maxlen=max_len)
X_test = tf.keras.utils.pad_sequences(X_test, maxlen=max_len)
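
If you wonder what this padding does, here is a pure-Python sketch of the default behavior of pad_sequences on a single sequence: zeros are added at the front, and sequences that are too long keep only their last tokens. The pad_pre helper is hypothetical, written only for this illustration, and assumes maxlen is at least 1.

```python
def pad_pre(sequence, maxlen, value=0):
    """Hypothetical helper mimicking pad_sequences' defaults on one sequence."""
    truncated = sequence[-maxlen:]  # keep the last maxlen tokens
    return [value] * (maxlen - len(truncated)) + truncated  # pad at the front


print(pad_pre([5, 8, 2], maxlen=5))           # [0, 0, 5, 8, 2]
print(pad_pre([1, 2, 3, 4, 5, 6], maxlen=5))  # [2, 3, 4, 5, 6]
```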

Here we focus on Optuna but if you want to know more about NLP preprocessing, we go into detail in this article.

Now it’s going to be more tricky!

We will create a function defining two things:

  • the hyperparameters to optimize
  • the creation of the model from these hyperparameters


To define the search space of a hyperparameter, we use one of these functions depending on the type of our variable:

  1. suggest_categorical – Suggest a value for a categorical parameter.
  2. suggest_discrete_uniform – Suggest a value for a discrete parameter (deprecated since Optuna 3.0; use suggest_float with step).
  3. suggest_float – Suggest a value for a floating point parameter.
  4. suggest_int – Suggest a value for an integer parameter.
  5. suggest_loguniform – Suggest a value for a continuous parameter, sampled on a log scale (deprecated since Optuna 3.0; use suggest_float with log=True).
  6. suggest_uniform – Suggest a value for a continuous parameter (deprecated since Optuna 3.0; use suggest_float).

For example, for the number of hidden layers, we will have:

Do not execute this line, it is an example.


n_hidden = trial.suggest_int('n_hidden', 1, 3)

Here we test 1, 2 and 3 hidden layers to determine the number that optimizes our metric.

Then we have to adapt the creation of the model.

For example, since we have potentially N (1, 2 or 3) hidden layers, we must create a loop that takes into account the fact that N can vary:

for i in range(n_hidden):
    model.add(Dense(50, activation='relu'))

Here is the complete code of the model creation which takes into account 3 hyperparameters:

  • Between 1 and 3 hidden layers
  • Between 32 and 128 neurons per layer
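  • A learning rate between 0.00001 and 0.1

from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.models import Sequential
from tensorflow.keras.optimizers import Adam

def create_model(trial):
    # Some hyperparameters we want to optimize
    n_hidden = trial.suggest_int('n_hidden', 1, 3)
    n_units = trial.suggest_int('n_units', 32, 128)
    learning_rate = trial.suggest_float('learning_rate', 1e-5, 1e-1, log=True)

    n_classes = int(y_train.max()) + 1  # the labels are integer topic ids

    model = Sequential()
    model.add(Input(shape=(X_train.shape[1],)))
    # One Dense layer per hidden layer, so the network has exactly n_hidden of them
    for i in range(n_hidden):
        model.add(Dense(n_units, activation='relu'))
    model.add(Dense(n_classes, activation='softmax'))
    # The original omits the compile step; the loss and metric here are assumptions
    model.compile(optimizer=Adam(learning_rate=learning_rate),
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    return model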

The objective function

Then, we can create the objective() function.

Inside, we use the function we have just created: create_model().

Then we train the model.

We extract the metric we want to optimize.

In our case we choose the loss that we place in the score variable.

Optuna will optimize what this function returns (the score variable):

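def objective(trial):
    model = create_model(trial)
    # The original fit() call is truncated; the remaining arguments are left at the Keras defaults
    model.fit(X_train, y_train, verbose=0)
    # Return the test loss, the metric we chose to optimize
    score = model.evaluate(X_test, y_test, verbose=0, return_dict=True)['loss']
    return score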

We create a Study object that will allow us to store the results of the Optuna investigation:

import optuna

study = optuna.create_study()

And finally we launch the investigation on 100 trials with the optimize() function:

study.optimize(objective, n_trials=100, n_jobs=-1)

Output: Trial 0 finished with value: 0.0467497780919075 and parameters: {'n_hidden': 2, 'n_units': 119, 'learning_rate': 0.005641381456976518}. Best is trial 1 with value: 0.0467497780919075.

Optuna will test 100 different combinations of our hyperparameters, specified in the create_model() function.


At the end of the investigation, we can display the best combination of hyperparameters:

study.best_params

Output: {'n_hidden': 3, 'n_units': 121, 'learning_rate': 0.019243253125586307}

For our model, Optuna found that the best combination is:

  • 3 hidden layers
  • 121 neurons per layer
  • A learning rate of 0.019

We can also display in graph form the result of each of Optuna’s trials:

from optuna.visualization import plot_optimization_history

plot_optimization_history(study)


We can use the optimal combination found by Optuna to create a model:

best_model = create_model(study.best_trial)

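Since create_model() returns an untrained model, we train it again before using it on our test data:

best_model.fit(X_train, y_train, verbose=0)
best_model.evaluate(X_test, y_test, verbose=0, return_dict=True)['loss']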

Output: 0.046

We get a loss of 0.046, not bad!

What about you? What score did you get?

However, keep in mind that the Optuna library does not give you THE best model.

It gives you the best model it has found based on:

  • its tests – Optuna does not test all the possibilities but only a limited number that you indicate (n_trials)
  • the space of hyperparameters you gave – Optuna doesn’t investigate all the hyperparameters but only the ones you give it

See you soon on Inside Machine Learning 😉



Tom Keldenich

Data Engineer & passionate about Artificial Intelligence!

Founder of the website Inside Machine Learning
