One of the most effective ways to squeeze the last bit of performance out of your deep learning or machine learning models is to select the right hyperparameters. With the right choice, you can tailor the behavior of the algorithm to your particular dataset. It’s important to note that hyperparameters are different from parameters. The model estimates its parameters from the given data, for instance, the weights of a DNN (deep neural network). Hyperparameters, by contrast, can’t be estimated from the data. Rather, the practitioner specifies them when configuring the model, such as the learning rate of a DNN.
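To make the distinction concrete, here is a minimal sketch in plain Python. It is only an illustration under stated assumptions: the toy data, the `fit_slope` helper, and the two learning-rate values are hypothetical and not taken from any library. The slope `w` is a parameter learned from the data, while `learning_rate` is a hyperparameter that you choose before training.

```python
def fit_slope(xs, ys, learning_rate, n_steps=200):
    """Fit y = w * x by gradient descent on the mean squared error.

    w is a parameter (estimated from the data);
    learning_rate and n_steps are hyperparameters (set by the practitioner).
    """
    w = 0.0
    n = len(xs)
    for _ in range(n_steps):
        # Gradient of mean((w*x - y)^2) with respect to w
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad
    return w


# Hypothetical toy data generated from y = 2x
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

w_good = fit_slope(xs, ys, learning_rate=0.05)    # converges close to 2
w_slow = fit_slope(xs, ys, learning_rate=0.0001)  # still far from 2 after 200 steps

print(f"learning_rate=0.05   -> w = {w_good:.4f}")
print(f"learning_rate=0.0001 -> w = {w_slow:.4f}")
```

The same data and the same algorithm give very different results depending on a value the model never learns by itself, which is why tuning matters.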
Usually, knowing what values you should use for the hyperparameters of a specific algorithm on a given dataset is challenging. That’s why you need to explore various strategies to tune hyperparameter values.
With hyperparameter tuning, you can determine the right mix of hyperparameters that would maximize the performance of your model.
Two of the most widely used strategies for hyperparameter tuning are GridSearch and RandomizedSearch.
GridSearch involves creating a grid of candidate values for the hyperparameters. Each iteration tries one combination of hyperparameters from the grid, in a fixed order. The strategy builds a version of the model for every possible combination in the grid and returns the one with the best performance.
Since GridSearch goes through every combination of hyperparameters in the grid, it’s computationally very expensive.
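As a rough sketch of this strategy, the snippet below uses scikit-learn’s `GridSearchCV`. The dataset (iris), the estimator (`SVC`), and the grid values are assumptions chosen only for illustration; they are not a recommendation for your own problem.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Small built-in dataset, used here only as a stand-in
X, y = load_iris(return_X_y=True)

# Grid of candidate values (illustrative): 3 x 2 = 6 combinations
param_grid = {
    "C": [0.1, 1, 10],
    "kernel": ["linear", "rbf"],
}

# Every combination is evaluated with 5-fold cross-validation
grid_search = GridSearchCV(SVC(), param_grid, cv=5, scoring="accuracy")
grid_search.fit(X, y)

print("combinations tried:", len(grid_search.cv_results_["params"]))
print("best hyperparameters:", grid_search.best_params_)
print("best cross-validated accuracy:", grid_search.best_score_)
```

With 5-fold cross-validation, each of the 6 combinations is fitted 5 times, so even this tiny grid trains 30 models before the final refit on the full data.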
RandomizedSearch also involves building a grid of candidate values for the hyperparameters, but here every iteration tries a random combination of hyperparameters from the grid and records its performance. Finally, it returns the combination that provided the best performance.
Because RandomizedSearch evaluates only a fixed number of hyperparameter settings, it avoids unnecessary computations and the associated costs, which addresses the main drawback of GridSearch. The trade-off is that it may miss the best combination in the grid.
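The randomized variant can be sketched with scikit-learn’s `RandomizedSearchCV`. Again, the dataset, the estimator, the candidate values, and the budget of 8 trials are assumptions made for the sake of the example.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import RandomizedSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Candidate values (illustrative): 5 x 4 x 2 = 40 possible combinations
param_distributions = {
    "C": [0.01, 0.1, 1, 10, 100],
    "gamma": [0.001, 0.01, 0.1, 1],
    "kernel": ["linear", "rbf"],
}

# Only n_iter randomly sampled combinations are evaluated, not all 40
random_search = RandomizedSearchCV(
    SVC(),
    param_distributions,
    n_iter=8,
    cv=5,
    random_state=0,  # fixed seed so the sampled combinations are reproducible
)
random_search.fit(X, y)

print("combinations tried:", len(random_search.cv_results_["params"]))
print("best hyperparameters:", random_search.best_params_)
print("best cross-validated accuracy:", random_search.best_score_)
```

Here `n_iter` is the knob that controls the computational budget: the search costs the same whether the grid holds 40 combinations or 40,000.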
Selecting the hyperparameters to tune
The more hyperparameters of an algorithm you want to tune, the slower the tuning process becomes. This makes it important to choose the smallest useful subset of hyperparameters to search or tune. Not all hyperparameters are equally important, however, and you’ll find little universal advice on how to select the ones that you should tune.
Experience with the machine learning technique you’re using can give you useful insights into the behavior of its hyperparameters, which makes your choice a bit easier. You may also turn to machine learning communities for advice. But whatever your choice is, you should understand its implications.
Each hyperparameter that you select to tune can increase the number of trials needed to complete the tuning task successfully. And when you use AI Platform Training to train your model, you’re charged for the duration of the job, so choosing the hyperparameters to tune carefully reduces both the training time and the cost of your model.
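The growth in trials is easy to see with a little arithmetic. The sketch below counts how many combinations an exhaustive grid contains as hyperparameters are added one by one. The search space and its values are hypothetical and chosen only to show the multiplication.

```python
from math import prod

# Hypothetical search space: hyperparameter name -> candidate values
search_space = {
    "learning_rate": [0.001, 0.01, 0.1, 1.0],
    "batch_size": [32, 64, 128],
    "num_layers": [1, 2, 3],
    "dropout": [0.0, 0.1, 0.3, 0.5],
}


def grid_size(space, names):
    """Number of combinations in an exhaustive grid over the given names."""
    return prod(len(space[name]) for name in names)


names = list(search_space)
# Grid size when tuning the first 1, 2, 3, and 4 hyperparameters
sizes = [grid_size(search_space, names[:k]) for k in range(1, len(names) + 1)]

for k, size in enumerate(sizes, start=1):
    print(f"tuning {', '.join(names[:k])}: {size} trials")
```

Each added hyperparameter multiplies the number of trials by its number of candidate values, so the total here grows from 4 to 12 to 36 to 144.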
To get started with hyperparameter tuning, you can use scikit-learn, though there are other, more advanced options for hyperparameter tuning and optimization too, such as Hyperopt, Optuna, Scikit-Optimize, and Ray Tune, to name a few.