Talk: Hyperparameter Optimization for MLPerf Training with SigOpt
4:30 - 5pm
Optimizing the runtime of large-scale deep learning training workloads is computationally expensive, requires a multidisciplinary team, and relies on complex hyperparameter optimization (HPO) techniques. Habana Labs developed a computationally cost-efficient methodology to reduce the runtime of MLPerf training workloads, specifically by reducing the number of training epochs required to reach target accuracy.
The Habana Labs team used HPO to optimize runtime and tested two methods in the process: a home-grown grid search and the SigOpt optimization library, which is built on an ensemble of Bayesian and other global optimization algorithms. In this comparison, SigOpt provided a clear advantage over the home-grown grid search in three ways. First, SigOpt required fewer observations to reach the same threshold accuracy. Second, it reached that threshold accuracy with an even lower number of training epochs. Finally, its dashboard provided insight into potential relationships among the hyperparameters. In this talk, the Habana Labs team will share their experience working through these challenging workloads and the insights they gained in the process.
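To make the comparison concrete, here is a minimal, self-contained sketch of the two HPO styles on a toy objective. It is purely illustrative: the objective, parameter names, and the simple adaptive sampler stand in for the team's actual training runs and for SigOpt's Bayesian ensemble, which requires an account and the real client library. The key point it shows is that an exhaustive grid spends one observation per grid cell, while a sequential optimizer chooses each trial based on earlier results and can reach a good value with fewer observations.

```python
# Illustrative comparison of grid search vs. a sequential, adaptive search.
# All names and the objective are hypothetical; this is NOT SigOpt's API
# or Habana's methodology, just a sketch of the two search patterns.
import random


def epochs_to_target(lr, momentum):
    # Toy stand-in for "training epochs needed to reach target accuracy":
    # a smooth bowl whose minimum (10 epochs) sits at lr=0.1, momentum=0.9.
    return 10 + 100 * (lr - 0.1) ** 2 + 50 * (momentum - 0.9) ** 2


def grid_search(lr_grid, momentum_grid):
    # Exhaustive grid: every combination costs one observation.
    best, observations = float("inf"), 0
    for lr in lr_grid:
        for momentum in momentum_grid:
            observations += 1
            best = min(best, epochs_to_target(lr, momentum))
    return best, observations


def sequential_search(budget, seed=0):
    # Stand-in for an adaptive optimizer (not real Bayesian HPO):
    # sample near the best point so far and shrink the search region.
    rng = random.Random(seed)
    center, width = [0.5, 0.5], [0.5, 0.5]
    best = float("inf")
    for _ in range(budget):
        lr = min(1.0, max(0.0, rng.gauss(center[0], width[0])))
        momentum = min(1.0, max(0.0, rng.gauss(center[1], width[1])))
        y = epochs_to_target(lr, momentum)
        if y < best:  # keep the best point and focus future samples near it
            best, center = y, [lr, momentum]
            width = [w * 0.8 for w in width]
    return best


if __name__ == "__main__":
    grid_best, grid_obs = grid_search(
        [0.01, 0.05, 0.1, 0.5], [0.5, 0.7, 0.9, 0.99]
    )
    seq_best = sequential_search(budget=8)
    print(f"grid: best={grid_best:.1f} epochs in {grid_obs} observations")
    print(f"sequential: best={seq_best:.1f} epochs in 8 observations")
```

In the real workflow, the inner objective call is a full MLPerf training run, so each saved observation is a substantial compute saving; that asymmetry is what makes the observation-efficiency of an adaptive optimizer matter.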