https://web.archive.org/web/20220527180306/https://kylrth.com/paper/gradient-based-hyperparameter-optimization/