https://web.archive.org/web/20220527185757/https://towardsdatascience.com/byte-pair-encoding-the-dark-horse-of-modern-nlp-eb36c7df4f10?gi=72edd23a7474