This is a guide on how to train a language model for a new language from scratch with Transformers. The scripts use the OSCAR corpus as the dataset and train a masked language modeling (MLM) model for the Farsi language.
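
To make the MLM task concrete, here is a minimal, library-free sketch of how one training example could be built. It assumes the BERT-style scheme (select about 15% of positions; of those, 80% become a mask token, 10% a random token, 10% stay unchanged). The guide does not say which masking scheme its scripts use, so the function name `mask_tokens`, the `<mask>` symbol, and the probabilities are illustrative assumptions, not the scripts' actual code.

```python
import random

MASK_TOKEN = "<mask>"  # hypothetical mask symbol; the real one depends on the tokenizer


def mask_tokens(tokens, vocab, mlm_probability=0.15, seed=None):
    """Return (inputs, labels) for one MLM training example.

    labels[i] holds the original token at selected positions and None
    elsewhere, so a loss would only be computed on selected positions.
    """
    rng = random.Random(seed)
    inputs = list(tokens)
    labels = [None] * len(tokens)
    for i, token in enumerate(tokens):
        if rng.random() >= mlm_probability:
            continue  # position not selected for prediction
        labels[i] = token
        roll = rng.random()
        if roll < 0.8:
            inputs[i] = MASK_TOKEN         # 80%: replace with the mask token
        elif roll < 0.9:
            inputs[i] = rng.choice(vocab)  # 10%: replace with a random token
        # remaining 10%: keep the original token unchanged
    return inputs, labels


if __name__ == "__main__":
    # Whitespace-split Farsi sentence, used here as a toy stand-in for tokenizer output.
    sentence = "من زبان فارسی را دوست دارم".split()
    inputs, labels = mask_tokens(sentence, vocab=sentence, mlm_probability=0.5, seed=0)
    print(inputs)
    print(labels)
```

In a real run, tokenization and masking would normally be handled by the tokenizer and a data collator rather than by hand; the sketch only shows what the model is asked to predict.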