A code model trained on permissively licensed repositories.
A large corpus intended to enable cleaner code-model training.